The LXC controller code currently directly invokes the
libvirt main loop code. The problem is that this misses
the cleanup of virNetServerClient connections that
virNetServerRun takes care of.
The result is that when libvirtd is stopped, the
libvirt_lxc controller process gets stuck in an I/O loop.
When libvirtd is then started again, it fails to connect
to the controller and thus kills off the entire domain.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Regression introduced by commit 258e06c85b: "ret" could be set to 1
or 0 by virStorageBackendFileSystemIsMounted before the goto cleanup.
This could mislead the callers (up to the public API
virStoragePoolDestroy) into returning success even when the underlying
umount command fails.
I have been testing libvirt v1.0.0 for deployment within my
organization, and in the process discovered what appears to be a bug
that breaks virsh attach-device, when attaching an RBD volume to an
instance. First, here is the error presented, with v1.0.0 (this worked
in v0.10.2):
[root@host ~]# virsh attach-device W5APQ8 G84VV1.xml
error: Failed to attach device from G84VV1.xml
error: cannot open file 'dc3-1-test/G84VV1': No such file or directory
Using git bisect, I narrowed the problem down to this as the first
commit to break this setup:
4d34c92947 is the first bad commit
Commit a4c19459aa only added the
QEMU capability flag, command line option and added the boot element
for redirdevs in the XML schema.
This patch adds support for parsing and writing the XML with redirdevs
with the boot flag. It also ignores unknown XML elements in redirdev
instead of failing with:
"error: An error occurred, but the cause is unknown"
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=805414
Upcoming patches for revert-and-clone branching of snapshots need
to be able to copy a domain definition; make this step reusable.
* src/conf/domain_conf.h (virDomainDefCopy): New prototype.
* src/conf/domain_conf.c (virDomainObjCopyPersistentDef): Split...
(virDomainDefCopy): ...into new function.
(virDomainObjSetDefTransient): Use it.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/qemu/qemu_driver.c (qemuDomainRevertToSnapshot): Use it.
Relatively straight-forward. And since qemu was already using
VIR_DOMAIN_SNAPSHOT_FILTERS_ALL, with 6 different APIs all calling
into this common code, I've instantly added all 5 flags to 6 APIs.
* src/conf/snapshot_conf.h (VIR_DOMAIN_SNAPSHOT_FILTERS_ALL):
Enable new filters.
* src/conf/snapshot_conf.c (virDomainSnapshotObjListGetNames):
Prep the new flags.
(virDomainSnapshotObjListCopyNames): Actually do the filtering.
As we enable more modes of snapshot creation, it becomes more important
to be able to quickly filter based on snapshot properties. This patch
introduces new filter flags; subsequent patches will introduce virsh
back-compat filtering, as well as actual libvirt filtering.
* include/libvirt/libvirt.h.in (virDomainSnapshotListFlags): Add
five new flags in two new groups.
* src/libvirt.c (virDomainSnapshotNum, virDomainSnapshotListNames)
(virDomainListAllSnapshots, virDomainSnapshotNumChildren)
(virDomainSnapshotListChildrenNames)
(virDomainSnapshotListAllChildren): Document them.
* src/conf/snapshot_conf.h (VIR_DOMAIN_SNAPSHOT_FILTERS_STATUS)
(VIR_DOMAIN_SNAPSHOT_FILTERS_LOCATION): Add new convenience filter
collection macros.
* tools/virsh-snapshot.c (cmdSnapshotList): Add 5 new flags.
* tools/virsh.pod (snapshot-list): Document them.
This resolves:
https://bugzilla.redhat.com/show_bug.cgi?id=873134
The reported problem is that an attempt to restore a saved domain that
was configured with <currentMemory> and <memory> set to some (same for
both) number that's not a multiple of 4096KiB results in an error like
this:
error: Failed to start domain libvirt_test_api
error: XML error: current memory '4001792k' exceeds maximum '4000768k'
(in this case, currentMemory was set to 4000000KiB).
The reason for this failure is:
1) a saved image contains the "live xml" of the domain at the time of
the save.
2) the live xml of a running domain gets its currentMemory
(a.k.a. cur_balloon) directly from the qemu monitor rather than from
the configuration of the domain.
3) the value reported by qemu is (sometimes) not exactly what was
originally given to qemu when the domain was started, but is rounded
up to [some indeterminate granularity] - in some versions of qemu that
granularity is apparently 1MiB, and in others it is 4MiB.
4) When the XML is parsed to set up the state of the restored domain,
the XML parser for <currentMemory> compares it to <memory> (which is
the maximum allowed memory size for the domain) and if <currentMemory>
is greater than the next 1024KiB boundary above <memory>, it spits out
an error and fails.
For example (from the BZ) if you start qemu on RHEL6 with both
<currentMemory> and <memory> of 4000000 (this number is in KiB),
libvirt's dominfo or dumpxml will report "4001792" back (rounded up to
next 4MiB) for 10-20 seconds after the start, then revert to reporting
"4000000". On Fedora 16 (which uses qemu-1.0), it will instead report
"4000768" (rounded up to next 1MiB). On Fedora 17 (qemu-1.2), it seems
to always report "4000000". ("4000000" is of course okay, and
"4000768" is also okay since that's the next 1024KiB boundary above
"4000000" and the parser was already allowing for that. But "4001792"
is *not* okay and produces the error message.)
This patch solves the problem by changing the allowed "fudge factor"
when parsing from 1024KiB to 4096KiB to match the maximum up-rounding
that could be done in qemu.
(I had earlier thought to fix this by up-rounding <memory> in the
dumpxml that's put into the saved image, but that wouldn't have fixed
the case where the save image was produced by an "unfixed"
libvirtd.)
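The arithmetic behind the fudge factor can be checked with a short sketch. This is only an illustration of the rounding described above, not the parser code itself; the function names round_up and current_memory_allowed are made up for this example.

```python
def round_up(value_kib: int, granularity_kib: int) -> int:
    """Round value_kib up to the next multiple of granularity_kib."""
    return -(-value_kib // granularity_kib) * granularity_kib


def current_memory_allowed(cur_kib: int, max_kib: int, fudge_kib: int) -> bool:
    """Accept cur_kib if it does not exceed max_kib rounded up to fudge_kib.

    Hypothetical stand-in for the <currentMemory> vs. <memory> check.
    """
    return cur_kib <= round_up(max_kib, fudge_kib)
```

With <memory> at 4000000 KiB, a 1024KiB fudge factor gives a limit of 4000768, so 4001792 is rejected; a 4096KiB fudge factor gives a limit of 4001792, so the same value is accepted.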
Prior to this patch, 'virsh nodecpumap' on older kernels reported:
error: Unable to get cpu map
error: out of memory
* src/nodeinfo.c (linuxParseCPUmax): Don't overwrite error.
(nodeGetCPUBitmap): Provide backup implementation.
On RHEL 5, I was getting a segfault trying to start libvirtd,
because we were failing virNodeParseSocket but not checking
for errors, and then calling CPU_SET(-1, &sock_map) as a result.
But if you don't have a topology/physical_package_id file,
then you can just assume that the cpu belongs to socket 0.
* src/nodeinfo.c (virNodeGetCpuValue): Change bool into
default_value.
(virNodeParseSocket): Allow for default value when file is missing,
different from fatal error on reading file.
(virNodeParseNode): Update call sites to fail on error.
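The difference between "file is missing, use a default" and "file is unreadable, fail" can be sketched as follows. This is a simplified model, not the C code in src/nodeinfo.c; read_cpu_value and parse_socket are hypothetical names, and the directory layout is passed in so that the sketch can run against any directory.

```python
import os


def read_cpu_value(cpu_dir, name, default_value):
    """Read an integer from cpu_dir/name.

    A missing file yields default_value; any other problem (unreadable
    file, non-numeric content) is left to propagate as an exception,
    so the caller can treat it as a fatal error.
    """
    path = os.path.join(cpu_dir, name)
    try:
        with open(path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return default_value


def parse_socket(cpu_dir):
    """Assume socket 0 when topology/physical_package_id does not exist."""
    return read_cpu_value(cpu_dir, "topology/physical_package_id", 0)
```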
For disk snapshots, the user could request an external snapshot
but not supply a filename; later on, we would check this condition
and generate a suitable name if possible, or gracefully error out
when not possible (such as when the original file was a block
device). But unless we come up with a suitable way to generate
external memory file names, we have no later code point that was
checking for NULL, so we should forbid this up front.
* src/conf/snapshot_conf.c (virDomainSnapshotDefParseString):
Avoid NULL deref, since we don't generate names yet.
It may take some time for sanlock to add a lockspace. If the user
restarts the libvirtd service in the meantime, the fresh daemon can fail
to add the same lockspace with EINPROGRESS. Recent sanlock has a
sanlock_inq_lockspace() function which should block until the lockspace
changes state. If we are building against an older sanlock we should
retry a few times before claiming an error. This issue can be easily
reproduced:
for i in {1..1000} ; do echo $i; service libvirtd restart; sleep 2; done
20
Stopping libvirtd daemon: [FAILED]
Starting libvirtd daemon: [ OK ]
21
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]
22
Stopping libvirtd daemon: [ OK ]
Starting libvirtd daemon: [ OK ]
error : virLockManagerSanlockSetupLockspace:334 : Unable to add
lockspace /var/lib/libvirt/sanlock/__LIBVIRT__DISKS__: Operation now in
progress
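The retry fallback for older sanlock can be sketched like this. It is a generic retry loop under stated assumptions: add_lockspace is a hypothetical callable standing in for the real sanlock call, and the retry count and delay are arbitrary values, not the ones used by the patch.

```python
import errno
import time


def add_lockspace_with_retry(add_lockspace, retries=10, delay=0.1,
                             sleep=time.sleep):
    """Call add_lockspace() until it stops failing with EINPROGRESS.

    Any other error, or EINPROGRESS on the last attempt, is re-raised.
    """
    for attempt in range(retries):
        try:
            return add_lockspace()
        except OSError as exc:
            if exc.errno != errno.EINPROGRESS or attempt == retries - 1:
                raise
            sleep(delay)
```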
Since /sys/devices/system/cpu/present is not available on
older kernels, such as the one in RHEL 5.x, nodeGetCPUCount will
fail there. The fallback implemented is to scan for
/sys/devices/system/cpu/cpuNN entries.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
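A minimal sketch of such a directory scan, assuming that only entries named "cpu" followed by digits count as CPUs. The function name count_cpu_dirs is hypothetical and the directory is a parameter so the sketch does not depend on a real sysfs.

```python
import os
import re

# Matches cpu0, cpu1, cpu10, ... but not cpufreq or cpuidle.
_CPU_ENTRY = re.compile(r"^cpu[0-9]+$")


def count_cpu_dirs(sysfs_cpu_dir):
    """Count the cpuNN entries found directly under sysfs_cpu_dir."""
    return sum(1 for name in os.listdir(sysfs_cpu_dir)
               if _CPU_ENTRY.match(name))
```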
This simplifies the top-level code, at the cost of using a little more
stack space. The primary benefit is being able to send more fields
without knowing in advance how many of them, and of which types, these
fields will be, and without having to individually add buffer variables.
The code imposes an upper limit on the total number of iovs/buffers
used, and fields that wouldn't fit are silently dropped. This is not
significant in this patch, but will affect the following one.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... and update all users. No change in functionality, the parameter
will be used later.
The metadata representation is as minimal as possible, but requires
the caller to allocate an array on stack explicitly.
The alternative of using varargs in the virLogMessage() callers:
* Would not allow the caller to optionally omit some metadata elements,
except by having two calls to virLogMessage.
* Would not be as type-safe (e.g. using int vs. size_t), and the compiler
wouldn't be able to do type checking
* Depending on parameter order:
a) virLogMessage(..., message format, message params...,
metadata..., NULL)
can not be portably implemented (parse_printf_format() is a glibc
function)
b) virLogMessage(..., metadata..., NULL,
message format, message params...)
would prevent usage of ATTRIBUTE_FMT_PRINTF and the associated
compiler checking.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The "restart" function for locks allocates a new array and
pre-sets its length, then reads the owner pids from a JSON
document in a loop. Rather than storing each owner at a different
index, though, it repeatedly overwrites the last element of the array
with each owner in turn.
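The effect of the bug can be reproduced in a few lines. This is a model of the indexing mistake only, with a plain list standing in for the JSON owner array; both function names are made up for the example.

```python
def read_owners_buggy(json_owners):
    """Pre-size the array, then always write to the last slot (the bug)."""
    owners = [0] * len(json_owners)
    for pid in json_owners:
        owners[len(owners) - 1] = pid
    return owners


def read_owners_fixed(json_owners):
    """Pre-size the array and store each owner at its own index."""
    owners = [0] * len(json_owners)
    for i, pid in enumerate(json_owners):
        owners[i] = pid
    return owners
```

For owners 10, 20 and 30 the buggy variant leaves the first two slots at their initial value and keeps only the last pid.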
This patch adds a helper to determine if snapshots are external and uses
the helper to fix detection of those in snapshot deletion code.
Snapshots are external if they have an external memory image or if the
disk locations are external. As mixed snapshots are forbidden for now
we need to check just one disk to know.
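The rule above could be expressed roughly as follows. SnapshotDef is a hypothetical, heavily simplified stand-in for the snapshot definition, with locations modelled as the strings 'no', 'internal' and 'external'; the sketch relies on the stated assumption that mixed snapshots are forbidden.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SnapshotDef:
    memory: str = "no"                              # 'no', 'internal' or 'external'
    disks: List[str] = field(default_factory=list)  # one location per disk


def snapshot_is_external(snap: SnapshotDef) -> bool:
    """External memory image, or external disk location, means external.

    Only the first disk is inspected because mixed snapshots are
    assumed to be rejected earlier.
    """
    if snap.memory == "external":
        return True
    return bool(snap.disks) and snap.disks[0] == "external"
```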
Lately there were a few reports of the output of the virsh nodeinfo
command being inaccurate. This patch tries to avoid that by checking if
the topology actually makes sense. If it doesn't we then report a
synthetic topology that indicates to the user that the host capabilities
should be checked for the actual topology.
Currently, if the user calls virDomainAbortJob we just issue
'migrate_cancel' and hope for the best. However, if the user calls
the API in the wrong phase, when migration hasn't reached the
perform phase yet, the cancel request is just ignored. With this
patch, the request is remembered and as soon as the perform phase
starts, migration is cancelled.
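The "remember the request" idea can be modelled with a small state object. This is a toy model, not the qemu driver code: the class, its attribute names and the phase strings are all hypothetical, and setting cancelled stands in for actually issuing 'migrate_cancel'.

```python
class MigrationJob:
    """Toy model of a migration job that remembers early abort requests."""

    def __init__(self):
        self.phase = "begin"
        self.abort_requested = False
        self.cancelled = False

    def abort(self):
        if self.phase == "perform":
            # Migration is running: cancel it right away.
            self.cancelled = True
        else:
            # Too early to cancel: remember the request instead of
            # silently dropping it.
            self.abort_requested = True

    def enter_perform(self):
        self.phase = "perform"
        if self.abort_requested:
            self.cancelled = True
```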
For S390, the default console target type cannot be of type 'serial'.
It is necessary to at least interpret the 'arch' attribute
value of the os/type element to produce the correct default type.
Therefore we need to extend the signature of defaultConsoleTargetType
to account for architecture. As a consequence all the drivers
supporting this capability function must be updated.
Despite the amount of changed files, the only change in behavior is
that for S390 the default console target type will be 'virtio'.
N.B.: A more future-proof approach could be to use hypervisor
specific capabilities to determine the best possible console type.
For instance one could add an opaque private data pointer to the
virCaps structure (in case of QEMU to hold capsCache) which could
then be passed to the defaultConsoleTargetType callback to determine
the console target type.
Seems to be however a bit overengineered for the use case...
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
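The extended callback can be pictured as a function of both the OS type and the architecture. The sketch below is an assumption-laden illustration: the arch strings "s390" and "s390x" and the 'serial' default for every other architecture are guesses made for the example, not taken from the drivers.

```python
def default_console_target_type(os_type, arch):
    """Pick a default console target type, taking the arch into account.

    Assumption: only S390 variants differ from the 'serial' default.
    os_type is accepted to mirror the callback's shape but is unused here.
    """
    if arch in ("s390", "s390x"):
        return "virtio"
    return "serial"
```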
When the libvirt daemon is restarted it tries to reconnect to running
qemu domains. Since commit d38897a5d4 the
re-connection code runs in separate threads. In the original
implementation the maximum of domain IDs (that is used as an
initializer for numbering guests created next) was detected while
libvirt was reconnecting to the guest.
With the threaded implementation this opens a possibility for race
conditions with the thread that is autostarting guests. Suppose a
guest is running with id 1 and the daemon is restarted. If the autostart
code is reached first, it spawns the first guest that should be
autostarted as id 1. This results in the following unwanted situation:
# virsh list
Id Name State
----------------------------------------------------
1 guest1 running
1 guest2 running
This patch extracts the detection code before the re-connection threads
are started so that the maximum id of the guests being reconnected to is
known.
The only semantic change created by this is that if the guest with the
greatest ID quits before we are able to reconnect, its ID is used anyway
as the greatest one, whereas without this patch the greatest ID of a
process we could successfully reconnect to would be used.
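The ordering fix amounts to computing the starting ID from the full list of guests before any reconnect or autostart thread runs. A minimal sketch, assuming IDs are plain integers and that numbering starts at 1 when no guest is running; IdAllocator and max_reconnect_id are hypothetical names.

```python
def max_reconnect_id(domain_ids):
    """Greatest ID among the guests we are about to reconnect to (0 if none)."""
    return max(domain_ids, default=0)


class IdAllocator:
    """Hands out IDs above every guest known before the threads start."""

    def __init__(self, domain_ids):
        self.next_id = max_reconnect_id(domain_ids) + 1

    def allocate(self):
        new_id = self.next_id
        self.next_id += 1
        return new_id
```

With a guest already running as id 1, the first autostarted guest then receives id 2 regardless of which thread runs first.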
82507838 refactored the code to keep both the raw and canonicalized form
of the backingStore, which breaks badly when the storage pool contains a
storage volume, which is missing its backing store file:
# ./daemon/libvirtd -l
2012-11-07 12:43:33.279+0000: 22175: info : libvirt version: 1.0.0
2012-11-07 12:43:33.279+0000: 22175: error : absolutePathFromBaseFile:542 : Can't canonicalize path '/var/lib/libvirt/images/base.qcow2': No such file or directory
2012-11-07 12:43:33.280+0000: 22175: error : storageDriverAutostart:115 : Failed to autostart storage pool 'default': Can't canonicalize path '/var/lib/libvirt/images/base.qcow2': No such file or directory
This is because virStorageFileGetMetadataFromBuf() aborts with -1 if the
filename of the backingStore can not be canonicalized:
#0 absolutePathFromBaseFile () at util/storage_file.c:541
#1 virStorageFileGetMetadataFromBuf () at util/storage_file.c:728
#2 virStorageFileGetMetadataFromFD () at util/storage_file.c:932
#3 virStorageBackendProbeTarget () at storage/storage_backend_fs.c:94
#4 virStorageBackendFileSystemRefresh () at storage/storage_backend_fs.c:849
#5 storagePoolStart () at storage/storage_driver.c:700
#6 virStoragePoolCreate () at libvirt.c:12471
...
Treat files which miss their backing file as standalone files.
Signed-off-by: Philipp Hahn <hahn@univention.de>
This patch adds support for external disk snapshots of inactive domains.
The snapshot is created by calling qemu-img:
qemu-img create -f format_of_snapshot -o
backing_file=/path/to/src,backing_fmt=format_of_backing_image
/path/to/snapshot
in case the backing image format is known or probing is allowed and
otherwise:
qemu-img create -f format_of_snapshot -o backing_file=/path/to/src
/path/to/snapshot
on each of the disks selected for snapshotting. This patch also modifies
the snapshot preparing function to support creating external snapshots
and to sanitize arguments. For now the user isn't able to mix external
and internal snapshots but this restriction might be lifted in the
future.
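The two command forms can be produced by one small helper. This sketch only assembles the argument list shown above and does not run qemu-img; the function name and parameter names are made up for the example.

```python
def qemu_img_create_args(snapshot_path, snapshot_format,
                         backing_path, backing_format=None):
    """Build the qemu-img argv for an external snapshot of one disk.

    backing_fmt is added only when the backing image format is known.
    """
    opts = f"backing_file={backing_path}"
    if backing_format is not None:
        opts += f",backing_fmt={backing_format}"
    return ["qemu-img", "create", "-f", snapshot_format,
            "-o", opts, snapshot_path]
```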
Some operations and APIs need the domain to be paused before the
operation can be performed, e.g. (managed-) save of a domain. The
processors should be resumed in the end. However, if 'cont' fails for
some reason, we log a message but this is not sufficient as an event
should be emitted as well. The management application can then decide
what to do.
The code that was split out into the qemuDomainSaveMemory expands the
pointer containing the XML description of the domain that it gets from
higher layers. If the pointer changes the old one is invalid and the
upper layer function tries to free it causing an abort.
This patch changes the expansion of the original string to a new
allocation and copy of the contents.
The connection to ESX 5.1 has been broken since g1e7cd39. The fix
in bab7752c helped a bit but still missed a spot: the connection
is now successful, but some APIs (for example defineXML) don't work.
This patch adds the two missing cases to avoid that.
qemu is sensitive to the order of arguments passed. Hence, if a
device requires a controller, the controller cmd string must
precede the device cmd string. The same applies to controllers, for
instance when a ccid controller requires a usb controller. So
controllers create a partial ordering in which they should be added
to the qemu cmd line.
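One generic way to respect such a partial ordering is a topological sort. The sketch below uses Python's graphlib for that and is not meant to describe how the patch orders controllers; the dependency map is a hypothetical input.

```python
from graphlib import TopologicalSorter


def order_controllers(requires):
    """Return controllers so that each one follows the ones it requires.

    requires maps a controller name to the set of controllers it
    depends on, e.g. {"ccid": {"usb"}}.
    """
    return list(TopologicalSorter(requires).static_order())
```

For a map in which ccid requires usb, usb is guaranteed to come out before ccid; the relative order of unrelated controllers is not specified.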
Some FDs may not implement fdatasync() functionality,
e.g. pipes. In that case EINVAL or EROFS is returned.
We don't want to fail then nor report any error.
Reported-by: Christophe Fergeau <cfergeau@redhat.com>
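The tolerant sync can be sketched as below. It assumes a platform where os.fdatasync is available (Linux, for instance); the function name sync_if_supported and its boolean return value are inventions of this example.

```python
import errno
import os


def sync_if_supported(fd):
    """fdatasync(fd), ignoring descriptors that cannot be synced.

    Returns True if the sync happened, False if the descriptor does not
    support it (EINVAL or EROFS). Any other error is re-raised.
    """
    try:
        os.fdatasync(fd)
    except OSError as exc:
        if exc.errno in (errno.EINVAL, errno.EROFS):
            return False
        raise
    return True
```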
Some of the pre-snapshot checks have restrictions wired in regarding
configuration options that influence taking of external checkpoints.
This patch removes restrictions that would inhibit taking of such a
snapshot.
This patch adds support to take external system checkpoints.
The functionality is layered on top of the previous disk-only snapshot
code. When the checkpoint is requested the domain memory is saved to the
memory image file using migration to file. (The user may specify to
take the memory image while the guest is live with the
VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag.)
The memory save image shares format with the image created by
virDomainSave() API.
Before now, libvirt supported only internal snapshots for active guests.
This patch renames this function to qemuDomainSnapshotCreateActiveInternal
to prepare the grounds for external active snapshots.
The new external system checkpoints will require an async job while the
snapshot is taken. This patch adds QEMU_ASYNC_JOB_SNAPSHOT to track this
job type.
The default behavior while creating external checkpoints is to pause the
guest while the memory state is captured. Some users may prefer to
sacrifice the space saving and create the memory save image while the
guest is live, to minimize downtime.
This patch adds a flag that causes the guest not to be paused before
taking the snapshot.
*include/libvirt/libvirt.h.in:
- add new paused reason: VIR_DOMAIN_PAUSED_SNAPSHOT
- add new flag for taking snapshot: VIR_DOMAIN_SNAPSHOT_CREATE_LIVE
*tools/virsh-domain-monitor.c:
- add string representation for VIR_DOMAIN_PAUSED_SNAPSHOT
*tools/virsh-snapshot.c:
- add support for VIR_DOMAIN_SNAPSHOT_CREATE_LIVE
*tools/virsh.pod:
- add docs for --live option added to use
VIR_DOMAIN_SNAPSHOT_CREATE_LIVE flag
The code that saves domain memory by migration to file can be reused
while doing external checkpoints of a machine. This patch extracts the
common code and places it in a separate function.
When pausing the guest while migration is running (to speed up
convergence) the virDomainSuspend API checks if the migration job is
active before entering the job. This could cause a possible race if the
virDomainSuspend is called while the job is active but ends before the
Suspend API enters the job (this would require that the migration is
aborted). This would cause an incorrect event to be emitted.
Both system checkpoint snapshots and disk snapshots were iterating
over all disks, doing a final sanity check before doing any work.
But since future patches will allow offline snapshots to be either
external or internal, it makes sense to share the pass over all
disks, and then relax restrictions in that pass as new modes are
implemented. Future patches can then handle external disks when
the domain is offline, then handle offline --disk-snapshot, and
finally, combine with migration to file to gain a complete external
system checkpoint snapshot of an active domain without using 'savevm'.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotIsAllowed): Merge...
(qemuDomainSnapshotPrepare): ...into one function.
(qemuDomainSnapshotCreateXML): Update caller.
Now that the XML supports listing internal snapshots, it is worth
always populating the <memory> and <disks> element to match.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): Always
parse disk info and set memory info.
There were no previous callers with require_match set to true.
I originally implemented this bool with the intent of supporting
ESX snapshot semantics, where the choice of internal vs. external
vs. non-checkpointable must be made at domain start, but as ESX
has not been wired up to use it yet, we might as well fix it to
work with our next qemu patch for now, and worry about any further
improvements (changing the bool to a flags argument) if the ESX
driver decides to use this function in the future.
* src/conf/snapshot_conf.c (virDomainSnapshotAlignDisks): Alter
logic when require_match is true to deal with new XML.
Each <domainsnapshot> can now contain an optional <memory>
element that describes how the VM state was handled, similar
to disk snapshots. The new element will always appear in
output; for back-compat, an input that lacks the element will
assume 'no' or 'internal' according to the domain state.
Along with this change, it is now possible to pass <disks> in
the XML for an offline snapshot; this also needs to be wired up
in a future patch, to make it possible to choose internal vs.
external on a per-disk basis for each disk in an offline domain.
At that point, using the --disk-only flag for an offline domain
will be able to work.
For some examples below, remember that qemu supports the
following snapshot actions:
qemu-img: offline external and internal disk
savevm: online internal VM and disk
migrate: online external VM
transaction: online external disk
=====
<domainsnapshot>
<memory snapshot='no'/>
...
</domainsnapshot>
implies that there is no VM state saved (mandatory for
offline and disk-only snapshots, not possible otherwise);
using qemu-img for offline domains and transaction for online.
=====
<domainsnapshot>
<memory snapshot='internal'/>
...
</domainsnapshot>
state is saved inside one of the disks (as in qemu's 'savevm'
system checkpoint implementation). If needed in the future,
we can also add an attribute pointing out _which_ disk saved
the internal state; maybe disk='vda'.
=====
<domainsnapshot>
<memory snapshot='external' file='/path/to/state'/>
...
</domainsnapshot>
This is not wired up yet, but future patches will allow this to
control a combination of 'virsh save /path/to/state' plus disk
snapshots from the same point in time.
=====
So for 1.0.1 (and later, as needed), I plan to implement this table
of combinations, with '*' designating new code and '+' designating
existing code reached through new combinations of xml and/or the
existing DISK_ONLY flag:
domain  memory   disk    disk-only | result
-----------------------------------------
offline omit     omit    any       | memory=no disk=int, via qemu-img
offline no       omit    any       |+memory=no disk=int, via qemu-img
offline omit/no  no      any       | invalid combination (nothing to snapshot)
offline omit/no  int     any       |+memory=no disk=int, via qemu-img
offline omit/no  ext     any       |*memory=no disk=ext, via qemu-img
offline int/ext  any     any       | invalid combination (no memory to save)
online  omit     omit    off       | memory=int disk=int, via savevm
online  omit     omit    on        | memory=no disk=default, via transaction
online  omit     no/ext  off       | unsupported for now
online  omit     no      on        | invalid combination (nothing to snapshot)
online  omit     ext     on        | memory=no disk=ext, via transaction
online  omit     int     off       |+memory=int disk=int, via savevm
online  omit     int     on        | unsupported for now
online  no       omit    any       |+memory=no disk=default, via transaction
online  no       no      any       | invalid combination (nothing to snapshot)
online  no       int     any       | unsupported for now
online  no       ext     any       |+memory=no disk=ext, via transaction
online  int/ext  any     on        | invalid combination (disk-only vs. memory)
online  int      omit    off       |+memory=int disk=int, via savevm
online  int      no/ext  off       | unsupported for now
online  int      int     off       |+memory=int disk=int, via savevm
online  ext      omit    off       |*memory=ext disk=default, via migrate+trans
online  ext      no      off       |+memory=ext disk=no, via migrate
online  ext      int     off       | unsupported for now
online  ext      ext     off       |*memory=ext disk=ext, via migrate+transaction
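The rows where <memory> is omitted can be condensed into one rule for the default. The sketch below is only my reading of those rows, with hypothetical names; it says nothing about whether the resulting combination with the disk settings is supported.

```python
def default_memory_snapshot(domain_online, disk_only):
    """Default for an omitted <memory> element, per the table rows above.

    Offline domains and disk-only snapshots have no VM state to save;
    an online domain without the disk-only flag defaults to 'internal'.
    """
    if not domain_online or disk_only:
        return "no"
    return "internal"
```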
* docs/schemas/domainsnapshot.rng (memory): New RNG element.
* docs/formatsnapshot.html.in: Document it.
* src/conf/snapshot_conf.h (virDomainSnapshotDef): New fields.
* src/conf/domain_conf.c (virDomainSnapshotDefFree)
(virDomainSnapshotDefParseString, virDomainSnapshotDefFormat):
Manage new fields.
* tests/domainsnapshotxml2xmltest.c: New test.
* tests/domainsnapshotxml2xmlin/*.xml: Update existing tests.
* tests/domainsnapshotxml2xmlout/*.xml: Likewise.
The libvirt coding standard is to use 'function(...args...)'
instead of 'function (...args...)'. A non-trivial number of
places did not follow this rule and are fixed in this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When assigning the new persistent definition for a transient network
(thus making it persistent) the network needs to be marked persistent
before actually attempting to assign the definition.
Until now, the network undefine API was able to undefine only inactive
networks. The restriction doesn't make sense any more so this patch
implements changing networks to transient.
When a transient network was created some of the checks weren't run on
the definition, allowing invalid networks to be started.
This patch splits out code to the network validation function and
re-uses that code when creating transient networks.
The network driver didn't care about config files when a network was
destroyed, just when it was undefined, leaving behind files for transient
networks.
This patch splits out the cleanup code to a helper function that handles
the cleanup if the inactive network object is being removed and re-uses
this code when getting rid of inactive networks.
The hosts file was created in the network definition function. This
patch moves the place the file is being created to the point where
dnsmasq is being started.
With our fix of mkostemp (pushed as 2b435c15) we define a macro
to compile with uclibc. However, this definition is conditional
and thus needs to be properly indented. Moreover, with this definition
sc_prohibit_mkstemp syntax-check rule keeps yelling:
src/util/logging.c:63:# define mkostemp(x,y) mkstemp(x)
maint.mk: use mkostemp with O_CLOEXEC instead of mkstemp
Therefore we should ignore this file for this rule.
BZ:https://bugzilla.redhat.com/show_bug.cgi?id=871273
When using virsh qemu-attach to attach an existing qemu process,
if the -M option is missing from the qemu command line, libvirtd crashes
because the NULL value of def->os.machine is used later.
Example:
/usr/libexec/qemu-kvm -name foo \
-cdrom /var/lib/libvirt/images/boot.img \
-monitor unix:/tmp/demo,server,nowait \
error: End of file while reading data: Input/output error
error: Failed to reconnect to the hypervisor
This patch tries to set default machine type if the value of
def->os.machine is still NULL after qemu command line parsing.
* configure.ac docs/news.html.in libvirt.spec.in: update for the new release
* po/*.po*: update from transifex, a lot of added support e.g. Indian
languages, and regenerate
It turns out that calling virNodeGetCPUMap(conn, NULL, NULL, 0)
is both useful, and with Viktor's patches, common enough to
optimize. Since this interface hasn't been released yet, we
can change the RPC call.
A bit more background on the optimization - learning the cpu count
is a single file read (/sys/devices/system/cpu/possible), but
learning the number of online cpus can possibly trigger a file
read per cpu, depending on the age of the kernel, and all wasted
if the caller passed NULL for both arguments.
* src/nodeinfo.c (nodeGetCPUMap): Avoid bitmap when not needed.
* src/remote/remote_protocol.x (remote_node_get_cpu_map_args):
Supply two separate flags for needed arguments.
* src/remote/remote_driver.c (remoteNodeGetCPUMap): Update
caller.
* daemon/remote.c (remoteDispatchNodeGetCPUMap): Likewise.
* src/remote_protocol-structs: Regenerate.
Per the code comment in qemuCapsInitQMPBasic() and commit 43e23c7, we
should only use QMP for capabilities probing starting with 1.2 and
newer. The old code had dead logic that probed on 1.0 and newer.
Signed-off-by: Eric Blake <eblake@redhat.com>
This needs to be done before the container starts. Turning
off the mknod capability is noticed by systemd, which will
no longer attempt to create device nodes.
This eliminates SELinux AVC messages and ugly failure messages in the journal.
The string comparison logic was inverted and matched the first drive
that does *not* have the name we search for.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The QEMU -drive id= begins with libvirt's QEMU host drive prefix
("drive-"), which is stripped off in several places to convert between
host ("-drive") and guest ("-device") device names.
In the case of BlkIoTune it is unnecessary to strip the QEMU host drive
prefix because we operate on "info block"/"query-block" output that uses
host drive names.
Stripping the prefix incorrectly caused string comparisons to fail since
we were comparing the guest device name against the host device name.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently, when we are doing a (managed) save, we insert the
iohelper between qemu and the OS. The pipe is created, the
writing end is passed to qemu and the reading end to the
iohelper. It reads the data and writes it into the given file. However,
with write() being asynchronous, data may still be in OS
caches and hence in some (corner) cases, all migration data
may have been read and written (not physically though). So
qemu will report success, as will the iohelper. However, with
some non-local filesystems, where ENOSPC is polled every X
time units, we may get into a situation where all operations
succeeded but the data hasn't reached the disk, and in fact never
will. Therefore we ought to sync caches to make sure the data
has reached the block device on the remote host.
QEMU uses 'i386' for its 32-bit x86 architecture, but libvirt
wants that to be 'i686', so we must fix it up.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virPidFileReadPathIfAlive passed in an 'int *' where a 'pid_t *'
was expected, which breaks on Mingw64 targets. Also a few places
were using '%d' for formatting pid_t, change them to '%lld' and
force a cast to the longer type as done elsewhere in the same
file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=871756
Commit cd1e8d1 assumed that systems new enough to have journald
also have mkostemp; but this is not true for uclibc.
For that matter, use of mkstemp[s] is unsafe in a multi-threaded
program. We should prefer mkostemp[s] in the first place.
* bootstrap.conf (gnulib_modules): Add mkostemp, mkostemps; drop
mkstemp and mkstemps.
* cfg.mk (sc_prohibit_mkstemp): New syntax check.
* tools/virsh.c (vshEditWriteToTempFile): Adjust caller.
* src/qemu/qemu_driver.c (qemuDomainScreenshot)
(qemudDomainMemoryPeek): Likewise.
* src/secret/secret_driver.c (replaceFile): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainScreenshot): Likewise.
https://bugzilla.redhat.com/show_bug.cgi?id=871312
Recent fixes took almost all the right steps to make the emulator pinned
to the cpuset of the whole domain in case <emulatorpin> isn't
specified, but qemudDomainGetEmulatorPinInfo still reports all the
CPUs even when cpuset is specified. This patch fixes that.
There are multiple reasons canonicalize_file_name() used in
absolutePathFromBaseFile helper can fail. This patch enhances error
reporting from that helper.
When there is no 'qemu-kvm' binary and the emulator used for a machine
is, for example, 'qemu-system-x86_64' that, by default, runs without
kvm enabled, libvirt still supplies '-no-kvm' option to this process,
even though it does not recognize such option (making the start of a
domain fail in that case).
This patch fixes building a command-line for QEMU machines without KVM
acceleration and is based on the following assumptions:
- QEMU_CAPS_KVM flag means that QEMU is running KVM accelerated
machines by default (without explicitly requesting that using a
command-line option). It is the closest to the truth according to
the code with the only exception being the comment next to the
flag, so it's fixed in this patch as well.
- QEMU_CAPS_ENABLE_KVM flag means that QEMU is, by default, running
without KVM acceleration and, in case we need KVM acceleration, it
needs to be explicitly instructed to do so. This has been only
partially true in the past (strictly, the flag means that QEMU
recognizes the '-enable-kvm' option, although in practice the two
are almost the same).
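One way to read the two assumptions above as command-line logic is
sketched below. This is not the code from the patch: the flag
constants and the kvm_option function are hypothetical stand-ins for
QEMU_CAPS_KVM and QEMU_CAPS_ENABLE_KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two capability flags; these are
 * not libvirt's real definitions. */
enum {
    CAPS_KVM        = 1 << 0, /* binary runs KVM-accelerated by default */
    CAPS_ENABLE_KVM = 1 << 1, /* binary recognizes '-enable-kvm' */
};

/* Return the option to append to the QEMU command line, or NULL
 * when nothing needs to be added. */
static const char *kvm_option(unsigned int caps, bool want_kvm)
{
    assert((caps & ~(unsigned int)(CAPS_KVM | CAPS_ENABLE_KVM)) == 0);

    if (want_kvm)
        return (caps & CAPS_ENABLE_KVM) ? "-enable-kvm" : NULL;

    /* Only turn KVM off when the binary would otherwise enable it. */
    return (caps & CAPS_KVM) ? "-no-kvm" : NULL;
}
```

Under this reading, a binary that neither defaults to KVM nor was
asked for it gets no KVM-related option at all, which avoids the
unrecognized '-no-kvm' described above.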
Three FORWARD chain rules and two INPUT chain rules are added
when a network is started, but only the FORWARD chain rules
are removed when the network is destroyed.
I noticed this while answering a list question about Java bindings
of volume creation. All other functions that take xml logged xmlDesc.
* src/libvirt.c (virStorageVolCreateXML)
(virStorageVolCreateXMLFrom): Use consistent spelling of xmlDesc,
and log the argument.
This patch resolves: https://bugzilla.redhat.com/show_bug.cgi?id=871201
If libvirt is restarted after updating the dnsmasq or radvd packages,
a subsequent "virsh net-destroy" will fail to kill the dnsmasq/radvd
process.
The problem is that when libvirtd restarts, it re-reads the dnsmasq
and radvd pidfiles, then does a sanity check on each pid it finds,
including checking that the symbolic link in /proc/$pid/exe actually
points to the same file as the path used by libvirt to execute the
binary in the first place. If this fails, libvirt assumes that the
process is no longer alive.
But if the original binary has been replaced, the link in /proc is set
to "$binarypath (deleted)" (it literally has the string " (deleted)"
appended to the link text stored in the filesystem), so even if a new
binary exists in the same location, attempts to resolve the link will
fail.
In the end, not only is the old dnsmasq/radvd not terminated when the
network is stopped, but a new dnsmasq can't be started when the
network is later restarted (because the original process is still
listening on the ports that the new process wants).
The solution is, when the initial "use stat to check for identical
inodes" check for identity between /proc/$pid/exe and $binpath fails,
to check /proc/$pid/exe for a link ending with " (deleted)" and if so,
truncate that part of the link and compare what's left with the
original binarypath.
A twist to this problem is that on systems with "merged" /sbin and
/usr/sbin (i.e. /sbin is really just a symlink to /usr/sbin; Fedora
17+ is an example of this), libvirt may have started the process using
one path, but /proc/$pid/exe lists a different path (indeed, on F17
this is the case - libvirtd uses /sbin/dnsmasq, but /proc/$pid/exe
shows "/usr/sbin/dnsmasq"). The further bit of code to resolve this is
to call virFileResolveAllLinks() on both the original binarypath and
on the truncated link we read from /proc/$pid/exe, and compare the
results.
The resulting code still succeeds in all the same cases it did before,
but also succeeds if the binary was deleted or replaced after it was
started.
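The " (deleted)" handling described above can be sketched as follows.
The helper names are hypothetical and this is not the code from the
patch; in particular it omits the initial stat-based inode comparison
and the virFileResolveAllLinks() step, and only shows the string
truncation and comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DELETED_SUFFIX " (deleted)"

/* Hypothetical helper: if 'link' (the text read from /proc/$pid/exe)
 * ends with " (deleted)", cut that suffix off in place and return
 * true; otherwise leave 'link' untouched and return false. */
static bool strip_deleted_suffix(char *link)
{
    assert(link != NULL);
    size_t len = strlen(link);
    size_t suffix_len = strlen(DELETED_SUFFIX);

    if (len <= suffix_len ||
        strcmp(link + len - suffix_len, DELETED_SUFFIX) != 0)
        return false;

    link[len - suffix_len] = '\0';
    return true;
}

/* Hypothetical helper: compare the (possibly truncated) link text
 * with the path originally used to start the binary. */
static bool exe_link_matches(char *link, const char *binpath)
{
    assert(binpath != NULL);
    strip_deleted_suffix(link);
    return strcmp(link, binpath) == 0;
}
```

For the merged /sbin and /usr/sbin case, both strings would
additionally need their symlinks resolved before the comparison.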
After separating 5.x and 5.1 versions of ESX, we forgot to add 5.1
to the list of allowed connections, so connections to 5.1 have failed
since v1.0.0-rc1-5-g1e7cd39.
Ever since commit eefb881, ATTRIBUTE_NONNULL has normally been a
no-op under gcc (since it tends to cause more bugs than it cures
given gcc's current lame implementation of the attribute). However,
the macro is still useful to Coverity and other static-analysis
tools, but only if we use it correctly. Coverity follows gcc's lead
in accepting function declarations with attributes at the end, but
function bodies must attach attributes to the return type. That is,
these are valid:
void foo(void *arg) ATTRIBUTE_NONNULL(1);
void ATTRIBUTE_NONNULL(1) foo(void *arg);
void ATTRIBUTE_NONNULL(1) foo(void *arg) {}
but this is not:
void foo(void *arg) ATTRIBUTE_NONNULL(1) {}
even though you don't get a compile failure until you do static
analysis. Bug introduced in commit 80533ca, with these symptoms:
nodeinfo.c:206: error: expected ',' or ';' before '{' token
cc1: warning: unrecognized command line option "-Wno-suggest-attribute=const"
cc1: warning: unrecognized command line option "-Wno-suggest-attribute=pure"
make[3]: *** [libvirt_driver_la-nodeinfo.lo] Error 1
* src/nodeinfo.c (virNodeParseNode): Fix syntax error when
non-null attribute is in use.
Commit 34e8f63a3 altered virfile.o to drag in additional symbols,
which in turn led to pulling in other .o files and eventually causing
a link failure when systemtap probes are enabled, such as:
./.libs/libvirt_util.a(libvirt_util_la-event_poll.o): In function `virEventPollRunOnce':
/home/dummy/libvirt/src/util/event_poll.c:614: undefined reference to `libvirt_event_poll_run_semaphore'
./.libs/libvirt_util.a(libvirt_util_la-event_poll.o):(.note.stapsdt+0x24): undefined reference to `libvirt_event_poll_add_handle_semaphore'
Even though libvirt_iohelper and libvirt_parthelper don't directly
use the portion of virfile.o that drags in probing, it was easier
to satisfy the linker and get the build back up, than to figure out
whether it is even possible or worth trying to disentangle the mess.
* src/Makefile.am (libvirt_iohelper_LDADD)
(libvirt_parthelper_LDADD): Use libvirt_probes.lo when needed.
Currently, we use iohelper when saving/restoring a domain.
However, if there is some kind of error (such as an I/O error),
it is not propagated to libvirt. Since it is not qemu that
performs the actual write(), qemu does not see the error; the
iohelper does. Therefore we should check for iohelper errors,
which makes libvirt more user friendly.
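As a generic illustration of how a parent process can notice that a
helper failed, the sketch below interprets a wait status. It is an
assumption about one possible approach, not a description of what the
patch does; helper_exit_code is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper (not libvirt code): turn a wait status, as
 * filled in by waitpid() or returned by system(), into the child's
 * exit code. Returns -1 if the child did not exit normally, for
 * example because it was killed by a signal. */
static int helper_exit_code(int status)
{
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A caller would treat any non-zero result as a failed save or restore
and report it, rather than assuming success because qemu itself saw
no error.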
In the XML warning, we print a virsh command line that can be used to
edit that XML. This patch prints UUIDs if the entity name contains
special characters (like shell metacharacters, or "--" that would break
parsing of the XML comment). If the entity doesn't have a UUID, just
print the virsh command that can be used to edit it.
When using block copy to pivot over to a new chain, the backing files
for the new chain might still need labeling (particularly if the user
passes --reuse-ext with a relative backing file name). Relabeling a
file that is already labeled won't hurt, so this just labels the entire
chain at the point of the pivot. Doing the relabel of the chain uses
the fact that we already safely probed the file type of an external
file at the start of the block copy.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot): Relabel chain before
asking qemu to pivot.
Use the recent addition of qemuDomainPrepareDiskChainElement to
obtain a locking manager lease, permit a block device through cgroups,
and set the SELinux label; then audit the fact that we hand a new
file over to qemu. Alas, releasing the lease and label at the end
of the mirroring is a trickier prospect (we would have to trace the
backing chain of both source and destination, and be sure not to
revoke rights to any part of the chain that is shared), so for now,
virDomainBlockJobAbort still leaves things with additional access
granted (as block-pull and block-commit have the same problem of
not clamping access after completion, a future cleanup would cover
all three commands).
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Set up labeling.