To unify future additions that require information from "query-chardev"
rename qemuMonitorGetPtyPaths and friends to qemuMonitorGetChardevInfo
and move the allocation of the returned hash into the top level
function.
When creating a disk image snapshot the libvirt code would blindly copy
the parents label to the newly created image. This runs into problems
when you start a VM from an image hosted on NFS (or other storage system
that doesn't support selinux labels) and the snapshot destination is on
a storage system that does support selinux labels. Libvirt's code in
that case generates a different security label for the image hosted on
NFS. This label is valid only for NFS images and doesn't allow access in
case of a locally stored image.
To fix this issue libvirt needs to refrain from copying security
information in cases where the default domain seclabel is a better
choice.
This patch repurposes the now unused @force argument of
virStorageSourceInitChainElement to denote whether a copy of the
security labelling stuff should be attempted or not. This allows to
fine-control the copy operation for cases where we need to keep the
label of the old disk vs. the cases where we need to keep the label
unset to use the default domain imagelabel.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1151718
Ethernet interfaces in libvirt currently do not support bandwidth setting.
For example, following xml file for an interface will not apply these
settings to corresponding qdiscs.
<interface type="ethernet">
<mac address="02:36:1d:18:2a:e4"/>
<model type="virtio"/>
<script path=""/>
<target dev="tap361d182a-e4"/>
<bandwidth>
<inbound average="984" peak="1024" burst="64"/>
<outbound average="2000" peak="2048" burst="128"/>
</bandwidth>
</interface>
Signed-off-by: Anirban Chakraborty <abchak@juniper.net>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit 6e5c79a1 tried to fix deadlock between nwfilter{Define,Undefine}
and starting of guest, but this same deadlock exists for
updating/attaching network device to domain.
The deadlock was introduced by removing global QEMU driver lock because
nwfilter was counting on this lock and ensure that all driver locks are
locked inside of nwfilter{Define,Undefine}.
This patch extends usage of virNWFilterReadLockFilterUpdates to prevent
the deadlock for all possible paths in QEMU driver. LXC and UML drivers
still have global lock.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1143780
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
In one of my previous patches (3a3c3780b) I've tried to fix the
problem of nvram path disappearing on a domain that's been
started and shut down again. I fixed this by explicitly saving
domain's config file. However, I did a bit of clumsy without
realizing we have a transient domains for which we don't save the
config file. Hence, any domain using UEFI became persistent.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1160084
As of b6d4dad1 (1.2.5) libvirt keeps track if domain disks have been
frozen. However, this falls into that set of information which don't
survive domain restart. Therefore, we need to clear the flag upon some
state transitions. Moreover, once we clear the flag we must update the
status file too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is a reaction to Michal's fix [1] for non-NUMA systems that also
splits out conf/ out of util/ because libvirt_util shouldn't require
libvirt_conf if it is the other way around. This particular use case
worked, but we're trying to avoid it as mentioned [2], many times.
The only functions from virnuma.c that needed numatune_conf were
virDomainNumatuneNodesetIsAvailable() and virNumaSetupMemoryPolicy().
The first one should be in numatune_conf as it works with
virDomainNumatune, the second one just needs nodeset and mode, both of
which can be passed without the need of numatune_conf.
Apart from fixing that, this patch also fixes recently added
code (between commits d2460f85^..5c8515620) that doesn't support
non-contiguous nodesets. It uses new function
virNumaNodesetIsAvailable(), which doesn't need a stub as it doesn't use
any libnuma functions, to check if every specified nodeset is available.
[1] https://www.redhat.com/archives/libvir-list/2014-November/msg00118.html
[2] http://www.redhat.com/archives/libvir-list/2011-June/msg01040.html
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Particularly in qemuBuildNumaArgStr(), there was a need for the advice
due to memory backing, which needs to know the nodeset it will be pinned
to. With newer qemu this caused the following error when starting
domain:
error: internal error: Advice from numad is needed in case of
automatic numa placement
even when starting perfectly valid domain, e.g.:
...
<vcpu placement='auto'>4</vcpu>
<numatune>
<memory mode='strict' placement='auto'/>
</numatune>
<cpu>
<numa>
<cell id='0' cpus='0' memory='524288'/>
<cell id='1' cpus='1' memory='524288'/>
</numa>
</cpu>
...
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1138545
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
commit 3e1e16aa8d (Use a port from the
migration range for NBD as well) changed ndb port allocation from
remotePorts to migrationPorts, but did not change the port releasing
process, which makes an error when migrating several times (above 64):
error: internal error: Unable to find an unused port in range
'migration' (49152-49215)
https://bugzilla.redhat.com/show_bug.cgi?id=1159245
Signed-off-by: Weiwei Li <nuonuoli@tencent.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
If VM is configured with many devices(including passthrough devices)
and large memory, libvirtd will take seconds(in the worst case) to
wait for monitor. In this period the qemu process may run on any
PCPU though I intend to pin emulator to the specified PCPU in xml
configuration.
Actually qemu process takes high cpu usage during vm startup.
So this is not the strict CPU isolation in this case.
Signed-off-by: Zhou yimin <zhouyimin@huawei.com>
NIC_RX_FILTER_CHANGED is sent by qemu any time a NIC driver in the
guest modified the NIC's RX Filter (for example, if the MAC address of
the NIC is changed by the guest).
This patch doesn't do anything useful with that event; it just sets up
all the plumbing to get news of the event into a worker thread with
all proper locking/reference counting, and provide an easy place to
add in desired functionality.
See src/qemu/EVENTHANDLERS.txt for information/instructions on adding
a libvirt-internal handler for a qemu event (using
NIC_RX_FILTER_CHANGED as an example).
If we don't properly clean up all processes in the
machine-<vmname>.scope systemd won't remove the cgroup and subsequent vm
starts fail with
'CreateMachine: File exists'
Additional processes can e.g. be added via
echo $PID > /sys/fs/cgroup/systemd/machine.slice/machine-${VMNAME}.scope/tasks
but there are other cases like
http://bugs.debian.org/761521
Invoke TerminateMachine to be on the safe side since systemd tracks the
cgroup anyway. This is a noop if all processes have terminated already.
Commit fba6bc4 introduced the non-migratable invtsc feature,
breaking save/migration with host-model and host-passthrough.
On hosts with this feature present it was automatically included
in the CPU definition, regardless of QEMU support.
Commit de0aeaf stopped including it by default for host-model,
but failed to fix host-passthrough.
This commit ignores checking of CPU features with host-passthrough,
since we don't pass them to QEMU (only -cpu host is passed),
allowing domains using host-passthrough that were saved with
the broken version of libvirtd to be restored.
https://bugzilla.redhat.com/show_bug.cgi?id=1147584
On a domain startup, the variable store path is generated if needed.
The path is intended to be generated only once. However, the updated
domain definition is not saved into config dir rather than state XML
only. So later, whenever the domain is destroyed and the daemon is
restarted, the generated path is forgotten and the file may be left
behind on virDomainUndefine() call.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Request erroring out from the backing chain traveller and drop qemu's
internal backing chain integrity tester.
The backing chain traveller reports errors by itself with possibly more
detail than qemuDiskChainCheckBroken ever could.
We also need to make sure that we reconnect to existing qemu instances
even at the cost of losing the backing chain info (this really should be
stored in the XML rather than reloaded from disk, but that needs some
work).
Commit id '9a2f36ec' added a build conditional of CAP_SYS_RAWIO
in order to determine whether or not a disk definition using rawio
should be allowed on platforms without CAP_SYS_RAWIO. If one was
found, virReportError was used but the code didn't goto cleanup.
This patch adds the goto.
We are not detecting the presence of FIPS from QEMU, but from procfs and
that means it's not QEMU capability. It was decided that we will pass
this flag to QEMU even if it's not supported by old QEMU binaries.
This patch also reverts changes done by commit a21cfb0f to
qemucapabilitestest and implements a new test case in qemuxml2argvtest.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1135431
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
If there are no iothreads, then return from qemuProcessDetectIOThreadPIDs
without error; otherwise, the following occurs:
error: Failed to start domain $dom
error: An error occurred, but the cause is unknown
Modify qemuProcessStart() in order to allowing setting affinity to
specific CPU's for IOThreads. The process followed is similar to
that for the vCPU's.
This involves adding a function to fetch the IOThread id's via
qemuMonitorGetIOThreads() and adding them to iothreadpids[] list.
Then making sure all the cgroup data has been properly set up and
finally assigning affinity.
In qemuProcessInitPCIAddresses() if qemuMonitorGetAllPCIAddresses()
returns a negative (or zero) value, then no need to call the
qemuProcessDetectPCIAddresses().
Signed-off-by: John Ferlan <jferlan@redhat.com>
When using split UEFI image, it may come handy if libvirt manages per
domain _VARS file automatically. While the _CODE file is RO and can be
shared among multiple domains, you certainly don't want to do that on
the _VARS file. This latter one needs to be per domain. So at the
domain startup process, if it's determined that domain needs _VARS
file it's copied from this master _VARS file. The location of the
master file is configurable in qemu.conf.
Temporary, on per domain basis the location of master NVRAM file can
be overridden by this @template attribute I'm inventing to the
<nvram/> element. All it does is holding path to the master NVRAM file
from which local copy is created. If that's the case, the map in
qemu.conf is not consulted.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Total time of a migration and total downtime transfered from a source to
a destination host do not count with the transfer time to the
destination host and with the time elapsed before guest CPUs are
resumed. Thus, source libvirtd remembers when migration started and when
guest CPUs were paused. Both timestamps are transferred to destination
libvirtd which uses them to compute total migration time and total
downtime. Obviously, this requires the time to be synchronized between
the two hosts. The reported times are useless otherwise but they would
be equally useless if we didn't do this recomputation so don't lose
anything by doing it.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When QEMU fails during incoming migration after we successfully started
it (i.e., during Perform or Finish phase), we report a rather unhelpful
message
Unable to read from monitor: Connection reset by peer
We already have a code that takes error messages from QEMU's error
output but we disable it once QEMU successfully starts. This patch
postpones this until the end of Finish phase during incoming migration
so that we can report a much better error message:
internal error: early end of file from monitor: possible problem:
Unknown savevm section or instance '0000:00:05.0/virtio-balloon' 0
load of migration failed
https://bugzilla.redhat.com/show_bug.cgi?id=1090093
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Add umask to _virCommand, allow user to set umask to command.
Set umask(002) to qemu process to overwrite the default umask
of 022 set by many distros, so that unix sockets created for
virtio-serial has expected permissions.
Fix problem reported here:
https://sourceware.org/bugzilla/show_bug.cgi?id=13078#c11https://bugzilla.novell.com/show_bug.cgi?id=888166
To use virtio-serial device, unix socket created for chardev with
default umask(022) has insufficient permissions.
e.g.:
-device virtio-serial \
-chardev socket,path=/tmp/foo,server,nowait,id=foo \
-device virtserialport,chardev=foo,name=org.fedoraproject.port.0
srwxr-xr-x 1 qemu qemu 0 21. Jul 14:19 /tmp/somefile.sock
Other users in the same group (like real user, test engines, etc)
cannot write to this socket.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The 'min_guarantee' is used by VMware ESX and OpenVZ drivers,
with qemu however, libvirt should report error when starting a domain,
because this element is not used.
Resolves https://bugzilla.redhat.com/show_bug.cgi?id=1122455
Currently, qemu driver uses qemuTranslateDiskSourcePool()
to translate disk volume information. This function is
general enough and could be used for other drivers as well,
so move it to conf/domain_conf.c along with its helpers.
- qemuTranslateDiskSourcePool: move to storage/storage_driver.c
and rename to virStorageTranslateDiskSourcePool,
- qemuAddISCSIPoolSourceHost: move to storage/storage_driver.c
and rename to virStorageAddISCSIPoolSourceHost,
- qemuTranslateDiskSourcePoolAuth: move to storage/storage_driver.c
and rename to virStorageTranslateDiskSourcePoolAuth,
- Update users of qemuTranslateDiskSourcePool to use a
new name.
Pin existing vcpus rather than existing vcpu pinning infos. This
increases the complexity of the lookup, but avoids pinning cpus that are
not enabled actually.
When editing guest's XML (on QEMU), it was possible to add multiple
listen elements into graphics parent element. However QEMU does not
support listening on multiple addresses. Configuration is tested for
multiple 'listen address' and if positive, an error is raised.
https://bugzilla.redhat.com/show_bug.cgi?id=1119212
During a QEMU live migration several warning messages about job
handling could be written to syslog on the destination host:
"entering monitor without asking for a nested job is dangerous"
The messages are written because the job handling during migration
uses hard coded asyncJob values in several places that are incorrect.
This patch passes the required asyncJob value around and prevents
the warnings as well as any issues that the warnings may be referring
to.
https://bugzilla.redhat.com/show_bug.cgi?id=1130089
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
A future patch is going to wire up qemu active block commit jobs;
but as they have similar events and are canceled/pivoted in the
same way as block copy jobs, it is easiest to track all bookkeeping
for the commit job by reusing the <mirror> element. This patch
adds domain XML to track which job was responsible for creating a
mirroring situation, and adds a job='copy' attribute to all
existing uses of <mirror>. Along the way, it also massages the
qemu monitor backend to read the new field in order to generate
the correct type of libvirt job (even though it requires a
future patch to actually cause a qemu event that can be reported
as an active commit). It also prepares to update persistent XML
to match changes made to live XML when a copy completes.
* docs/schemas/domaincommon.rng: Enhance schema.
* docs/formatdomain.html.in: Document it.
* src/conf/domain_conf.h (_virDomainDiskDef): Add a field.
* src/conf/domain_conf.c (virDomainBlockJobType): String conversion.
(virDomainDiskDefParseXML): Parse job type.
(virDomainDiskDefFormat): Output job type.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Distinguish
active from regular commit.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Set job type.
(qemuDomainBlockPivot, qemuDomainBlockJobImpl): Clean up job type
on completion.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror-old.xml:
Update tests.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-disk-active-commit.xml: New
file.
* tests/qemuxml2xmltest.c (mymain): Drive new test.
Signed-off-by: Eric Blake <eblake@redhat.com>
We were not directly saving the domain XML to file after starting
or finishing a blockcopy. Without the startup write, a libvirtd
restart in the middle of a copy job would forget that the job was
underway. Then at pivot, we were indirectly writing new XML in
reaction to events that occur as we stop and restart the guest CPUs.
But there was a race: since pivot is an async action, it is possible
that libvirtd is restarted before the pivot completes, so if XML
changes during the event, that change was not written. The original
blockcopy code cleared out the <mirror> element prior to restarting
the CPUs, but this is also a race, observed if a user does an async
pivot and a dumpxml before the event occurs. Furthermore, this race
will interfere with active commit in a future patch, because that
code will rely on the <mirror> element at the time of the qemu event
to determine whether to inform the user of a normal commit or an
active commit.
Fix things by saving state any time we modify live XML, while
delaying XML disk modifications until after the event completes. We
still need a to teach libvirtd restarts to examine all existing
<mirror> elements to see if the job completed in the meantime (that
is, if libvirtd misses the event, the updated state still needs to be
updated in live XML), but that will be a later patch, in part because
we also need to to start taking advantage of newer qemu's ability to
keep the job around after completion rather than the current usage
where the job disappears both on error and on success.
* src/qemu/qemu_driver.c (qemuDomainBlockCopy): Track XML change
on disk.
(qemuDomainBlockJobImpl, qemuDomainBlockPivot): Move job-end XML
rewrites...
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): ...here.
Signed-off-by: Eric Blake <eblake@redhat.com>
Doing a blockcopy operation across a libvirtd restart is not very
robust at the moment. In particular, we are clearing the <mirror>
element prior to telling qemu to finish the job. Also, thanks to the
ability to request async completion, the user can easily regain
control prior to qemu actually finishing the effort, and they should
be able to poll the domain XML to see if the job is still going.
A future patch will fix things to actually wait until qemu is done
before modifying the XML to reflect the job completion. But since
qemu issues identical BLOCK_JOB_COMPLETE events regardless of whether
the job was cancelled (kept the original disk) or completed (pivoted
to the new disk), we have to track which of the two operations were
used to end the job. Furthermore, we'd like to avoid attempts to
end a job where we are already waiting on an earlier request to qemu
to end the job. Likewise, if we miss the qemu event (perhaps because
it arrived during a libvirtd restart), we still need enough state
recorded to be able to determine how to modify the domain XML once
we reconnect to qemu and manually learn whether the job still exists.
Although this patch doesn't actually fix the problem, it is a
preliminary step that makes it possible to track whether a job
has already begun steps towards completion.
* src/conf/domain_conf.h (virDomainDiskMirrorState): New enum.
(_virDomainDiskDef): Convert bool mirroring to new enum.
* src/conf/domain_conf.c (virDomainDiskDefParseXML)
(virDomainDiskDefFormat): Handle new values.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Adjust
client.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl): Likewise.
* docs/schemas/domaincommon.rng (diskMirror): Expose new values.
* docs/formatdomain.html.in (elementsDisks): Document it.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: Test it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Use better detection of hugetlbfs mount points. Yes, there can be
multiple mount points each serving different huge page size.
Since we already have ability to override the mount point in the
qemu.conf file, this crazy backward compatibility code is brought in.
Now we allow multiple mount points, so the "hugetlbfs_mount" option
must take an list of strings (mount points). But previously, it was
just a string, so we must accept both types now.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When domain is started with numatune memory mode strict and the
nodeset does not include host NUMA node with DMA and DMA32 zones, KVM
initialization fails. This is because cgroup restrict even kernel
allocations. We are already doing numa_set_membind() which does the
same thing, only it does not restrict kernel allocations.
This patch leaves the userspace numa_set_membind() in place and moves
the cpuset.mems setting after the point where monitor comes up, but
before vcpu and emulator sub-groups are created.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways. On some
places this even resulted in persistent domain definition not to be
stable (it would change with daemon's restart).
In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since there was already public virDomainNumatune*, I changed the
private virNumaTune to match the same, so all the uses are unified and
public API is kept:
s/vir\(Domain\)\?Numa[tT]une/virDomainNumatune/g
then shrunk long lines, and mainly functions, that were created after
that:
sed -i 's/virDomainNumatuneMemPlacementMode/virDomainNumatunePlacement/g'
And to cope with the enum name, I haad to change the constants as
well:
s/VIR_NUMA_TUNE_MEM_PLACEMENT_MODE/VIR_DOMAIN_NUMATUNE_PLACEMENT/g
Last thing I did was at least a little shortening of already long
name:
s/virDomainNumatuneDef/virDomainNumatune/g
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This saves a few lines of code and catches the error when:
<spice autoport ='yes' defaultMode='any' ..>
<channel name='main' mode='secure'/>
</spice>
is specified with spice_tls = 0 in qemu.conf.
Instead of this error in qemuBuildGraphicsSPICECommandLine:
error: unsupported configuration: spice secure channels set in XML
configuration, but TLS port is not provided
an error is reported in qemuProcessSPICEAllocatePorts:
error: unsupported configuration: Auto allocation of spice TLS port
requested but spice TLS is disabled in qemu.conf
Inspired by:
https://www.redhat.com/archives/libvir-list/2014-June/msg01408.html
As we are doing with the enum structures, a cleanup in "src/qemu/"
directory was done now. All the enums that were defined in the
header files were converted to typedefs in this directory. This
patch includes all the adjustments to remove conflicts when you do
this kind of change. "Enum-to-typedef"'s conversions were made in
"src/qemu/qemu_{capabilities, domain, migration, hotplug}.h".
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
When looking for a port to allocate, the port allocator didn't take in
consideration ports that are statically set by the user. Defining
these two graphics elements in the XML would cause an error, as the
port allocator would try to use the same port for the spice graphics
element:
<graphics type='spice' autoport='yes'/>
<graphics type='vnc' port='5900' autoport='no'/>
The new *[pP]ortReserved variables keep track of the ports that were
successfully tracked as used by the port allocator but that weren't
bound.
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1081881
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
When the block job event was first added, it was for block pull,
where the active layer of the disk remains the same name. It was
also in a day where we only cared about local files, and so we
always had a canonical absolute file name. But two things have
changed since then: we now have network disks, where determining
a single absolute string does not really make sense; and we have
two-phase jobs (copy and active commit) where the name of the
active layer changes between the first event (ready, on the old
name) and second (complete, on the pivoted name).
Adam Litke reported that having an unstable string between events
makes life harder for clients. Furthermore, all of our API that
operate on a particular disk of a domain accept multiple strings:
not only the absolute name of the active layer, but also the
destination device name (such as 'vda'). As this latter name is
stable, even for network sources, it serves as a better string
to supply in block job events.
But backwards-compatibility demands that we should not change the
name handed to users unless they explicitly request it. Therefore,
this patch adds a new event, BLOCK_JOB_2 (alas, I couldn't think of
any nicer name - but at least Migrate2 and Migrate3 are precedent
for a number suffix). We must double up on emitting both old-style
and new-style events according to what clients have registered for
(see also how IOError and IOErrorReason emits double events, but
there the difference was a larger struct rather than changed
meaning of one of the struct members).
Unfortunately, adding a new event isn't something that can easily
be broken into pieces, so the commit is rather large.
* include/libvirt/libvirt.h.in (virDomainEventID): Add a new id
for VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2.
(virConnectDomainEventBlockJobCallback): Document new semantics.
* src/conf/domain_event.c (_virDomainEventBlockJob): Rename field,
to ensure we catch all clients.
(virDomainEventBlockJobNew): Add parameter.
(virDomainEventBlockJobDispose)
(virDomainEventBlockJobNewFromObj)
(virDomainEventBlockJobNewFromDom)
(virDomainEventDispatchDefaultFunc): Adjust clients.
(virDomainEventBlockJob2NewFromObj)
(virDomainEventBlockJob2NewFromDom): New functions.
* src/conf/domain_event.h: Add new prototypes.
* src/libvirt_private.syms (domain_event.h): Export new functions.
* src/qemu/qemu_driver.c (qemuDomainBlockJobImpl): Generate two
different events.
* src/qemu/qemu_process.c (qemuProcessHandleBlockJob): Likewise.
* src/remote/remote_protocol.x
(remote_domain_event_block_job_2_msg): New struct.
(REMOTE_PROC_DOMAIN_EVENT_BLOCK_JOB_2): New RPC.
* src/remote/remote_driver.c
(remoteDomainBuildEventBlockJob2): New handler.
(remoteEvents): Register new event.
* daemon/remote.c (remoteRelayDomainEventBlockJob2): New handler.
(domainEventCallbacks): Register new event.
* tools/virsh-domain.c (vshEventCallbacks): Likewise.
(vshEventBlockJobPrint): Adjust client.
* src/remote_protocol-structs: Regenerate.
Signed-off-by: Eric Blake <eblake@redhat.com>
The current implementation of 'virsh blockcopy' (virDomainBlockRebase)
is limited to copying to a local file name. But future patches want
to extend it to also copy to network disks. This patch converts over
to a virStorageSourcePtr, although it should have no semantic change
visible to the user, in anticipation of those future patches being
able to use more fields for non-file destinations.
* src/conf/domain_conf.h (_virDomainDiskDef): Change type of
mirror information.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): Localize
mirror parsing into new object.
(virDomainDiskDefFormat): Adjust clients.
* src/qemu/qemu_domain.c (qemuDomainDeviceDefPostParse):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockPivot)
(qemuDomainBlockJobImpl, qemuDomainBlockCopy): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
As part of the work on backing chains, I'm finding that it would
be easier to directly manipulate chains of pointers (adding a
snapshot merely adjusts pointers to form the correct list) rather
than copy data from one struct to another. This patch converts
domain disk source to be a pointer.
In this patch, the pointer is ALWAYS allocated (thanks in part to
the previous patch forwarding all disk def allocation through a
common point), and all other changse are just mechanical fallout of
the new type; there should be no functional change. It is possible
that we may want to leave the pointer NULL for a cdrom with no
medium in a later patch, but as that requires a closer audit of the
source to ensure we don't fault on a null dereference, I didn't do
it here.
* src/conf/domain_conf.h (_virDomainDiskDef): Change type of src.
* src/conf/domain_conf.c: Adjust all clients.
* src/security/security_selinux.c: Likewise.
* src/qemu/qemu_domain.c: Likewise.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_conf.c: Likewise.
* src/qemu/qemu_process.c: Likewise.
* src/qemu/qemu_migration.c: Likewise.
* src/qemu/qemu_driver.c: Likewise.
* src/lxc/lxc_driver.c: Likewise.
* src/lxc/lxc_controller.c: Likewise.
* tests/securityselinuxlabeltest.c: Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently, we don not acquire any job when removing a device after
DEVICE_DELETED event was received from QEMU. This means that if there is
another API running at the time DEVICE_DELETED is delivered and the API
acquired a job, we may happily change the definition of the domain the
API is working with whenever it unlocks the domain object (e.g., to talk
with its monitor). That said, we have to acquire a job before finishing
device removal to make things safe. However, doing so in the main event
loop would cause a deadlock so we need to move most of the event handler
into a separate thread.
Another good reason for both acquiring a job and handling the event in a
separate thread is that we currently remove a device backend immediately
after removing its frontend while we should only remove the backend once
we already received DEVICE_DELETED event. That is, we will have to talk
to QEMU monitor from the event handler.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
If QEMU supports DEVICE_DELETED event, we always call
qemuDomainRemoveDevice from the event handler. However, we will need to
push this call away from the main event loop and begin a job for it (see
the following commit), we need to make sure the device is fully removed
by the original thread (and within its existing job) in case the
DEVICE_DELETED event arrives before qemuDomainWaitForDeviceRemoval times
out.
Without this patch, device removals would be guaranteed to never finish
before the timeout because the could would be blocked by the original
job being still active.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1088787
Clean up unix socket files for chardevs using mode='bind',
like we clean up the monitor socket.
They are created by QEMU on startup and not really useful
after shutting it down.
commit e31b5cf393 attempted to fix libvirt's
VIR_DOMAIN_EVENT_ID_RTC_CHANGE, which is documentated to always
provide the new offset of the domain's real time clock from UTC. The
problem was that, in the case that qemu is provided with an "-rtc
base=x" where x is an absolute time (rather than "utc" or
"localtime"), the offset sent by qemu's RTC_CHANGE event is *not* the
new offset from UTC, but rather is the sum of all changes to the
domain's RTC since it was started with base=x.
So, despite what was said in commit e31b5cf393, if we assume that
the original value stored in "adjustment" was the offset from UTC at
the time the domain was started, we can always determine the current
offset from UTC by simply adding the most recent (i.e. current) offset
from qemu to that original adjustment.
This patch accomplishes that by storing the initial adjustment in the
domain's status as "adjustment0". Each time a new RTC_CHANGE event is
received from qemu, we simply add adjustment0 to the value sent by
qemu, store that as the new adjustment, and forward that value on to
any event handler.
This patch (*not* e31b5cf393, which should be reverted prior to
applying this patch) fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=964177
(for the case where basis='utc'. It does not fix basis='localtime')
This reverts commit e31b5cf393.
This commit attempted to work around a bug in the offset value
reported by qemu's RTC_CHANGE event in the case that a variable base
date was given on the qemu commandline. The patch mixed up the math
involved in arriving at the corrected offset to report, and in the
process added an unnecessary private attribute to the clock
element. Since that element is private/internal and not used by anyone
else, it makes sense to simplify things by removing it.
Refresh the disk backing chains when reconnecting to a qemu process
after daemon restart. There are a few internal fields that don't get
refreshed from the XML. Until we are able to do that, let's reload all
the metadata by the backing chain crawler.
As a side effect, the return value of qemuDomainObjEnterMonitorAsync is
not directly used as the return value of qemuProcess{Start,Stop}CPUs.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Move sharable PCI handling functions to domain_addr.[ch], and
change theirs prefix from 'qemu' to 'vir':
- virDomainPCIAddressAsString;
- virDomainPCIAddressBusSetModel;
- virDomainPCIAddressEnsureAddr;
- virDomainPCIAddressFlagsCompatible;
- virDomainPCIAddressGetNextSlot;
- virDomainPCIAddressReleaseSlot;
- virDomainPCIAddressReserveAddr;
- virDomainPCIAddressReserveNextSlot;
- virDomainPCIAddressReserveSlot;
- virDomainPCIAddressSetFree;
- virDomainPCIAddressSetGrow;
- virDomainPCIAddressSlotInUse;
- virDomainPCIAddressValidate;
The only change here is function names, the implementation itself
stays untouched.
Extract common allocation code from DomainPCIAddressSetCreate
into virDomainPCIAddressSetAlloc.
In "src/util/" there are many enumeration (enum) declarations.
Sometimes, it's better using a typedef for variable types,
function types and other usages. Other enumeration will be
changed to typedef's in the future.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
I almost wrote a hash value free function that just called
VIR_FREE, then realized I couldn't be the first person to
do that. Sure enough, it was worth factoring into a common
helper routine.
* src/util/virhash.h (virHashValueFree): New function.
* src/util/virhash.c (virHashValueFree): Implement it.
* src/util/virobject.h (virObjectFreeHashData): New function.
* src/libvirt_private.syms (virhash.h, virobject.h): Export them.
* src/nwfilter/nwfilter_learnipaddr.c (virNWFilterLearnInit): Use
common function.
* src/qemu/qemu_capabilities.c (virQEMUCapsCacheNew): Likewise.
* src/qemu/qemu_command.c (qemuDomainCCWAddressSetCreate):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorGetBlockInfo): Likewise.
* src/qemu/qemu_process.c (qemuProcessWaitForMonitor): Likewise.
* src/util/virclosecallbacks.c (virCloseCallbacksNew): Likewise.
* src/util/virkeyfile.c (virKeyFileParseGroup): Likewise.
* tests/qemumonitorjsontest.c
(testQemuMonitorJSONqemuMonitorJSONGetBlockInfo): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
In order to reuse the newly-created host-side disk struct in
the virstoragefile backing chain code, I first have to move
it to util/. This starts the process, by first moving the
security label structures.
* src/conf/domain_conf.h (virDomainDefGenSecurityLabelDef)
(virDomainDiskDefGenSecurityLabelDef, virSecurityLabelDefFree)
(virSecurityDeviceLabelDefFree, virSecurityLabelDef)
(virSecurityDeviceLabelDef): Move...
* src/util/virseclabel.h: ...to new file.
(virSecurityLabelDefNew, virSecurityDeviceLabelDefNew): Rename the
GenSecurity functions.
* src/qemu/qemu_process.c (qemuProcessAttach): Adjust callers.
* src/security/security_manager.c (virSecurityManagerGenLabel):
Likewise.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityFileLabel): Likewise.
* src/util/virseclabel.c: New file.
* src/conf/domain_conf.c: Move security code, and fix fallout.
* src/Makefile.am (UTIL_SOURCES): Build new file.
* src/libvirt_private.syms (domain_conf.h): Move symbols...
(virseclabel.h): ...to new section.
Signed-off-by: Eric Blake <eblake@redhat.com>
It's finally time to start tracking disk backing chains in
<domain> XML. The first step is to start refactoring code
so that we have an object more convenient for representing
each host source resource in the context of a single guest
<disk>. Ultimately, I plan to move the new type into src/util
where it can be reused by virStorageFile, but to make the
transition easier to review, this patch just creates the
new type then fixes everything until it compiles again.
* src/conf/domain_conf.h (_virDomainDiskDef): Split...
(_virDomainDiskSourceDef): ...to new struct.
(virDomainDiskAuthClear): Use new type.
* src/conf/domain_conf.c (virDomainDiskDefFree): Split...
(virDomainDiskSourceDefClear): ...to new function.
(virDomainDiskGetType, virDomainDiskSetType)
(virDomainDiskGetSource, virDomainDiskSetSource)
(virDomainDiskGetDriver, virDomainDiskSetDriver)
(virDomainDiskGetFormat, virDomainDiskSetFormat)
(virDomainDiskAuthClear, virDomainDiskGetActualType)
(virDomainDiskDefParseXML, virDomainDiskSourceDefFormat)
(virDomainDiskDefFormat, virDomainDiskDefForeachPath)
(virDomainDiskDefGetSecurityLabelDef)
(virDomainDiskSourceIsBlockType): Adjust all users.
* src/lxc/lxc_controller.c (virLXCControllerSetupDisk):
Likewise.
* src/lxc/lxc_driver.c (lxcDomainAttachDeviceMknodHelper):
Likewise.
* src/qemu/qemu_command.c (qemuAddRBDHost, qemuParseRBDString)
(qemuParseDriveURIString, qemuParseGlusterString)
(qemuParseISCSIString, qemuParseNBDString)
(qemuDomainDiskGetSourceString, qemuBuildDriveStr)
(qemuBuildCommandLine, qemuParseCommandLineDisk)
(qemuParseCommandLine): Likewise.
* src/qemu/qemu_conf.c (qemuCheckSharedDevice)
(qemuAddISCSIPoolSourceHost, qemuTranslateDiskSourcePool):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainUpdateDeviceConfig)
(qemuDomainPrepareDiskChainElement)
(qemuDomainSnapshotCreateInactiveExternal)
(qemuDomainSnapshotPrepareDiskExternalBackingInactive)
(qemuDomainSnapshotPrepareDiskInternal)
(qemuDomainSnapshotPrepare)
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotUndoSingleDiskActive)
(qemuDomainBlockPivot, qemuDomainBlockJobImpl)
(qemuDomainBlockCopy, qemuDomainBlockCommit): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsSafe): Likewise.
* src/qemu/qemu_process.c (qemuProcessGetVolumeQcowPassphrase)
(qemuProcessInitPasswords): Likewise.
* src/security/security_selinux.c
(virSecuritySELinuxSetSecurityFileLabel): Likewise.
* src/storage/storage_driver.c (virStorageFileInitFromDiskDef):
Likewise.
* tests/securityselinuxlabeltest.c (testSELinuxLoadDef):
Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Wire up all the pieces to send arbitrary qemu events to a
client using libvirt-qemu.so. If the extra bookkeeping of
generating event objects even when no one is listening turns
out to be noticeable, we can try to further optimize things
by adding a counter for how many connections are using events,
and only dump events when the counter is non-zero; but for
now, I didn't think it was worth the code complexity.
* src/qemu/qemu_driver.c
(qemuConnectDomainQemuMonitorEventRegister)
(qemuConnectDomainQemuMonitorEventDeregister): New functions.
* src/qemu/qemu_monitor.h (qemuMonitorEmitEvent): New prototype.
(qemuMonitorDomainEventCallback): New typedef.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONIOProcessEvent):
Report events.
* src/qemu/qemu_monitor.c (qemuMonitorEmitEvent): New function, to
pass events through.
* src/qemu/qemu_process.c (qemuProcessHandleEvent): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Any source file which calls the logging APIs now needs
to have a VIR_LOG_INIT("source.name") declaration at
the start of the file. This provides a static variable
of the virLogSource type.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We have to explicitly destroy TAP devices on FreeBSD because
they're not freed after being closed, otherwise we end up with
orphaned TAP devices after destroying a domain.
Change any method names with Usb, Pci or Scsi to use
USB, PCI and SCSI since they are abbreviations.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
For extracting hostdev codes from qemu_hostdev.c to common library, change qemu
specific COLD_BOOT handling to be a flag, and pass it to hostdev functions.
For extracting hostdev codes from qemu_hostdev.c to common library, change qemu
specific cfg->relaxedACS handling to be a flag, and pass it to hostdev
functions.
When attaching to a QEMU process, the def->seclabels array is
going to be empty. The qemuProcessAttach method must thus
populate it with data for the security drivers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qemu_bridge_filter.c file had some helpers for calling
the ebtablesXXX functions todo bridge filtering. The only
thing these helpers did was to overwrite the original error
message from the ebtables code. For added fun, the callers
of these helpers overwrote the errors yet again. For even
more fun, one of the helpers called another helper and
overwrite its errors too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There might be some use cases, where user wants to prepare the host or
its environment prior to starting a network and do some cleanup after
the network has been shut down. Consider all the functionality that
libvirt doesn't currently have as an example what a hook script can
possibly do.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On some platforms like IBM PowerNV the NUMA node numbers can be
non-sequential. For eg. numactl --hardware o/p from such a machine looks
as given below
node distances:
node 0 1 16 17
0: 10 40 40 40
1: 40 10 40 40
16: 40 40 10 40
17: 40 40 40 10
The NUMA nodes are 0,1,16,17
Libvirt uses sequential index as NUMA node numbers and this can
result in crash or incorrect results.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
The code took into account only the global permissions. The domains now
support per-vm DAC labels and per-image DAC labels. Use the most
specific label available.
The public virConnectRef and virConnectClose API are just thin
wrappers around virObjectRef/virObjectRef, with added object
validation and an error reset. Within our backend drivers, use
of the object validation is just an inefficiency since we always
pass valid objects. More important to think about is what
happens with the error reset; our uses of virConnectRef happened
to be safe (since we hadn't encountered any earlier errors), but
in several cases the use of virConnectClose could lose a real
error.
Ideally, we should also avoid calling virConnectOpen() from
within backend drivers - but that is a known situation that
needs much more design work.
* src/qemu/qemu_process.c (qemuProcessReconnectHelper)
(qemuProcessReconnect): Avoid nested public API call.
* src/qemu/qemu_driver.c (qemuAutostartDomains)
(qemuStateInitialize, qemuStateStop): Likewise.
* src/qemu/qemu_migration.c (doPeer2PeerMigrate): Likewise.
* src/storage/storage_driver.c (storageDriverAutostart):
Likewise.
* src/uml/uml_driver.c (umlAutostartConfigs): Likewise.
* src/lxc/lxc_process.c (virLXCProcessAutostartAll): Likewise.
(virLXCProcessReboot): Likewise, and avoid leaking conn on error.
Signed-off-by: Eric Blake <eblake@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1047659
If a VM dies very early during an attempted connect to the guest agent
while the locks are down the domain monitor object will be freed. The
object is then accessed later as any failure during guest agent startup
isn't considered fatal.
In the current upstream version this doesn't lead to a crash as
virObjectLock called when entering the monitor in
qemuProcessDetectVcpuPIDs checks the pointer before attempting to
dereference (lock) it. The NULL pointer is then caught in the monitor
helper code.
Before the introduction of virObjectLockable - observed on 0.10.2 - the
pointer is locked directly via virMutexLock leading to a crash.
To avoid this problem we need to differentiate between the guest agent
not being present and the VM quitting when the locks were down. The fix
reorganizes the code in qemuConnectAgent to add the check and then adds
special handling to the callers.
Currently, the qemuProcessStop tries to open the domain log file
and saves the original error afterwards. Then all the cleanup is
done after which the error is restored back. This has however one
flaw: if opening of the log file fails an error is reported,
which results in previous error being overwritten (the useful
one, e.g. "PCI device XXXX:XXXX could not be found"). Hence, user
sees something like:
error: failed to create logfile /var/log/libvirt/qemu/ovirt_usb.log: No such file or directory
instead of:
error: internal error: Did not find USB device 8644:8003
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reported-by: Zhou Yimin <zhouyimin@huawei.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1035955
There's a window when starting a qemu process between fork() and exec()
during which we are doing things that may fail but not tunnelling the
error to the daemon. This is basically all within qemuProcessHook().
So whenever we fail in something, e.g. placing a process onto numa node,
users are left with:
error: Child quit during startup handshake: Input/output error
while the original error is thrown into the domain log:
libvirt: error : internal error: NUMA memory tuning in 'preferred'
mode only supports single node
Hence, we should read the log file and search for the error message and
report it to users.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Report the error in virPortAllocatorAcquire instead
of doing it in every caller.
The error contains the port range name instead of the intended
use for the port, e.g.:
Unable to find an unused port in range 'display' (65534-65535)
instead of:
Unable to find an unused port for SPICE
This also adds error reporting when the QEMU driver could not
find an unused port for VNC, VNC WebSockets or NBD migration.
In the qemuProcessReconnectHelper() a new thread that does all the
interesting work is spawned. The rationale is to not block the daemon
startup process in case of unresponsive qemu. However, the thread
handler is a local variable which gets lost once the control goes out of
scope. Hence the thread gets leaked. We can avoid this if the thread
isn't made joinable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>