Remove all use of the existing APIs for querying QEMU
capability flags. Instead obtain a qemuCapsPtr object
from the global cache. This avoids the execution of
'qemu -help' (and related commands) when launching new
guests.
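As a rough illustration of why a cache avoids repeated probing, here
is a minimal sketch of a lookup keyed by binary path. All names
(demo_caps, demo_caps_cache_lookup, demo_probe_binary) are
hypothetical and not taken from the libvirt sources; the probe is a
placeholder for the expensive step of running the binary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_CACHE_SLOTS 8
#define DEMO_PATH_MAX 128

/* Hypothetical stand-in for a capabilities object. */
struct demo_caps {
    char binary[DEMO_PATH_MAX];
    unsigned long flags;
};

struct demo_caps_cache {
    struct demo_caps slots[DEMO_CACHE_SLOTS];
    size_t used;
    unsigned int probes; /* how many times the expensive probe ran */
};

/* Placeholder for the expensive step (e.g. spawning the binary). */
static unsigned long demo_probe_binary(struct demo_caps_cache *cache,
                                       const char *binary)
{
    cache->probes++;
    return (unsigned long) strlen(binary); /* dummy flag value */
}

/* Return the cached entry for 'binary', probing only on a cache miss.
 * Returns NULL when the cache is full or the path is too long. */
const struct demo_caps *demo_caps_cache_lookup(struct demo_caps_cache *cache,
                                               const char *binary)
{
    size_t i;

    assert(cache != NULL && binary != NULL);
    for (i = 0; i < cache->used; i++) {
        if (strcmp(cache->slots[i].binary, binary) == 0)
            return &cache->slots[i];
    }
    if (cache->used == DEMO_CACHE_SLOTS || strlen(binary) >= DEMO_PATH_MAX)
        return NULL;
    strcpy(cache->slots[cache->used].binary, binary);
    cache->slots[cache->used].flags = demo_probe_binary(cache, binary);
    return &cache->slots[cache->used++];
}
```

Looking up the same path twice runs the probe once; a guest launch
that hits the cache therefore spawns no extra process in this sketch.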
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the qemuAgentClose method is called from a place which holds
the domain lock, it is theoretically possible to get a deadlock
in the agent destroy callback. This has not been observed, but
the equivalent code in the QEMU monitor destroy callback has seen
a deadlock.
Remove the redundant locking while unrefing the object and the
bogus assignment
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some users report (very rarely) seeing a deadlock in the QEMU
monitor callbacks:
Thread 10 (Thread 0x7fcd11e20700 (LWP 26753)):
#0 0x00000030d0e0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000030d0e09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
#2 0x00000030d0e09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007fcd162f416d in virMutexLock (m=<optimized out>)
    at util/threads-pthread.c:85
#4 0x00007fcd1632c651 in virDomainObjLock (obj=<optimized out>)
    at conf/domain_conf.c:14256
#5 0x00007fcd0daf05cc in qemuProcessHandleMonitorDestroy (mon=0x7fcccc0029e0,
    vm=0x7fcccc00a850) at qemu/qemu_process.c:1026
#6 0x00007fcd0db01710 in qemuMonitorDispose (obj=0x7fcccc0029e0)
    at qemu/qemu_monitor.c:249
#7 0x00007fcd162fd4e3 in virObjectUnref (anyobj=<optimized out>)
    at util/virobject.c:139
#8 0x00007fcd0db027a9 in qemuMonitorClose (mon=<optimized out>)
    at qemu/qemu_monitor.c:860
#9 0x00007fcd0daf61ad in qemuProcessStop (driver=driver@entry=0x7fcd04079d50,
    vm=vm@entry=0x7fcccc00a850,
    reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED, flags=flags@entry=0)
    at qemu/qemu_process.c:4057
#10 0x00007fcd0db323cf in qemuDomainDestroyFlags (dom=<optimized out>,
    flags=<optimized out>) at qemu/qemu_driver.c:1977
#11 0x00007fcd1637ff51 in virDomainDestroyFlags (
    domain=domain@entry=0x7fccf00c1830, flags=1) at libvirt.c:2256
At frame #10 we are holding the domain lock. We call into
qemuProcessStop() to clean up QEMU, which triggers the monitor
to close, which invokes qemuProcessHandleMonitorDestroy(), which
tries to obtain the domain lock again. This is a non-recursive
lock, hence the hang.
Since qemuMonitorPtr is a virObject, the unref call in
qemuProcessHandleMonitorDestroy no longer needs mutex
protection. The assignment of priv->mon = NULL can
instead be done by the caller of qemuMonitorClose(), thus
removing all need for locking.
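The locking pattern can be reproduced in isolation. The sketch below
is not libvirt code: the demo_* names are hypothetical, and it uses a
POSIX error-checking mutex so that the second lock attempt returns
EDEADLK instead of hanging the way a plain non-recursive mutex would.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a domain object and its monitor pointer. */
struct demo_domain {
    pthread_mutex_t lock;
    void *mon;
};

int demo_domain_init(struct demo_domain *vm, void *mon)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(vm != NULL);
    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    /* Error-checking type: relocking reports EDEADLK rather than hanging. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&vm->lock, &attr);
    pthread_mutexattr_destroy(&attr);
    vm->mon = mon;
    return rc;
}

/* Old pattern: the destroy callback takes the domain lock itself. */
static int demo_destroy_cb_old(struct demo_domain *vm)
{
    int rc = pthread_mutex_lock(&vm->lock);
    if (rc != 0)
        return rc;
    vm->mon = NULL;
    return pthread_mutex_unlock(&vm->lock);
}

/* The caller already holds the lock when the callback fires. */
int demo_stop_old(struct demo_domain *vm)
{
    int rc = pthread_mutex_lock(&vm->lock);
    if (rc != 0)
        return rc;
    rc = demo_destroy_cb_old(vm);
    pthread_mutex_unlock(&vm->lock);
    return rc;
}

/* New pattern: the caller clears the pointer; the callback needs no lock. */
int demo_stop_new(struct demo_domain *vm)
{
    int rc = pthread_mutex_lock(&vm->lock);
    if (rc != 0)
        return rc;
    vm->mon = NULL;
    return pthread_mutex_unlock(&vm->lock);
}
```

With the old pattern the nested lock attempt fails and the pointer is
never cleared; with the new pattern the caller clears it under the
lock it already holds.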
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If QEMU quits immediately after we opened the monitor it was
possible we would skip the clearing of the SELinux process
socket context
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In the cgroups APIs we have a virCgroupKillPainfully function
which does the loop sending SIGTERM, then SIGKILL and waiting
for the process to exit. There is similar functionality for
simple processes in qemuProcessKill, but it is tangled with
the QEMU code. Untangle it to provide a virProcessKillPainfully
function
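The SIGTERM-then-SIGKILL loop described above might look roughly like
the following. This is a sketch under assumptions, not the libvirt
implementation: the demo_* names, the poll counts passed by the
caller and the 200 ms delay are all illustrative.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>

/* Which signal to send on a given attempt; 0 means "only probe". */
int demo_signal_for_attempt(int attempt, int term_polls)
{
    if (attempt == 0)
        return SIGTERM;
    if (attempt == term_polls)
        return SIGKILL;
    return 0;
}

/* Ask the process to exit, escalate to SIGKILL after 'term_polls'
 * attempts, and give up after 'kill_polls' further attempts.
 * Returns 0 once the process is gone, -1 if every attempt found it
 * alive or kill() failed for another reason. */
int demo_kill_painfully(pid_t pid, int term_polls, int kill_polls)
{
    const struct timespec delay = { 0, 200L * 1000L * 1000L }; /* 200 ms */
    int i;

    assert(pid > 1 && term_polls > 0 && kill_polls > 0);
    for (i = 0; i < term_polls + kill_polls; i++) {
        if (kill(pid, demo_signal_for_attempt(i, term_polls)) < 0)
            return errno == ESRCH ? 0 : -1;
        nanosleep(&delay, NULL);
    }
    return -1;
}
```

Note that kill(pid, 0) still succeeds for an unreaped child (zombie),
so this sketch assumes the target is not a child of the caller or is
being reaped elsewhere.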
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are a number of process related functions spread
across multiple files. Start to consolidate them by
creating a virprocess.{c,h} file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
Currently, we mark the domain PAUSED (but do not emit an event)
just before we issue 'stop' on the monitor. This command can
take ages to finish, especially when the domain is doing a lot of
I/O: users can force qemu to open files with O_DIRECT,
which doesn't return from write() until data reaches the
block device. As a result, we report PAUSED even if the
domain is not paused yet.
On agent EOF the qemuProcessHandleAgentEOF() callback is called
which locks virDomainObjPtr. Then qemuAgentClose() is called
(with domain object locked) which eventually calls qemuAgentDispose()
and qemuProcessHandleAgentDestroy(). This tries to lock the
domain object again. Hence the deadlock.
The current qemu capabilities are stored in a virBitmapPtr
object, whose type is exposed to callers. We want to store
more data besides just the flags, so we need to move to a
struct type. This object will also need to be reference
counted, since we'll be maintaining a cache of data per
binary. This change introduces a 'qemuCapsPtr' virObject
class. Most of the change is just renaming types and
variables in all the callers
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When a reboot using the qemu guest agent was requested, the qemu driver
kept waiting for a SHUTDOWN event from qemu. However, no such event is
emitted during a guest reboot, so the qemu driver would wait forever.
After fixing the last review comments on remote port searching (commit
a14b4aea512d6c3a42af56207a65ef10ac4a12a1), the commit right after that
wasn't modified accordingly, so two values weren't changed as they
should have been and the configurable ports don't work as expected.
This simple commit changes the last two missed values and fixes the
issue.
In my quest for reusing variables I failed to edit one variable when
fixing details between two patch versions. That results in a failure
to start qemu with autoport and spice tls, because qemu is trying to
bind two sockets to the same port.
Emulator threads should also be pinned by sched_setaffinity(), just
the same as vcpu threads.
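For reference, pinning a single thread with sched_setaffinity() can
be sketched as below. The demo_* helpers are hypothetical; the
byte-array CPU map layout (bit n%8 of byte n/8 stands for CPU n) is
an assumption made for this example. The code is Linux-specific.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>

/* Convert a byte-array CPU map into a cpu_set_t. */
void demo_cpumap_to_set(const unsigned char *cpumap, size_t maplen,
                        cpu_set_t *set)
{
    size_t cpu;

    assert(cpumap != NULL && set != NULL);
    CPU_ZERO(set);
    for (cpu = 0; cpu < maplen * 8 && cpu < (size_t) CPU_SETSIZE; cpu++) {
        if (cpumap[cpu / 8] & (1u << (cpu % 8)))
            CPU_SET(cpu, set);
    }
}

/* Pin one thread (a Linux thread ID; 0 means the calling thread) to
 * the CPUs in 'cpumap'. Returns 0 on success, -1 with errno set. */
int demo_pin_thread(pid_t tid, const unsigned char *cpumap, size_t maplen)
{
    cpu_set_t set;

    demo_cpumap_to_set(cpumap, maplen, &set);
    return sched_setaffinity(tid, sizeof(set), &set);
}
```

The same helper would be called once per emulator thread and once
per vcpu thread, each with its own map.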
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Create a new cgroup and move all emulator threads to the new cgroup.
This makes it possible to:
1. limit only vcpu usage rather than the whole qemu
2. limit emulator threads (including vhost-net threads)
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
After the cleanup of remote display port allocation, I noticed some
messages that didn't make a lot of sense the way they were written. So
I rephrased them.
The defines QEMU_REMOTE_PORT_MIN and QEMU_REMOTE_PORT_MAX were used to
find a free port when starting domains. As this was hard-coded to the
same ports as default VNC servers, there were races with these other
programs. This patch adds the possibility to change the default
starting port as well as the maximum port (mostly for completeness) in
the qemu config file.
Support for two new config options in qemu.conf is added:
- remote_port_min (defaults to QEMU_REMOTE_PORT_MIN and
must be >= this value)
- remote_port_max (defaults to QEMU_REMOTE_PORT_MAX and
must be <= this value)
Port allocations for SPICE and VNC behave almost the same (with
default ports), but there is some mess in the code. This patch clears
up these inconsistencies and makes sure the same behavior will be used
when ports for remote displays are changed.
Changes:
- hard-coded number 5900 removed (handled elsewhere like with VNC)
- reservedVNCPorts renamed to reservedRemotePorts (it's not just for
VNC anymore)
- QEMU_VNC_PORT_{MIN,MAX} renamed to QEMU_REMOTE_PORT_{MIN,MAX}
- port allocation unified for VNC and SPICE
These changes make the security drivers able to find and handle the
correct security label information when more than one label is
available. They also update the DAC driver to be used as a usual
security driver.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
This patch updates the structures that store information about each
domain and each hypervisor to support multiple security labels and
drivers. It also updates all the remaining code to use the new fields.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
Rename qemuDefaultScsiControllerModel to qemuCheckScsiControllerModel.
When the SCSI model is given explicitly in the XML (model > 0), first
check whether the underlying QEMU supports it, and raise an error if
the check fails.
When the model is not given (model <= 0), return LSI by default; if
QEMU doesn't support it, raise an error.
Switch virDomainObjPtr to use the virObject APIs for reference
counting. The main change is that virObjectUnref does not return
the reference count, merely a bool indicating whether the object
still has any refs left. Checking the return value is also not
mandatory.
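To illustrate the return-value semantics, here is a minimal reference
counting sketch in which unref reports only whether the object is
still alive. The demo_* names are hypothetical, and unlike a real
implementation this version is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_object {
    unsigned int refs;
    void (*dispose)(struct demo_object *obj); /* optional cleanup hook */
};

/* Create an object holding one reference; NULL on allocation failure. */
struct demo_object *demo_object_new(void (*dispose)(struct demo_object *))
{
    struct demo_object *obj = calloc(1, sizeof(*obj));

    if (obj != NULL) {
        obj->refs = 1;
        obj->dispose = dispose;
    }
    return obj;
}

struct demo_object *demo_object_ref(struct demo_object *obj)
{
    assert(obj != NULL);
    obj->refs++;
    return obj;
}

/* Drop one reference. Returns true while references remain and false
 * once the object has been freed (or when obj is NULL). */
bool demo_object_unref(struct demo_object *obj)
{
    if (obj == NULL)
        return false;
    assert(obj->refs > 0);
    if (--obj->refs > 0)
        return true;
    if (obj->dispose != NULL)
        obj->dispose(obj);
    free(obj);
    return false;
}
```

A caller that used to compare the returned count against zero would
instead test the boolean, or ignore it entirely.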
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This converts the following public API datatypes to use the
virObject infrastructure:
virConnectPtr
virDomainPtr
virDomainSnapshotPtr
virInterfacePtr
virNetworkPtr
virNodeDevicePtr
virNWFilterPtr
virSecretPtr
virStreamPtr
virStorageVolPtr
virStoragePoolPtr
The code is significantly simplified, since the mutex in the
virConnectPtr object now only needs to be held when accessing
the per-connection virError object instance. All other operations
are completely lock free.
* src/datatypes.c, src/datatypes.h, src/libvirt.c: Convert
public datatypes to use virObject
* src/conf/domain_event.c, src/phyp/phyp_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c, src/storage/storage_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xend_internal.c,
tests/qemuxml2argvtest.c, tests/qemuxmlnstest.c,
tests/sexpr2xmltest.c, tests/xmconfigtest.c: Convert
to use virObjectUnref/virObjectRef
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since the FSF address can change from time to time, GNU now
recommends the following (http://www.gnu.org/licenses/gpl-howto.html):
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address and uses the above instead
(inserting 'Lesser' before 'General', of course).
All files are changed automatically except for a handful of security
driver files, whose copyright notices are incomplete and therefore had
to be updated manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
The previous check for YAJL would have many undesirable
consequences, the most important being that it caused the
capabilities XML to lose all <guest> elements. There is
no user visible feedback as to what is wrong in this respect,
merely a syslog message. The empty capabilities cause
libvirtd to then throw away all guest XML configs that are
stored.
This changes the code so that the check for YAJL is only
performed at the time we attempt to spawn a QEMU process
error: Failed to start domain vm-vnc
error: unsupported configuration: this qemu binary requires libvirt to be compiled with yajl
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce new members in the virMacAddr 'class':
- virMacAddrSet: set a virMacAddr from another virMacAddr
- virMacAddrSetRaw: set a virMacAddr from a raw 6 byte MAC address buffer
- virMacAddrGetRaw: write a virMacAddr into a raw 6 byte MAC address buffer
- virMacAddrCmp: compare two virMacAddr
- virMacAddrCmpRaw: compare a virMacAddr with a raw 6 byte MAC address buffer
Then replace raw MAC addresses by replacing
- 'unsigned char *' with virMacAddrPtr
- 'unsigned char ... [VIR_MAC_BUFLEN]' with virMacAddr
and introduce usage of the above functions where necessary.
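The shape of such a wrapper type can be sketched as follows. The
demo_* names are hypothetical stand-ins for the functions listed
above; their exact signatures and return conventions are assumptions.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAC_BUFLEN 6

/* Wrapping the bytes in a struct lets the compiler catch size mix-ups. */
struct demo_mac_addr {
    unsigned char addr[DEMO_MAC_BUFLEN];
};

void demo_mac_set(struct demo_mac_addr *dst, const struct demo_mac_addr *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst->addr, src->addr, DEMO_MAC_BUFLEN);
}

void demo_mac_set_raw(struct demo_mac_addr *dst,
                      const unsigned char raw[DEMO_MAC_BUFLEN])
{
    assert(dst != NULL && raw != NULL);
    memcpy(dst->addr, raw, DEMO_MAC_BUFLEN);
}

void demo_mac_get_raw(const struct demo_mac_addr *src,
                      unsigned char raw[DEMO_MAC_BUFLEN])
{
    assert(src != NULL && raw != NULL);
    memcpy(raw, src->addr, DEMO_MAC_BUFLEN);
}

/* Both comparisons follow memcmp(): 0 means the addresses are equal. */
int demo_mac_cmp(const struct demo_mac_addr *a, const struct demo_mac_addr *b)
{
    return memcmp(a->addr, b->addr, DEMO_MAC_BUFLEN);
}

int demo_mac_cmp_raw(const struct demo_mac_addr *a,
                     const unsigned char raw[DEMO_MAC_BUFLEN])
{
    return memcmp(a->addr, raw, DEMO_MAC_BUFLEN);
}
```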
If QEMU supports the BALLOON_EVENT QMP event, then we can
avoid invoking 'query-balloon' when returning XML or the
domain info.
* src/qemu/qemu_capabilities.c, src/qemu/qemu_capabilities.h:
Add QEMU_CAPS_BALLOON_EVENT
* src/qemu/qemu_driver.c: Skip query-balloon in
qemudDomainGetInfo and qemuDomainGetXMLDesc if we have
QEMU_CAPS_BALLOON_EVENT set
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h: Check
for BALLOON_EVENT at connect to monitor. Add callback
for balloon change notifications
* src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h:
Add handling of BALLOON_EVENT and impl 'query-events'
check
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This is in preparation for the enablement of s390 guests with virtio
devices.
The assignment of device addresses happens in different places, i.e. the
qemu driver and process modules as well as in the unit tests, in slightly
different flavors. Currently, these are PPC spapr-vio and PCI
devices; virtio-s390 (not PCI based) will follow.
By optionally passing to qemuDomainAssignAddresses the domain
object and the capabilities it is now possible to call the function
from most of the places (except for hotplug) where address assignment
is done.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
If no 'listen' attribute or <listen> element is set in the
guest XML, the default driver configured listen address is
used. There is no way for client applications to determine
what this address is, though. When starting the guest, we
should update the live XML to include this default listen
address
With the latest changes to qemu-ga, success of some commands is no
longer reported, e.g. guest-shutdown or guest-suspend-*. However,
errors are still being reported. Therefore, we need a different
indication of whether the operation was successful: events.
A core use case of the hook scripts is to be able to do things
to a guest's network configuration. It is possible to hook into
the 'start' operation for a QEMU guest which runs just before
the guest is started. The TAP devices will exist at this point,
but the QEMU process will not. It can be desirable to have a
'started' hook too, which runs once QEMU has started.
If libvirtd is restarted it will re-populate firewall rules,
but there is no QEMU hook to trigger for existing domains.
This is solved with a 'reconnect' hook.
Finally, if attaching to an external QEMU process there needs
to be an 'attach' hook script.
This all also applies to the LXC driver
* docs/hooks.html.in: Document new operations
* src/util/hooks.c, src/util/hooks.h: Add 'started', 'reconnect'
and 'attach' operations for QEMU. Add 'prepare', 'started',
'release' and 'reconnect' operations for LXC
* src/lxc/lxc_driver.c: Add hooks for 'prepare', 'started',
'release' and 'reconnect' operations
* src/qemu/qemu_process.c: Add hooks for 'started', 'attach'
and 'reconnect' operations
Currently, if qemuProcessStart fails at some point, e.g. because
the domain being started wants a PCI/USB device already assigned to
a different domain, we jump to the cleanup label where qemuProcessStop
is performed. This unconditionally calls virSecurityManagerRestoreAllLabel,
which is wrong because the other domain is still using those devices.
However, once we have successfully labeled all devices/paths in
qemuProcessStart(), we have to perform a rollback on any later
failure; that is, we have to call virSecurityManagerRestoreAllLabel.
When libvirtd is started and there is an unusable/not-connectable
leftover from an earlier started machine, it's more reasonable to say
that the machine "crashed" if we know it was started with
"-no-shutdown".
This patch fixes that and also changes the other result (when machine
was started without "-no-shutdown") to "unknown", because the previous
"failed" reason means (according to include/libvirt/libvirt.h.in:174),
that the machine failed to start.
As for 'static' placement, when the memory policy mode is
'strict', set the memory policy by writing the advisory nodeset
returned from numad to the cgroup file cpuset.mems.
On some NUMA platforms, the CPU indexes in each NUMA node are
consecutive, while on other platforms they can be non-consecutive,
e.g.:
% numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 131058 MB
node 0 free: 86531 MB
node 1 cpus: 1 5 9 13 17 21 25 29
node 1 size: 131072 MB
node 1 free: 127070 MB
node 2 cpus: 2 6 10 14 18 22 26 30
node 2 size: 131072 MB
node 2 free: 127758 MB
node 3 cpus: 3 7 11 15 19 23 27 31
node 3 size: 131072 MB
node 3 free: 127226 MB
node distances:
node   0   1   2   3
  0:  10  20  20  20
  1:  20  10  20  20
  2:  20  20  10  20
  3:  20  20  20  10
This patch fixes the problem by using the CPU index in
caps->host.numaCell[i]->cpus[i] to set the bitmask instead of
assuming the CPU indexes of the NUMA nodes are always sequential.
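Building the bitmask from each cell's own CPU list might look like
this. The demo_* names and the 64-bit mask are simplifications for
the example, not the actual data structures; CPU ids of 64 or more
are ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one NUMA cell: the CPU ids it contains. */
struct demo_numa_cell {
    size_t ncpus;
    const int *cpus;
};

/* Return a mask with one bit per CPU that belongs to any cell
 * selected in 'nodemask' (bit i selects cells[i]). */
uint64_t demo_cpumask_for_nodes(const struct demo_numa_cell *cells,
                                size_t ncells, uint64_t nodemask)
{
    uint64_t mask = 0;
    size_t i, j;

    assert(cells != NULL || ncells == 0);
    for (i = 0; i < ncells && i < 64; i++) {
        if (!(nodemask & (UINT64_C(1) << i)))
            continue;
        /* Use the ids recorded for the cell, never i * ncpus + j. */
        for (j = 0; j < cells[i].ncpus; j++) {
            int cpu = cells[i].cpus[j];

            if (cpu >= 0 && cpu < 64)
                mask |= UINT64_C(1) << cpu;
        }
    }
    return mask;
}
```

With the layout from the numactl output above, selecting node 1
yields CPUs 1, 5, 9, ..., 29 (mask 0x22222222) rather than a
contiguous block.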
This patch lifts the restriction that the thread detection code is
called only on KVM guests. With upstream qemu, the thread mappings are
also reported on non-KVM machines.
QEMU adopted the thread_id information from the kvm branch.
To remain compatible with older upstream versions of qemu, the check is
attempted, but a failure to detect threads (or even to run the monitor
command, on older versions without SMP support) is treated as non-fatal
and the code reports one vCPU with the pid of the hypervisor (in the
same fashion as was done on non-KVM guests).