When both kvmclock and kvm_pv_eoi are configured (either disabled or
enabled) libvirt will generate invalid CPU specification due to the
fact that even though kvmclock causes the CPU to be specified, it
doesn't set have_cpu flag to true (and the new kvm_pv_eoi as well).
This patch fixes the issue and adds a test exactly for that to show
that it is fixed correctly (and also to keep it that way in the future
of course).
(cherry picked from commit 5d692cc714)
libcurl uses a SIGALRM in combination with sigsetjmp/siglongjmp to be
able to abort a DNS lookup when it takes too long. The problem with this
in a multi-threaded application is that the signal handler for SIGALRM
and the call to siglongjmp can be executed on a thread that is different
from the one that initially did the SIGALRM setup and the call to
sigsetjmp. In the reported case this triggered a segfault.
Disable libcurl's use of signals to avoid this situation. This has the
disadvantage of losing the ability to abort synchronous DNS lookups which
might result in libcurl getting stuck in a DNS lookup in the worst case.
When libcurl was build with an asynchronous DNS backend such as c-ares
then there is no problem because the timeout mechanism works without
signals here anyway.
Reported by Benjamin Wang.
(cherry picked from commit 0821ea6b3c)
The output buffer for virFileReadAll was too small for systems with
more than 30 CPUs which leads to a log entry and incorrect behavior.
The new size will be sufficient for the current
architectural limits.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
(cherry picked from commit 4bdc8606e6)
Correct the check for the return value of virStrcpyStatic()
when copying port-profile names. Fixes Open vSwitch ports
which utilize port-profiles from network definitions.
Signed-off-by: Kyle Mestery <kmestery@cisco.com>
(cherry picked from commit 83aebf6de4)
When doing snapshots, the filesystem freeze function used the agent
entering function that expects the qemud_driver unlocked. This might
cause a deadlock of the qemu driver if the agent does not respond.
The only call path of this function has the qemud_driver locked, so this
patch changes the entering functions to those expecting the driver
locked.
(cherry picked from commit e0316b5ebd)
There was an inverted return value in lxcCgroupControllerActive().
The function assumes cgroups are active and do couple of checks
to prove that. If any of them fails, false is returned. Therefore,
at the end, after all checks are done we must return true, not false.
(cherry picked from commit 0dddd680c2)
Commit f1a43a8 missed one side of an #if/#else.
* src/util/processinfo.c (virProcessInfoGetAffinity): Use correct
bitmap operation.
(cherry picked from commit 9038ac65da)
Minimal CPU "parser" for armhf to avoid compile time warning.
Signed-off-by: Chuck Short <chuck.short@canonical.com>
(cherry picked from commit 2d0a777b3d)
In Xen 4.2, xs.h is deprecated in favor of xenstore.h. xs.h now
contains
#warning xs.h is deprecated use xenstore.h instead
#include <xenstore.h>
which fails compilation when warnings are treated as errors.
Introduce a configure-time check for xenstore.h and if found,
use it instead of xs.h.
(cherry picked from commit 416eca189b)
For historical compat we use 'itanium' as the arch name, so
if the QEMU binary suffix is 'ia64' we need to translate it
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 3887afbb6b)
If the qemuAgentClose method is called from a place which holds
the domain lock, it is theoretically possible to get a deadlock
in the agent destroy callback. This has not been observed, but
the equivalent code in the QEMU monitor destroy callback has seen
a deadlock.
Remove the redundant locking while unrefing the object and the
bogus assignment
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 362d04779c)
Some users report (very rarely) seeing a deadlock in the QEMU
monitor callbacks
Thread 10 (Thread 0x7fcd11e20700 (LWP 26753)):
#0 0x00000030d0e0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000030d0e09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
#2 0x00000030d0e09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007fcd162f416d in virMutexLock (m=<optimized out>)
at util/threads-pthread.c:85
#4 0x00007fcd1632c651 in virDomainObjLock (obj=<optimized out>)
at conf/domain_conf.c:14256
#5 0x00007fcd0daf05cc in qemuProcessHandleMonitorDestroy (mon=0x7fcccc0029e0,
vm=0x7fcccc00a850) at qemu/qemu_process.c:1026
#6 0x00007fcd0db01710 in qemuMonitorDispose (obj=0x7fcccc0029e0)
at qemu/qemu_monitor.c:249
#7 0x00007fcd162fd4e3 in virObjectUnref (anyobj=<optimized out>)
at util/virobject.c:139
#8 0x00007fcd0db027a9 in qemuMonitorClose (mon=<optimized out>)
at qemu/qemu_monitor.c:860
#9 0x00007fcd0daf61ad in qemuProcessStop (driver=driver@entry=0x7fcd04079d50,
vm=vm@entry=0x7fcccc00a850,
reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED, flags=flags@entry=0)
at qemu/qemu_process.c:4057
#10 0x00007fcd0db323cf in qemuDomainDestroyFlags (dom=<optimized out>,
flags=<optimized out>) at qemu/qemu_driver.c:1977
#11 0x00007fcd1637ff51 in virDomainDestroyFlags (
domain=domain@entry=0x7fccf00c1830, flags=1) at libvirt.c:2256
At frame #10 we are holding the domain lock, we call into
qemuProcessStop() to cleanup QEMU, which triggers the monitor
to close, which invokes qemuProcessHandleMonitorDestroy() which
tries to obtain the domain lock again. This is a non-recursive
lock, hence hang.
Since qemuMonitorPtr is a virObject, the unref call in
qemuProcessHandleMonitorDestroy no longer needs mutex
protection. The assignment of priv->mon = NULL, can be
instead done by the caller of qemuMonitorClose(), thus
removing all need for locking.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 25f582e36a)
If QEMU quits immediately after we opened the monitor it was
possible we would skip the clearing of the SELinux process
socket context
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 0b62c0736a)
When calling qemuProcessKill from the virDomainDestroy impl
in QEMU, do not ignore the return value. This ensures that
if QEMU fails to respond to SIGKILL, the caller will know
about the failure.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit f1b4021b38)
Depending on the scenario in which LXC containers exit, it is
possible for the EOF callback of the LXC monitor to deadlock
the driver.
#0 0x00000038a0a0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000038a0a09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
#2 0x00000038a0a09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007f4bd9579d55 in virMutexLock (m=<optimized out>) at util/threads-pthread.c:85
#4 0x00007f4bcacc7597 in lxcDriverLock (driver=0x7f4bc40c8290) at lxc/lxc_conf.h:81
#5 virLXCProcessMonitorEOFNotify (mon=<optimized out>, vm=0x7f4bb4000b00) at lxc/lxc_process.c:581
#6 0x00007f4bd9645c91 in virNetClientCloseLocked (client=client@entry=0x7f4bb4009e60)
at rpc/virnetclient.c:554
#7 0x00007f4bd96460f8 in virNetClientIOEventLoopPassTheBuck (thiscall=0x0, client=0x7f4bb4009e60)
at rpc/virnetclient.c:1306
#8 virNetClientIOEventLoopPassTheBuck (client=0x7f4bb4009e60, thiscall=0x0)
at rpc/virnetclient.c:1287
#9 0x00007f4bd96467a2 in virNetClientCloseInternal (reason=3, client=0x7f4bb4009e60)
at rpc/virnetclient.c:589
#10 virNetClientCloseInternal (client=0x7f4bb4009e60, reason=3) at rpc/virnetclient.c:561
#11 0x00007f4bcacc4a82 in virLXCMonitorClose (mon=0x7f4bb4000a00) at lxc/lxc_monitor.c:201
#12 0x00007f4bcacc55ac in virLXCProcessCleanup (reason=<optimized out>, vm=0x7f4bb4000b00,
driver=0x7f4bc40c8290) at lxc/lxc_process.c:240
#13 virLXCProcessStop (driver=0x7f4bc40c8290, vm=vm@entry=0x7f4bb4000b00,
reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:735
#14 0x00007f4bcacc5bd2 in virLXCProcessAutoDestroyDom (payload=<optimized out>,
name=0x7f4bb4003c80, opaque=0x7fff41af2df0) at lxc/lxc_process.c:94
#15 0x00007f4bd9586649 in virHashForEach (table=0x7f4bc409b270,
iter=iter@entry=0x7f4bcacc5ab0 <virLXCProcessAutoDestroyDom>, data=data@entry=0x7fff41af2df0)
at util/virhash.c:514
#16 0x00007f4bcacc52d7 in virLXCProcessAutoDestroyRun (driver=driver@entry=0x7f4bc40c8290,
conn=conn@entry=0x7f4bb8000ab0) at lxc/lxc_process.c:120
#17 0x00007f4bcacca628 in lxcClose (conn=0x7f4bb8000ab0) at lxc/lxc_driver.c:128
#18 0x00007f4bd95e67ab in virReleaseConnect (conn=conn@entry=0x7f4bb8000ab0) at datatypes.c:114
When the driver calls virLXCMonitorClose, there is really no
need for the EOF callback to be invoked in this case, since
the caller can easily handle events itself. In changing this,
the monitor needs to take a deep copy of the callback list,
not merely a reference.
Also adds debug statements in various places to aid
troubleshooting
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 36c1fc189d)
Xen upstream c/s 24102:dc8e55c9 bumped the sysctl version to 9.
Support this sysctl version in the xen_hypervisor sub-driver.
(cherry picked from commit 371ddc9866)
Jim Fehlig reported a compilation error with older gcc 4.3.4:
libvirt.c: In function 'virDomainGetEmulatorPinInfo':
libvirt.c:9111: error: logical '&&' with non-zero constant will always evaluate as true [-Wlogical-op]
It looks like someone programmed via too much copy-and-paste.
* src/libvirt.c (virDomainGetEmulatorPinInfo): Multiplying by 1 is
a no-op, and thus will never overflow.
(cherry picked from commit 3da355e8c4)
SELinux wants all log files opened with O_APPEND. When
running non-root though, libvirtd likes to use O_TRUNC
to avoid log files growing in size indefinitely. Instead
of using O_TRUNC though, we can use O_APPEND and then
call ftruncate() which keeps SELinux happier.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 639d5c4966)
There is no need to hold the mutex when unref'ing
virObject instances
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 7307c3c00c)
Asynchronously setting priv->mon to NULL was pointless,
just remove the destroy callback entirely.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit dd0371764f)
Remove custom reference counting from virLXCMonitor, using
virObject instead
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 09e0cb4218)
Continue consolidation of process functions by moving some
helpers out of command.{c,h} into virprocess.{c,h}
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 9467ab6074)
There are a number of process related functions spread
across multiple files. Start to consolidate them by
creating a virprocess.{c,h} file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit e5e2b65cf8)
The virCommand prefix was inappropriate because the API
does not use any virCommandPtr object instance. This
API closely related to waitpid/exit, so use virProcess
as the prefix
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 49ecf8b41f)
Change "Pid" to "Process" to align with the virProcessKill
API naming prefix
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit 0fb58ef5cd)
Changing naming to follow the convention of "object" followed
by "action"
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit cf470068a1)
A prefix change to unmount the SELinux filesystem broke starting
of LXC containers with a custom root filesystem
(cherry picked from commit 1532bd498a)
Every level of the code for virNetworkUpdate was assuming that some
other level was checking for validity of the "command" arg, but none
actually were. The result was that an invalid command code would do
nothing, but also report success.
Since the command code isn't used until the very lowest level backend
functions, that's where I put the check. I made a separate one-line
function to log the error. The compiler would have combined the
identical strings used by multiple calls if I'd just called
virReportError directly in each location, but sending them all to the
same string in the source guards against inadvertant divergence (which
would lead to extra work for translators.)
1) virNetworkObjUpdate should be an all or none operation, but in the
case that we want to update both the live state and persistent config
versions of the network, it was committing the update to the live
state before starting to update the persistent config. If update of
the persistent config failed, we would leave with things in an
inconsistent state - the live state would be updated (even though an
error was returned), but persistent config unchanged.
This patch changed virNetworkObjUpdate to use a separate pointer for
each copy of the virNetworkDef, and not commit either of them in the
virNetworkObj until both live and config parts of the update have
successfully completed.
2) The parsers for various pieces of the virNetworkDef have all sorts
of subtle limitations on them that may not be known by the
Update[section] function, making it possible for one of these
functions to make a modification directly to the object that may not
pass the scrutiny of a subsequent parse. But normally another parse
wouldn't be done on the data until the *next* time the object was
updated (which could leave the network definition in an unusable
state).
Rather than fighting the losing battle of trying to duplicate all the
checks from the parsers into the update functions as well, the more
foolproof solution to this is to simply do an extra
virNetworkDefCopy() operation on the updated networkdef -
virNetworkDefCopy() does a virNetworkFormat() followed by a
virNetworkParseString(), so it will do all the checks we need. If this
fails, then we don't commit the changed def.
The bridge driver implementation of virNetworkUpdate() removes and
re-adds iptables rules any time a network has an <ip>, <forward>, or
<forward>/<interface> element updated. There are some types of
networks that have those elements and yet have no iptables rules
associated with them, and unfortunately the functions that remove/add
iptables rules don't check the type of network before attempting to
remove/add the rules, sometimes leading to an erroneous failure of the
entire update operation.
Under normal circumstances I would refactor the lower level functions
to be more robust, but to avoid code churn as much as possible, I've
just added extra checks directly to networkUpdate().
Nothing uses the return value, and creating it requries otherwise
unnecessary strlen () calls.
This cleanup is conceptually independent from the rest of the series
(although the later patches won't apply without it). This just seems
a good opportunity to clean this up, instead of entrenching the unnecessary
return value in the virLogOutputFunc instance that will be added in this
series.
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
maxcpu and hostcpus are defined and calculated in qemudDomainPinVcpuFlags()
and qemudDomainPinEmulator(), but never used. So remove them including nodeinfo.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
This allows the user to control labelling of each character device
separately (the default is to inherit from the VM).
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
This is just code motion, allowing us to reuse the same function to
parse the <seclabel> from character devices too.
However it also fixes a possible segfault in the original code if
VIR_ALLOC_N returns an error and the cleanup code (at the error:
label) tries to iterate over the unallocated array (thanks Michal
Privoznik for spotting this).
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Disk hotplug is a two phase action: qemuMonitorAddDrive followed by
qemuMonitorAddDevice. When the first part succeeds but the second one
fails, we need to rollback the drive addition.
The README file seems to be a leftover from some previous version of
locking driver. It is not consistent with what the code does nor is it
consistent with existing documentation in internals/locking.html.
Some kernel versions (at least RHEL-6 2.6.32) do not let you over-mount
an existing selinuxfs instance with a new one. Thus we must unmount the
existing instance inside our namespace.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When auto-probing hypervisor drivers, the conn->uri field will
initially be NULL. Care must be taken not to access members
when doing auth lookups in the config file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
portgroup elements are located in the toplevel of <network>
objects. There can be multiple <portgroup> elements, and they each
have a unique name attribute.
Add, delete, and modify are all supported for portgroup. When deleting
a portgroup, only the name must be specified in the provided xml - all
other attributes and subelements are ignored for the purposes of
matching and existing portgroup.
The bridge driver and virsh already know about the portgroup element,
so providing this backend should cause the entire stack to work. Note
that in the case of portgroup, there is no external daemon based on
the portgroup config, so nothing must be restarted.
It is important to note that guests make a copy of the appropriate
network's portgroup data when they are started, so although an updated
portgroup's configuration will have an affect on new guests started
after the cahange, existing guests won't magically have their
bandwidth changed, for example. If something like that is desired, it
will take a lot of redesign work in the way network devices are setup
(there is currently no link from the network back to the individual
interfaces using it, much less from a portgroup within a network back
to the individual interfaces).
The dhcp range element is contained in the <dhcp> element of one of a
network's <ip> elements. There can be multiple <range>
elements. Because there are only two attributes (start and end), and
those are exactly what you would use to identify a particular range,
it doesn't really make sense to modify an existing element, so
VIR_NETWORK_UPDATE_COMMAND_MODIFY isn't supported for this section,
only ADD_FIRST, ADD_LAST, and DELETE.
Since virsh already has support for understanding all the defined
sections, this new backend is automatically supported by virsh. You
would use it like this:
virsh net-update mynet add ip-dhcp-range \
"<range start='1.2.3.4' end='1.2.3.20'/>" --live --config
The bridge driver also already supports all sections, so it's doing
the correct thing in this case as well - since the dhcp range is
placed on the dnsmasq commandline, the bridge driver recreates the
dnsmasq commandline, and re-runs dnsmasq whenever a range is
added/deleted (and AFFECT_LIVE is specified in the flags).
https://www.gnu.org/licenses/gpl-howto.html recommends that
the 'If not, see <url>.' phrase be a separate sentence.
* tests/securityselinuxhelper.c: Remove doubled line.
* tests/securityselinuxtest.c: Likewise.
* globally: s/; If/. If/
The "dump-guest-core' option is new option for the machine type
(-machine pc,dump-guest-core) that controls whether the guest memory
will be marked as dumpable.
While testing this, I've found out that the value for the '-M' options
is not parsed correctly when additional parameters are used. However,
when '-machine' is used for the same options, it gets parsed as
expected. That's why this patch also modifies the parsing and creating
of the command line, so both '-M' and '-machine' are recognized. In
QEMU's help there is only mention of the 'machine parameter now with
no sign of the older '-M'.
Sometimes when guest machine crashes, coredump can get huge due to the
guest memory. This can be limited using madvise(2) system call and is
being used in QEMU hypervisor. This patch adds an option for configuring
that in the domain XML and related documentation.