Using 'unsigned long' for memory values is risky on 32-bit platforms,
as a PAE guest can have more than 4GiB memory. Our API is
(unfortunately) locked at 'unsigned long' and a scale of 1024, but
the rest of our system should consistently use 64-bit values,
especially since the previous patch centralized overflow checking.
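For illustration only (not code from this patch), the kind of silent
truncation this avoids on a 32-bit build:

    unsigned long mem_kb = 64UL * 1024 * 1024;      /* 64 GiB expressed in KiB */
    unsigned long bad = mem_kb * 1024;              /* wraps: unsigned long is 32 bits here */
    unsigned long long ok = (unsigned long long) mem_kb * 1024;  /* 64-bit math stays correct */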
* src/conf/domain_conf.h (_virDomainDef): Always use 64-bit values
for memory. Change hugepage_backed to a bool.
* src/conf/domain_conf.c (virDomainDefParseXML)
(virDomainDefCheckABIStability, virDomainDefFormatInternal): Fix
clients.
* src/vmx/vmx.c (virVMXFormatConfig): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxpr, xenFormatSxpr): Likewise.
* src/xenxs/xen_xm.c (xenXMConfigGetULongLong): New function.
(xenXMConfigGetULong, xenXMConfigSetInt): Avoid truncation.
(xenParseXM, xenFormatXM): Fix clients.
* src/phyp/phyp_driver.c (phypBuildLpar): Likewise.
* src/openvz/openvz_driver.c (openvzDomainSetMemoryInternal):
Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainDefineXML): Likewise.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Likewise.
* src/qemu/qemu_process.c (qemuProcessStart): Likewise.
* src/qemu/qemu_monitor.h (qemuMonitorGetBalloonInfo): Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetInfo)
(qemuDomainGetXMLDesc): Likewise.
* src/uml/uml_conf.c (umlBuildCommandLine): Likewise.
No thanks to 64-bit Windows and its 64-bit pid_t, we have to avoid
constructs like 'int pid'. Our API in libvirt-qemu cannot be
changed without breaking ABI; but then again, libvirt-qemu can
only be used on systems that support UNIX sockets, which rules
out Windows (even if qemu could be compiled there) - so at every
point on the call chain that interacts with this API decision,
we require a different variable name to make it clear that we
audited the use for safety.
Adding a syntax-check rule only solves half the battle; anywhere
that uses printf on a pid_t still needs to be converted, but that
will be a separate patch.
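For reference, the usual safe pattern (a sketch, not necessarily the exact
code in this patch) is to widen before printing and to range-check before
narrowing a pid into the int-sized API field:

    #include <stdio.h>
    #include <sys/types.h>

    static int
    pid_to_api_int(pid_t pid)
    {
        /* printing: widen explicitly so the format string is right everywhere */
        printf("attached to pid %lld\n", (long long) pid);

        /* narrowing into the locked-in 'int' API field: audit for overflow */
        if (pid != (pid_t)(int) pid)
            return -1;            /* value would not round-trip through the API type */
        return (int) pid;
    }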
* cfg.mk (sc_correct_id_types): New syntax check.
* src/libvirt-qemu.c (virDomainQemuAttach): Document why we didn't
use pid_t for pid, and validate for overflow.
* include/libvirt/libvirt-qemu.h (virDomainQemuAttach): Tweak name
for syntax check.
* src/vmware/vmware_conf.c (vmwareExtractPid): Likewise.
* src/driver.h (virDrvDomainQemuAttach): Likewise.
* tools/virsh.c (cmdQemuAttach): Likewise.
* src/remote/qemu_protocol.x (qemu_domain_attach_args): Likewise.
* src/qemu_protocol-structs (qemu_domain_attach_args): Likewise.
* src/util/cgroup.c (virCgroupPidCode, virCgroupKillInternal):
Likewise.
* src/qemu/qemu_command.c (qemuParseProcFileStrings): Likewise.
(qemuParseCommandLinePid): Use pid_t for pid.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* src/conf/domain_conf.h (_virDomainObj): Likewise.
* src/probes.d (rpc_socket_new): Likewise.
* src/qemu/qemu_command.h (qemuParseCommandLinePid): Likewise.
* src/qemu/qemu_driver.c (qemudGetProcessInfo, qemuDomainAttach):
Likewise.
* src/qemu/qemu_process.c (qemuProcessAttach): Likewise.
* src/qemu/qemu_process.h (qemuProcessAttach): Likewise.
* src/uml/uml_driver.c (umlGetProcessInfo): Likewise.
* src/util/virnetdev.h (virNetDevSetNamespace): Likewise.
* src/util/virnetdev.c (virNetDevSetNamespace): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput): Likewise.
* src/conf/storage_conf.h (_virStoragePerms): Use mode_t, uid_t,
and gid_t rather than int.
* src/security/security_dac.c (virSecurityDACSetOwnership): Likewise.
* src/conf/storage_conf.c (virStorageDefParsePerms): Avoid
compiler warning.
* src/qemu/qemu_process.c (qemuFindAgentConfig): Avoid crashing
libvirtd due to dereferencing a NULL pointer.
* How to reproduce?
1. virsh edit the following xml into guest configuration:
<channel type='pty'>
<target type='virtio'/>
</channel>
2. virsh start <domain>
or
% virt-install -n foo -r 1024 --disk path=/var/lib/libvirt/images/foo.img,size=1 \
--channel pty,target_type=virtio -l <installation tree>
Signed-off-by: Alex Jia <ajia@redhat.com>
This patch allows libvirt to add interfaces to already
existing Open vSwitch bridges. The following syntax can be
used in the domain XML file:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'/>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
or, if libvirt should auto-generate the interfaceid, use the
following syntax:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
It is also possible to pass an optional profileid. To do that,
use the following syntax:
<interface type='bridge'>
<source bridge='ovsbr'/>
<mac address='00:55:1a:65:a2:8d'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'
profileid='test-profile'/>
</virtualport>
</interface>
To create an Open vSwitch bridge, install Open vSwitch and
run the following command:
ovs-vsctl add-br ovsbr
The current default method of terminating the qemu process is to send
a SIGTERM, wait for up to 1.6 seconds for it to cleanly shutdown, then
send a SIGKILL and wait for up to 1.4 seconds more for the process to
terminate. This is problematic because occasionally 1.6 seconds is not
long enough for the qemu process to flush its disk buffers, so the
guest's disk ends up in an inconsistent state.
Since this only occasionally happens when the timeout prior to SIGKILL
is 1.6 seconds, this patch increases that timeout to 10 seconds. At
the very least, this should reduce the occurrence from "occasionally"
to "extremely rarely". (Once SIGKILL is sent, it waits another 5
seconds for the process to die before returning).
Note that in the cases where it takes less than this for qemu to
shutdown cleanly, libvirt will *not* wait for any longer than it would
without this patch - qemuProcessKill polls the process and returns as
soon as it is gone.
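In outline, the termination logic amounts to a loop like this (a simplified
sketch with illustrative timeouts, not the actual qemuProcessKill code):

    #include <errno.h>
    #include <signal.h>
    #include <stdbool.h>
    #include <unistd.h>

    static bool
    kill_with_timeout(pid_t pid)
    {
        for (int i = 0; i < 75; i++) {            /* 75 * 200ms = 15s total */
            int sig = 0;                          /* 0 just probes for existence */
            if (i == 0)
                sig = SIGTERM;                    /* polite request first */
            else if (i == 50)
                sig = SIGKILL;                    /* escalate after ~10 seconds */
            if (kill(pid, sig) < 0 && errno == ESRCH)
                return true;                      /* process is gone, return immediately */
            usleep(200 * 1000);
        }
        return false;                             /* still running after SIGKILL + ~5s */
    }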
This patch is based on an earlier patch by Eric Blake which was never
committed:
https://www.redhat.com/archives/libvir-list/2011-November/msg00243.html
Aside from rebasing, this patch only drops the driver lock once (prior
to the first time the function sleeps), then leaves it dropped until
it returns (Eric's patch would drop and re-acquire the lock around
each call to sleep).
At the time Eric sent his patch, the response (from Dan Berrange) was
that, while it wasn't a good thing to be holding the driver lock while
sleeping, we really need to rethink locking wrt the driver object,
switching to a finer-grained approach that locks individual items
within the driver object separately to allow for greater concurrency.
This is a good plan, and at the time it made sense to not apply the
patch because there was no known bug related to the driver lock being
held in this function.
However, we now know that the length of the wait in qemuProcessKill is
sometimes too short to allow the qemu process to fully flush its disk
cache before SIGKILL is sent, so we need to lengthen the timeout (in
order to improve the situation with management applications until they
can be updated to use the new VIR_DOMAIN_DESTROY_GRACEFUL flag added
in commit 72f8a7f19753506ed957b78ad800c0f3892c9304). But, if we
lengthen the timeout, we also lengthen the amount of time that all
other threads in libvirtd are essentially blocked from doing anything
(since just about everything needs to acquire the driver lock, if only
for long enough to get a pointer to a domain).
The solution is to modify qemuProcessKill to drop the driver lock
while sleeping, as proposed in Eric's patch. Then we can increase the
timeout with a clear conscience, and thus at least lower the chances
that someone running with existing management software will suffer the
consequences of qemu's disk cache not being flushed.
In the meantime, we still should work on Dan's proposal to make
locking within the driver object more fine grained.
(NB: although I couldn't find any instance where qemuProcessKill() was
called with no jobs active for the domain (or some other guarantee
that the current thread had at least one refcount on the domain
object), this patch still follows Eric's method of temporarily adding
a ref prior to unlocking the domain object, because I couldn't
convince myself 100% that this was the case.)
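In pseudo-code, the pattern being followed is roughly this (function names are
illustrative of the libvirt helpers of that era, not the literal diff):

    virDomainObjRef(vm);            /* keep vm alive while it is unlocked */
    virDomainObjUnlock(vm);
    qemuDriverUnlock(driver);       /* other threads can now make progress */

    qemuProcessKill(vm);            /* may sleep for many seconds while polling */

    qemuDriverLock(driver);
    virDomainObjLock(vm);           /* re-acquire before touching vm again */
    if (virDomainObjUnref(vm) == 0)
        vm = NULL;                  /* last reference went away while we slept */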
In the future (my next patch in fact) we may want to make
decisions depending on qemu having a monitor command or not.
Therefore, we want to set a qemuCaps flag instead of querying
the monitor each time we are about to make that decision.
When libvirt's virDomainDestroy API is shutting down the qemu process,
it first sends SIGTERM, then waits for 1.6 seconds and, if it sees the
process still there, sends a SIGKILL.
There have been reports that this behavior can lead to data loss
because the guest running in qemu doesn't have time to flush its disk
cache buffers before it's unceremoniously whacked.
This patch maintains that default behavior, but provides a new flag
VIR_DOMAIN_DESTROY_GRACEFUL to alter the behavior. If this flag is set
in the call to virDomainDestroyFlags, SIGKILL will never be sent to
the qemu process; instead, if the timeout is reached and the qemu
process still exists, virDomainDestroy will return an error.
Once this patch is in, the recommended method for applications to call
virDomainDestroyFlags will be with VIR_DOMAIN_DESTROY_GRACEFUL
included. If that fails, then the application can decide if and when
to call virDomainDestroyFlags again without
VIR_DOMAIN_DESTROY_GRACEFUL (to force the issue with SIGKILL).
(Note that this does not address the issue of existing applications
that have not yet been modified to use VIR_DOMAIN_DESTROY_GRACEFUL.
That is a separate patch.)
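The expected application-side pattern then looks like this (minimal usage
sketch):

    /* try the graceful path first; qemu gets time to flush its buffers */
    if (virDomainDestroyFlags(dom, VIR_DOMAIN_DESTROY_GRACEFUL) < 0) {
        /* the guest outlived the timeout; the application decides if and
         * when to force the issue, which falls back to SIGKILL */
        virDomainDestroyFlags(dom, 0);
    }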
This patch revises the qemuProcessStart() function so that qemu
processes retain CAP_SYS_RAWIO if needed.
In that case, a taint flag is added to the domain.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Shota Hirae <m11g1401@hibikino.ne.jp>
There is now a standard QEMU guest agent that can be installed
and given a virtio serial channel
<channel type='unix'>
<source mode='bind' path='/var/lib/libvirt/qemu/f16x86_64.agent'/>
<target type='virtio' name='org.qemu.guest_agent.0'/>
</channel>
The protocol that runs over the guest agent is JSON based and
very similar to the JSON monitor. We can't use exactly the same
code because there are some odd differences in the way messages
and errors are structured. The qemu_agent.c file is based on
a combination and simplification of qemu_monitor.c and
qemu_monitor_json.c
* src/qemu/qemu_agent.c, src/qemu/qemu_agent.h: Support for
talking to the agent for shutdown
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add thread
helpers for talking to the agent
* src/qemu/qemu_process.c: Connect to agent whenever starting
a guest
* src/qemu/qemu_monitor_json.c: Make variable static
When sVirt is integrated with the LXC driver, it will be necessary
to invoke the security driver APIs using only a virDomainDefPtr,
since the lxc_container.c code has no virDomainObjPtr available.
Aside from two functions which want obj->pid, every bit of the
security driver code only touches obj->def. So we don't need to
pass a virDomainObjPtr into the security drivers, a virDomainDefPtr
is sufficient. Two functions also gain a 'pid_t pid' argument.
* src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/security/security_apparmor.c,
src/security/security_dac.c,
src/security/security_driver.h,
src/security/security_manager.c,
src/security/security_manager.h,
src/security/security_nop.c,
src/security/security_selinux.c,
src/security/security_stack.c: Change all security APIs to use a
virDomainDefPtr instead of virDomainObjPtr
This *kind of* addresses:
https://bugzilla.redhat.com/show_bug.cgi?id=772395
(it doesn't eliminate the failure to start, but causes libvirt to give
a better idea about the cause of the failure).
If a guest uses a kvm emulator (e.g. /usr/bin/qemu-kvm) and the guest
is started when kvm isn't available (either because virtualization is
unavailable / has been disabled in the BIOS, or the kvm modules
haven't been loaded for some reason), a semi-cryptic error message is
logged:
libvirtError: internal error Child process (LC_ALL=C
PATH=/sbin:/usr/sbin:/bin:/usr/bin /usr/bin/qemu-kvm -device ? -device
pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?) status
unexpected: exit status 1
This patch notices at process start that a guest needs kvm, and checks
for the presence of /dev/kvm (a reasonable indicator that kvm is
available) before trying to execute the qemu binary. If kvm isn't
available, a more useful (too verbose??) error is logged.
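Conceptually the check is a one-liner (sketch; the real code reports a
detailed libvirt error rather than this comment):

    if (def->virtType == VIR_DOMAIN_VIRT_KVM && access("/dev/kvm", F_OK) != 0) {
        /* report "KVM is required but /dev/kvm is not available" and abort startup */
    }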
This patch adds a max_files option to qemu.conf which can be used to
override the system default limit on the number of open files
allowed for the qemu user.
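For example (the value shown is only illustrative, not a recommended default):

    # /etc/libvirt/qemu.conf
    max_files = 32768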
When destroying a domain qemuDomainDestroy kills its qemu process and
starts a new job, which means it unlocks the domain object and locks it
again after some time. Although the object is usually unlocked for a
pretty short time, chances are another thread processing an EOF event on
qemu monitor is able to lock the object first and does all the cleanup
by itself. This leads to a wrong shutoff reason and lifecycle event
detail, and to the virDomainDestroy API incorrectly reporting a failure
to destroy an inactive domain.
Reported by Charlie Smurthwaite.
A preparatory patch for DHCP snooping where we want to be able to
identify a VM's interface using the tuple of
<VM UUID, Interface MAC address>. We assume that MAC addresses could
possibly be re-used between different networks (VLANs) and thus do not
want to rely only on the MAC address to identify an interface.
At the current 'final destination' in virNWFilterInstantiate I am leaving
the vmuuid parameter as ATTRIBUTE_UNUSED until the DHCP snooping patches arrive.
(we may not post the DHCP snooping patches for 0.9.9, though)
Mostly this is a pretty trivial patch. On the lowest layers, in lxc_driver
and uml_conf, I am passing the virDomainDefPtr around until I am passing
only the VM's uuid into the NWFilter calls.
This patch cleans up return codes in the nwfilter subsystem.
Some functions in nwfilter_conf.c (validators and formatters) are
keeping their bool return for now and I am converting their return
code to true/false.
All other functions now have failure return codes of -1 and success
of 0.
[I searched for all occurrences of ' 1;' and checked all 'if ' and
adapted where needed. After that I did a grep for 'NWFilter' in the source
tree.]
With security_driver set to "none" in /etc/libvirt/qemu.conf,
libvirtd would crash when attempting to attach to an existing
qemu process. Only copy the security model if it actually exists.
During virDomainDestroy, QEMU may emit a SHUTDOWN event as a response to
SIGTERM, and since the domain object is still locked, the event is processed
after the domain is destroyed. We need to ignore this event in such a case
to avoid changing domain state from shutoff to shutdown.
When a QEMU guest finishes its shutdown sequence, qemu stops virtual CPUs
and, when started with -no-shutdown, waits for us to kill it using
SIGTERM. Since QEMU is flushing its internal buffers, some time may pass
before QEMU actually dies. We mistakenly used "paused" state (and
events) for this which is quite confusing since users may see a domain
going to pause while they expect it to shut down. Since we already have
"shutdown" state with "the domain is being shut down" semantics, we
should use it for this state.
However, the state didn't have a corresponding event, so I created one
and called its detail VIR_DOMAIN_EVENT_SHUTDOWN_FINISHED (guest OS
finished its shutdown sequence), with the intent to add
VIR_DOMAIN_EVENT_SHUTDOWN_STARTED in the future if we have a
sufficiently capable guest agent that can notify us when the guest OS
starts to shut down.
The virTimestamp and virTimeMs functions in src/util/util.h
duplicate functionality from virtime.h, in a non-async signal
safe manner. Remove them, and convert all code over to the new
APIs.
* src/util/util.c, src/util/util.h: Delete virTimeMs and virTimestamp
* src/lxc/lxc_driver.c, src/qemu/qemu_domain.c,
src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c, src/util/event_poll.c: Convert to use
virtime APIs
Now that we support multiple consoles per domain,
vm->def->console[0] can still remain an alias
for vm->def->serial[0]; however, we need to copy
its source definition as well, otherwise we'll regress
on virDomainOpenConsole.
Rename virNetDevMacVLanCreate to virNetDevMacVLanCreateWithVPortProfile
and virNetDevMacVLanDelete to virNetDevMacVLanDeleteWithVPortProfile,
to make way for renaming the other macvlan creation APIs in
interface.c.
* util/virnetdevmacvlan.c, util/virnetdevmacvlan.h,
qemu/qemu_command.c, qemu/qemu_hotplug.c, qemu/qemu_process.c:
Rename APIs
Rename the macvtap.c file to virnetdevmacvlan.c to reflect its
functionality. Move the port profile association code out into
virnetdevvportprofile.c. Make the APIs available unconditionally
to callers
* src/util/macvtap.h: rename to src/util/virnetdevmacvlan.h,
* src/util/macvtap.c: rename to src/util/virnetdevmacvlan.c
* src/util/virnetdevvportprofile.c, src/util/virnetdevvportprofile.h:
Pull in vport association code
* src/Makefile.am, src/conf/domain_conf.h, src/qemu/qemu_conf.c,
src/qemu/qemu_conf.h, src/qemu/qemu_driver.c: Update include
paths & remove conditional compilation
In preparation for code re-organization, rename the Macvtap
management APIs to have the following patterns
virNetDevMacVLanXXXXX - macvlan/macvtap interface management
virNetDevVPortProfileXXXX - virtual port profile management
* src/util/macvtap.c, src/util/macvtap.h: Rename APIs
* src/conf/domain_conf.c, src/network/bridge_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_command.h,
src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/qemu/qemu_process.h: Update for renamed APIs
While Xen only has a single paravirt console, UML and
QEMU both support multiple paravirt consoles. The LXC
driver can also be trivially made to support multiple
consoles. This patch extends the XML to allow multiple
<console> elements in the XML. It also makes the UML
and QEMU drivers support this config.
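For example, a guest can now carry more than one console (illustrative
snippet):

    <console type='pty'>
      <target type='virtio' port='0'/>
    </console>
    <console type='pty'>
      <target type='virtio' port='1'/>
    </console>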
* src/conf/domain_conf.c, src/conf/domain_conf.h: Allow
multiple <console> devices
* src/lxc/lxc_driver.c, src/xen/xen_driver.c,
src/xenxs/xen_sxpr.c, src/xenxs/xen_xm.c: Update for
internal API changes
* src/security/security_selinux.c, src/security/virt-aa-helper.c:
Only label consoles that aren't a copy of the serial device
* src/qemu/qemu_command.c, src/qemu/qemu_driver.c,
src/qemu/qemu_process.c, src/uml/uml_conf.c,
src/uml/uml_driver.c: Support multiple console devices
* tests/qemuxml2xmltest.c, tests/qemuxml2argvtest.c: Extra
tests for multiple virtio consoles. Set QEMU_CAPS_CHARDEV
for all console/channel tests
* tests/qemuxml2argvdata/qemuxml2argv-channel-virtio-auto.args,
tests/qemuxml2argvdata/qemuxml2argv-channel-virtio.args
tests/qemuxml2argvdata/qemuxml2argv-console-virtio.args: Update
for correct chardev syntax
* tests/qemuxml2argvdata/qemuxml2argv-console-virtio-many.args,
tests/qemuxml2argvdata/qemuxml2argv-console-virtio-many.xml: New
test file
Rather than making all clients of monitor commands that are JSON-only
check whether yajl support was compiled in, it is simpler to just
avoid setting the capability bit up front if we can't use the capability.
* src/qemu/qemu_capabilities.c (qemuCapsComputeCmdFlags): Only set
capability bit if we also have yajl library to use it.
* src/qemu/qemu_driver.c (qemuDomainReboot): Drop #ifdefs.
* src/qemu/qemu_process.c (qemuProcessStart): Likewise.
* tests/qemuhelptest.c (testHelpStrParsing): Pass test even
without yajl.
* tests/qemuxml2argvtest.c (mymain): Simplify use of json flag.
* tests/qemuxml2argvdata/qemuxml2argv-disk-drive-error-*.args:
Update expected results to match.
If a disk source gets dropped because it is not accessible, a
management application might want to be informed about this. Therefore
we need to emit an event. The event presented in this patch is,
however, a bit of a superset of what is written above. The reason is
simple: the intention is for it to be easily expanded, e.g. to 'user
ejected disk in guest' events. Therefore, the callback gets the source
string, the disk alias (which should be unique within a domain), and a
reason (an integer).
This patch implements the on_missing feature in the qemu driver.
Upon qemu startup, the accessibility of CDROMs
and floppy disks is checked. The source might get dropped
if it is unavailable and on_missing is set accordingly.
No event is emitted, though; look for a follow-up patch.
This patch is rather cosmetic as it only moves device alias
assignment from command line construction to just before it.
However, it is needed in conjunction with the previous and next patches.
If the daemon is restarted and we reconnect to the monitor, cdrom media
may have been ejected. In that case we don't want to show it in the
domain XML, or require it on the migration destination.
To check disk status, use the 'info block' monitor command.
If a domain started with -no-shutdown shuts down while libvirtd is not
running, it will be seen as paused when libvirtd reconnects to it. Use
the paused reason to detect if a domain was stopped because of shutdown
and finish the process just as if a SHUTDOWN event is delivered from
qemu.
Commit 282fe1f0 documented that transient domains will auto-delete
any snapshot metadata when the last reference to the domain is
removed, and that management apps are in charge of grabbing any
snapshot metadata prior to that point. However, this was not
actually implemented for qemu until now.
* src/qemu/qemu_driver.c (qemudDomainCreate)
(qemuDomainDestroyFlags, qemuDomainSaveInternal)
(qemudDomainCoreDump, qemuDomainRestoreFlags, qemudDomainDefine)
(qemuDomainUndefineFlags, qemuDomainMigrateConfirm3)
(qemuDomainRevertToSnapshot): Clean up snapshot metadata.
* src/qemu/qemu_migration.c (qemuMigrationPrepareAny)
(qemuMigrationPerformJob, qemuMigrationPerformPhase)
(qemuMigrationFinish): Likewise.
* src/qemu/qemu_process.c (qemuProcessHandleMonitorEOF)
(qemuProcessReconnect, qemuProcessReconnectHelper)
(qemuProcessAutoDestroyDom): Likewise.
This patch is mostly code motion - moving some functions out
of qemu_driver and into qemu_domain so they can be reused by
multiple qemu_* files (since qemu_driver.h must not grow).
It also adds a new helper function, qemuDomainRemoveInactive,
which will be used in the next patch.
* src/qemu/qemu_domain.h (qemuFindQemuImgBinary)
(qemuDomainSnapshotWriteMetadata, qemuDomainSnapshotForEachQcow2)
(qemuDomainSnapshotDiscard, qemuDomainSnapshotDiscardAll)
(qemuDomainRemoveInactive): New prototypes.
(struct qemu_snap_remove): New struct.
* src/qemu/qemu_domain.c (qemuDomainRemoveInactive)
(qemuDomainSnapshotDiscardAllMetadata): New functions.
(qemuFindQemuImgBinary, qemuDomainSnapshotWriteMetadata)
(qemuDomainSnapshotForEachQcow2, qemuDomainSnapshotDiscard)
(qemuDomainSnapshotDiscardAll): Move here...
* src/qemu/qemu_driver.c (qemuFindQemuImgBinary)
(qemuDomainSnapshotWriteMetadata, qemuDomainSnapshotForEachQcow2)
(qemuDomainSnapshotDiscard, qemuDomainSnapshotDiscardAll): ...from
here.
(qemuDomainUndefineFlags): Update caller.
* src/conf/domain_conf.c (virDomainRemoveInactive): Doc fixes.
* src/qemu/qemu_process.c: If the true branch of
'if (qemuDomainObjEndJob(driver, obj) == 0)' is taken, then 'obj' is NULL
and virDomainObjIsActive(obj) and virDomainObjUnref(obj) will
dereference a NULL pointer.
Signed-off-by: Alex Jia <ajia@redhat.com>
Once virDomainReboot is called for a domain, a guest-OS-initiated shutdown
would always result in a reboot instead of a shutdown. Only
virDomainShutdown would actually shut such a domain down. That's because
we forgot to reset the fakeReboot flag once we asked the domain to reboot.
Qemu sends a STOP event as part of the shutdown process. Detect such a STOP
event and consider shutdown to be the reason for emitting it. That's
the best we can do until qemu provides us the reason directly in the STOP
event. This allows us to report a shutdown reason for the paused state so that
apps can detect domains that failed to finish the shutdown process
(e.g., because qemu is buggy and doesn't exit on SIGTERM or it is
blocked in flushing disk buffers).
Ever since we introduced fake reboot, we call qemuProcessKill as a
reaction to SHUTDOWN event. Unfortunately, qemu doesn't guarantee it
flushed all internal buffers before sending SHUTDOWN, in which case
killing the process forcibly may result in (virtual) disk corruption.
By sending just SIGTERM without SIGKILL we give qemu time to flush
all buffers and exit. Once qemu exits, we will see an EOF on the monitor
connection and tear down the domain. In case qemu ignores SIGTERM or
just hangs there, the process stays running but that's not any different
from a possible hang anytime during the shutdown process so I think it's
just fine.
Also qemu (since 0.14 until it's fixed) has a bug in SIGTERM processing
which causes it not to exit but instead send new SHUTDOWN event and keep
waiting. I think the best we can do is to ignore duplicate SHUTDOWN
events to avoid a SHUTDOWN-SIGTERM loop and leave the domain in paused
state.
When a domain is rebooted using libvirt API, we use fake reboot
consisting of shutting down and resetting the domain. Thus we see a
SHUTDOWN event and set the gotShutdown flag. But we never reset it, and
if the domain crashes after it was rebooted this way, we consider it was
a normal shutdown and not a crash.
Commit 4454a9efc728b91e791b1f14c26ea23a19d57f48 changed shutoff reason
from VIR_DOMAIN_SHUTOFF_CRASHED to VIR_DOMAIN_SHUTOFF_FAILED in case we
see an unexpected EOF on monitor connection. But FAILED reason is
dedicated for domains that fail to start. CRASHED reason is the right
one to use in this situation.
This patch fixes the bug shown in bugzilla 738778. It's not an nwfilter problem but a connection sharing / closure issue.
https://bugzilla.redhat.com/show_bug.cgi?id=738778
Depending on the speed and number of CPUs of the machine you are using, you may not see this bug every time.
This patch enables modifying network device configuration using the
virUpdateDeviceFlags API method. Matching of devices is accomplished
using MAC addresses.
While updating the live configuration of a running domain, the user is
only allowed to change the link state of the interface. Additional
modifications may be added later. For now the code checks for
unsupported changes and thereafter changes the link state, if
applicable.
When updating the persistent configuration of a guest's network interface,
the whole configuration (except for the MAC address) may be modified and
is stored for the next startup.
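For example, flipping the link state of an existing interface on a running
domain can be done by passing an updated description like this (illustrative
snippet) to the device-update API with the live flag:

    <interface type='bridge'>
      <mac address='52:54:00:d0:3f:f2'/>
      <source bridge='ovsbr'/>
      <link state='down'/>
    </interface>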
* src/qemu/qemu_driver.c - Add dispatching of virUpdateDevice for
network devices update (live/config)
* src/qemu/qemu_hotplug.c - Add setting of the initial link state on live
  device addition.
  - Add a function to change network device
  configuration. For now it supports only
  changing the link state.
* src/qemu/qemu_hotplug.h - Headers for the above functions
* src/qemu/qemu_process.c - Set link states before virtual machine
  start; qemu does not support setting
  this on the command line.
If the libvirt daemon gets restarted and there is (at least) one
unresponsive qemu, the startup procedure hangs. This patch creates
one thread per VM in which we try to reconnect to the monitor. Therefore,
blocking in one thread will not affect other APIs.
This patch annotates APIs with low or high priority.
The low-priority set MUST contain all APIs which might eventually access
the monitor (and thus block indefinitely). Other APIs may be marked as
high priority. However, some must be (e.g. domainDestroy).
For high priority calls (HPCs), some high priority workers
(HPWs) are created in the pool. An HPW can execute only HPCs, although a
normal worker can process any call regardless of priority. Therefore, only
those APIs which are guaranteed to finish in a reasonably small amount
of time can be marked as HPC.
The size of this HPW pool is static, because HPCs are expected to finish
quickly, so jobs assigned to this pool will be served quickly.
It can be configured in libvirtd.conf via the prio_workers variable.
The default is 5.
To mark an API with low or high priority, append priority:{low|high} to
its comment in src/remote/remote_protocol.x. This is similar to
autogen|skipgen. If not marked, the generator assumes low as the default.
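For example, to enlarge the pool of high priority workers (the value below
is only illustrative; 5 is the default):

    # /etc/libvirt/libvirtd.conf
    prio_workers = 10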
I got confused when 'virsh domblkinfo dom disk' required the
path to a disk (which can be ambiguous, since a single file
can back multiple disks), rather than the unambiguous target
device name that I was using in disk snapshots. So, in true
developer fashion, I went for the best of both worlds - all
interfaces that operate on a disk (aka block) now accept
either the target name or the unambiguous path to the backing
file used by the disk.
* src/conf/domain_conf.h (virDomainDiskIndexByName): Add
parameter.
(virDomainDiskPathByName): New prototype.
* src/libvirt_private.syms (domain_conf.h): Export it.
* src/conf/domain_conf.c (virDomainDiskIndexByName): Also allow
searching by path, and decide whether ambiguity is okay.
(virDomainDiskPathByName): New function.
(virDomainDiskRemoveByName, virDomainSnapshotAlignDisks): Update
callers.
* src/qemu/qemu_driver.c (qemudDomainBlockPeek)
(qemuDomainAttachDeviceConfig, qemuDomainUpdateDeviceConfig)
(qemuDomainGetBlockInfo, qemuDiskPathToAlias): Likewise.
* src/qemu/qemu_process.c (qemuProcessFindDomainDiskByPath):
Likewise.
* src/libxl/libxl_driver.c (libxlDomainAttachDeviceDiskLive)
(libxlDomainDetachDeviceDiskLive, libxlDomainAttachDeviceConfig)
(libxlDomainUpdateDeviceConfig): Likewise.
* src/uml/uml_driver.c (umlDomainBlockPeek): Likewise.
* src/xen/xend_internal.c (xenDaemonDomainBlockPeek): Likewise.
* docs/formatsnapshot.html.in: Update documentation.
* tools/virsh.pod (domblkstat, domblkinfo): Likewise.
* docs/schemas/domaincommon.rng (diskTarget): Tighten pattern on
disk targets.
* docs/schemas/domainsnapshot.rng (disksnapshot): Update to match.
* tests/domainsnapshotxml2xmlin/disk_snapshot.xml: Update test.
It is not possible to change the label of a TCP socket once it
has been opened. When creating a TCP socket, care must be taken
to ensure the socket creation label is set and then cleared.
Remove the bogus call to virSecurityManagerSetProcessFDLabel
from the lock driver guest setup code and instead make use of
virSecurityManagerSetSocketLabel.