For the PPC64 pseries machine type we need to add address information
for the spapr-vty device.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
When parsing ppc64 models on an x86 host an out-of-memory error message is displayed due
to it checking for retcpus being NULL. Fix this by removing the check whether retcpus is NULL
since we will realloc into this variable.
Also in the X86 model parser display the OOM error at the location where it happens.
A preparatory patch for DHCP snooping where we want to be able to
differentiate between a VM's interface using the tuple of
<VM UUID, Interface MAC address>. We assume that MAC addresses could
possibly be re-used between different networks (VLANs) thus do not only
want to rely on the MAC address to identify an interface.
At the current 'final destination' in virNWFilterInstantiate I am leaving
the vmuuid parameter as ATTRIBUTE_UNUSED until the DHCP snooping patches arrive.
(we may not post the DHCP snooping patches for 0.9.9, though)
Mostly this is a pretty trivial patch. On the lowest layers, in lxc_driver
and uml_conf, I am passing the virDomainDefPtr around until I am passing
only the VM's uuid into the NWFilter calls.
This patch cleans up return codes in the nwfilter subsystem.
Some functions in nwfilter_conf.c (validators and formatters) are
keeping their bool return for now and I am converting their return
code to true/false.
All other functions now have failure return codes of -1 and success
of 0.
[I searched for all occurences of ' 1;' and checked all 'if ' and
adapted where needed. After that I did a grep for 'NWFilter' in the source
tree.]
assumptions from generic code.
This implements the minimal set of changes needed in libvirt to launch a
PowerPC-KVM based guest.
It removes x86-specific assumptions about choice of serial driver backend
from generic qemu guest commandline generation code.
It also restricts the ACPI capability to be available for an x86 or
x86_64 domain.
This is not a complete solution -- it still does not guarantee libvirt
the capability to flag non-supported options in guest XML. (Eg, an ACPI
specification in a PowerPC guest XML will still get processed, even
though qemu-system-ppc64 does not support it while qemu-system-x86_64 does.)
This drawback exists because libvirt falls back on qemu to query supported
features, and qemu '-h' blindly lists all capabilities -- irrespective
of whether they are available while emulating a given architecture or not.
The long-term solution would be for qemu to list out capabilities based
on architecture and platform -- so that libvirt can cleanly make out what
devices are supported on an arch (say 'ppc64') and platform (say, 'mac99').
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
This enables libvirt to select the correct qemu binary (qemu-system-ppc64)
for a guest vm based on arch 'ppc64'.
Also, libvirt is enabled to correctly parse the list of supported PowerPC
CPUs, generated by running 'qemu-system-ppc64 -cpu ?'
Signed-off-by: Prerna Saxena <prerna@linux.vnet.ibm.com>
Acked-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
With security_driver set to "none" in /etc/libvirt/qemu.conf,
libvirtd would crash when attempted to attach to an existing
qemu process. Only copy the security model if it actually exists.
During virDomainDestroy, QEMU may emit SHUTDOWN event as a response to
SIGTERM and since domain object is still locked, the event is processed
after the domain is destroyed. We need to ignore this event in such case
to avoid changing domain state from shutoff to shutdown.
When QEMU guest finishes its shutdown sequence, qemu stops virtual CPUs
and when started with -no-shutdown waits for us to kill it using
SGITERM. Since QEMU is flushing its internal buffers, some time may pass
before QEMU actually dies. We mistakenly used "paused" state (and
events) for this which is quite confusing since users may see a domain
going to pause while they expect it to shutdown. Since we already have
"shutdown" state with "the domain is being shut down" semantics, we
should use it for this state.
However, the state didn't have a corresponding event so I created one
and called its detail as VIR_DOMAIN_EVENT_SHUTDOWN_FINISHED (guest OS
finished its shutdown sequence) with the intent to add
VIR_DOMAIN_EVENT_SHUTDOWN_STARTED in the future if we have a
sufficiently capable guest agent that can notify us when guest OS starts
to shutdown.
Fix a logic error, the initial value of ret = -1, if just set --config,
it will goto endjob directly without doing its really job here.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
filter 0-device-weight when:
- getting blkio parameters with --config
- starting up a domain
When testing with blkio, I found these issues:
(dom is down)
virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,500
virsh blkiotune dom --device-weights /dev/sda,300,/dev/sdb,0
virsh blkiotune dom
weight : 800
device_weight : /dev/sda,200,/dev/sdb,0
# issue 1: shows 0 device weight of /dev/sdb that may confuse user
(continued)
virsh start dom
# issue 2: If /dev/sdb doesn't exist, libvirt refuses to bring the
# dom up because it wants to set the device weight to 0 of a
# non-existing device. Since 0 means no weight-limit, we really don't
# have to set it.
Prior to this patch, for a running dom, the commands:
$ virsh blkiotune dom --device-weights /dev/sda,502,/dev/sdb,498
$ virsh blkiotune dom --device-weights /dev/sda,503
$ virsh blkiotune dom
weight : 500
device_weight : /dev/sda,503
claim that /dev/sdb no longer has a non-default weight, but
directly querying cgroups says otherwise:
$ cat /cgroup/blkio/libvirt/qemu/dom/blkio.weight_device
8:0 503
8:16 498
After this patch, an explicit 0 is required to remove a device path
from the XML, and omitting a device path that was previously
specified leaves that device path untouched in the XML, to match
cgroups behavior.
* src/qemu/qemu_driver.c (parseBlkioWeightDeviceStr): Rename...
(qemuDomainParseDeviceWeightStr): ...and use correct type.
(qemuDomainSetBlkioParameters): After parsing string, modify
rather than replacing existing table.
* tools/virsh.pod (blkiotune): Tweak wording.
Implement the block I/O throttle setting and getting support to qemu
driver.
Signed-off-by: Lei Li <lilei@linux.vnet.ibm.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The virTimestamp and virTimeMs functions in src/util/util.h
duplicate functionality from virtime.h, in a non-async signal
safe manner. Remove them, and convert all code over to the new
APIs.
* src/util/util.c, src/util/util.h: Delete virTimeMs and virTimestamp
* src/lxc/lxc_driver.c, src/qemu/qemu_domain.c,
src/qemu/qemu_driver.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c, src/util/event_poll.c: Convert to use
virtime APIs
If we ensure that virNodeSuspendGetTargetMask always resets
*bitmask to zero upon failure, there is no need for the
powerMgmt_valid field.
* src/util/virnodesuspend.c: Ensure *bitmask is zero upon
failure
* src/conf/capabilities.c, src/conf/capabilities.h: Remove
powerMgmt_valid field
* src/qemu/qemu_capabilities.c: Remove powerMgmt_valid
The node suspend capabilities APIs should not have been put into
util.[ch]. Instead move them into virnodesuspend.[ch]
* src/util/util.c, src/util/util.h: Remove suspend capabilities APIs
* src/util/virnodesuspend.c, src/util/virnodesuspend.h: Add
suspend capabilities APIs
* src/qemu/qemu_capabilities.c: Include virnodesuspend.h
Rename virGetPMCapabilities to virNodeSuspendGetTargetMask and
virDiscoverHostPMFeature to virNodeSuspendSupportsTarget.
* src/util/util.c, src/util/util.h: Rename APIs
* src/qemu/qemu_capabilities.c, src/util/virnodesuspend.c: Adjust
for new names
Without this, 'virsh blkiotune --live --config --weight=n'
only affected live.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Allow
setting both configurations at once.
After the previous patch, there are now some redundant checks.
* src/qemu/qemu_driver.c (qemudDomainGetVcpuPinInfo)
(qemuGetSchedulerParametersFlags): Drop checks now guaranteed by
libvirt.c.
* src/lxc/lxc_driver.c (lxcGetSchedulerParametersFlags):
Likewise.
It requires the domain is running, otherwise fails. Resize to a lower
size is supported, but should be used with extreme caution.
In order to prohibit the "size" overflowing after multiplied by
1024. We do checking in the codes. For QMP mode, the default units
is Bytes, the passed size needs to be multiplied by 1024, however,
for HMP mode, the default units is "Megabytes", the passed "size"
needs to be divided by 1024 then.
Implements functions for both HMP and QMP mode.
For HMP mode, qemu uses "M" as the units by default, so the passed "sized"
is divided by 1024.
For QMP mode, qemu uses "Bytes" as the units by default, the passed "sized"
is multiplied by 1024.
All of the monitor functions return -1 on failure, 0 on success, or -2 if
not supported.
Add the core functions that implement the functionality of the API.
Suspend is done by using an asynchronous mechanism so that we can return
the status to the caller before the host gets suspended. This asynchronous
operation is achieved by suspending the host in a separate thread of
execution. However, returning the status to the caller is only best-effort,
but not guaranteed.
To resume the host, an RTC alarm is set up (based on how long we want to
suspend) before suspending the host. When this alarm fires, the host
gets woken up.
Suspend-to-RAM operation on a host running Linux can take upto more than 20
seconds, depending on the load of the system. (Freezing of tasks, an operation
preceding any suspend operation, is given up after a 20 second timeout).
And Suspend-to-Disk can take even more time, considering the time required
for compaction, creating the memory image and writing it to disk etc.
So, we do not allow the user to specify a suspend duration of less than 60
seconds, to be on the safer side, since we don't want to prematurely declare
failure when we only had to wait for some more time.
If a connection to destination host is lost during peer-to-peer
migration (because keepalive protocol timed out), we won't be able to
finish the migration and it doesn't make sense to wait for qemu to
transmit all data. This patch automatically cancels such migration
without waiting for virDomainAbortJob to be called.
If something fails while initializing qemu job object in
qemuDomainObjPrivateAlloc(), memory to the private pointer is freed, but
after that, the pointer is still dereferenced, which may result in a
segfault.
* qemuDomainObjPrivateAlloc() - Don't dereference NULL pointer.
Generally, functions which return malloc'd strings should be typed
as 'char *', not 'const char *', to make it obvious that the caller
is responsible to free things. free(const char *) fails to compile,
and although we have a cast embedded in VIR_FREE to work around poor
code that frees const char *, it's better to not rely on that hack.
* src/qemu/qemu_driver.c (qemuDiskPathToAlias): Change return type.
(qemuDomainBlockJobImpl): Update caller.
Commit 89b6284f made it possible to pass either a source name or
the target device to most API demanding a disk designation, but
forgot to update the documentation. It also failed to update
virDomainBlockStats to take both forms. This patch fixes both the
documentation and the remaining function.
Xen continues to use just device shorthand (that is, I did not
implement path lookup there, since xen does not track a domain_conf
to quickly tie a path back to the device shorthand).
* src/libvirt.c (virDomainBlockStats, virDomainBlockStatsFlags)
(virDomainGetBlockInfo, virDomainBlockPeek)
(virDomainBlockJobAbort, virDomainGetBlockJobInfo)
(virDomainBlockJobSetSpeed, virDomainBlockPull): Document
acceptable disk naming conventions.
* src/qemu/qemu_driver.c (qemuDomainBlockStats)
(qemuDomainBlockStatsFlags): Allow lookup by source name.
* src/test/test_driver.c (testDomainBlockStats): Likewise.
This patch exports KVM Host Power Management capabilities as XML so that
higher-level systems management software can make use of these features
available in the host.
The script "pm-is-supported" (from pm-utils package) is run to discover if
Suspend-to-RAM (S3) or Suspend-to-Disk (S4) is supported by the host.
If either of them are supported, then a new tag "<power_management>" is
introduced in the XML under the <host> tag.
However in case the query to check for power management features succeeded,
but the host does not support any such feature, then the XML will contain
an empty <power_management/> tag. In the event that the PM query itself
failed, the XML will not contain any "power_management" tag.
To use this, new APIs could be implemented in libvirt to exploit power
management features such as S3/S4.
For direct attach devices, in qemuBuildCommandLine, we seem to be freeing
actual device on error path (with networkReleaseActualDevice). But the actual
device is not deleted.
qemuProcessStop eventually deletes the direct attach device and releases
actual device. But by the time qemuProcessStop is called qemuBuildCommandLine
has already freed actual device, leaving stray macvtap devices behind on error.
So the simplest fix is to remove the networkReleaseActualDevice in
qemuBuildCommandLine. This patch does just that.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Now, when we support multiple consoles per domain,
the vm->def->console[0] can still remain an alias
for vm->def->serial[0]; However, we need to copy
it's source definition as well otherwise we'll regress
on virDomainOpenConsole.
This prepares for subsequent patches which introduce dependence
on cgroup cpuset. Enable cgroup cpuset by default so users don't
have to modify configuration file before encountering a cpuset
error.
Update virNetDevMacVLanCreateWithVPortProfile to allow creation
of plain macvlan devices, as well as macvtap devices. The former
is useful for LXC containers
* src/qemu/qemu_command.c: Explicitly request a macvtap device
* src/util/virnetdevmacvlan.c, src/util/virnetdevmacvlan.h: Add
new flag to allow switching between macvlan and macvtap
creation
Rename virNetDevMacVLanCreate to virNetDevMacVLanCreateWithVPortProfile
and virNetDevMacVLanDelete to virNetDevMacVLanDeleteWithVPortProfile
To make way for renaming the other macvlan creation APIs in
interface.c
* util/virnetdevmacvlan.c, util/virnetdevmacvlan.h,
qemu/qemu_command.c, qemu/qemu_hotplug.c, qemu/qemu_process.c:
Rename APIs
Rename the macvtap.c file to virnetdevmacvlan.c to reflect its
functionality. Move the port profile association code out into
virnetdevvportprofile.c. Make the APIs available unconditionally
to callers
* src/util/macvtap.h: rename to src/util/virnetdevmacvlan.h,
* src/util/macvtap.c: rename to src/util/virnetdevmacvlan.c
* src/util/virnetdevvportprofile.c, src/util/virnetdevvportprofile.h:
Pull in vport association code
* src/Makefile.am, src/conf/domain_conf.h, src/qemu/qemu_conf.c,
src/qemu/qemu_conf.h, src/qemu/qemu_driver.c: Update include
paths & remove conditional compilation
In preparation for code re-organization, rename the Macvtap
management APIs to have the following patterns
virNetDevMacVLanXXXXX - macvlan/macvtap interface management
virNetDevVPortProfileXXXX - virtual port profile management
* src/util/macvtap.c, src/util/macvtap.h: Rename APIs
* src/conf/domain_conf.c, src/network/bridge_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_command.h,
src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/qemu/qemu_process.h: Update for renamed APIs
Add routines to generate -numa QEMU command line option based on
<numa> ... </numa> XML specifications.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
This improves the support for qemu rbd devices by adding support for a few
key features (e.g., authentication) and cleaning up the way in which
rbd configuration options are passed to qemu.
An <auth> member of the disk source xml specifies how librbd should
authenticate. The username attribute is the Ceph/RBD user to authenticate as.
The usage or uuid attributes specify which secret to use. Usage is an
arbitrary identifier local to libvirt.
The old RBD support relied on setting an environment variable to
communicate information to qemu/librbd. Instead, pass those options
explicitly to qemu. Update the qemu argument parsing and tests
accordingly.
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
The src/util/network.c file is a dumping ground for many different
APIs. Split it up into 5 pieces, along functional lines
- src/util/virnetdevbandwidth.c: virNetDevBandwidth type & helper APIs
- src/util/virnetdevvportprofile.c: virNetDevVPortProfile type & helper APIs
- src/util/virsocketaddr.c: virSocketAddr and APIs
- src/conf/netdev_bandwidth_conf.c: XML parsing / formatting
for virNetDevBandwidth
- src/conf/netdev_vport_profile_conf.c: XML parsing / formatting
for virNetDevVPortProfile
* src/util/network.c, src/util/network.h: Split into 5 pieces
* src/conf/netdev_bandwidth_conf.c, src/conf/netdev_bandwidth_conf.h,
src/conf/netdev_vport_profile_conf.c, src/conf/netdev_vport_profile_conf.h,
src/util/virnetdevbandwidth.c, src/util/virnetdevbandwidth.h,
src/util/virnetdevvportprofile.c, src/util/virnetdevvportprofile.h,
src/util/virsocketaddr.c, src/util/virsocketaddr.h: New pieces
* daemon/libvirtd.h, daemon/remote.c, src/conf/domain_conf.c,
src/conf/domain_conf.h, src/conf/network_conf.c,
src/conf/network_conf.h, src/conf/nwfilter_conf.h,
src/esx/esx_util.h, src/network/bridge_driver.c,
src/qemu/qemu_conf.c, src/rpc/virnetsocket.c,
src/rpc/virnetsocket.h, src/util/dnsmasq.h, src/util/interface.h,
src/util/iptables.h, src/util/macvtap.c, src/util/macvtap.h,
src/util/virnetdev.h, src/util/virnetdevtap.c,
tools/virsh.c: Update include files
Rename the virVirtualPortProfileParams struct to be
virNetDevVPortProfile, and rename the APIs to match
this prefix.
* src/util/network.c, src/util/network.h: Rename port profile
APIs
* src/conf/domain_conf.c, src/conf/domain_conf.h,
src/conf/network_conf.c, src/conf/network_conf.h,
src/network/bridge_driver.c, src/qemu/qemu_hotplug.c,
src/util/macvtap.c, src/util/macvtap.h: Update for
renamed APIs/structs
Hi
Commit c31d23a787 removed the "conn"
parameter from qemuPhysIfaceConnect(), but it's still used if
WITH_MACVTAP is false. Also, it's still mentioned in the comment
above the function:
/**
* qemuPhysIfaceConnect:
* @def: the definition of the VM (needed by 802.1Qbh and audit)
* @conn: pointer to virConnect object
* @driver: pointer to the qemud_driver
* @net: pointer to he VM's interface description with direct device type
* @qemuCaps: flags for qemu
*
* Returns a filedescriptor on success or -1 in case of error.
*/
int
qemuPhysIfaceConnect(virDomainDefPtr def,
struct qemud_driver *driver,
virDomainNetDefPtr net,
virBitmapPtr qemuCaps,
enum virVMOperationType vmop)
{
int rc;
#if WITH_MACVTAP
[...]
#else
(void)def;
(void)conn;
(void)net;
(void)qemuCaps;
(void)driver;
(void)vmop;
qemuReportError(VIR_ERR_INTERNAL_ERROR,
"%s", _("No support for macvtap device"));
rc = -1;
#endif
return rc;
}
--
Michael Wood <esiotrot@gmail.com>
From f4fc43b4111a4c099395c55902e497b8965e2b53 Mon Sep 17 00:00:00 2001
From: Michael Wood <esiotrot@gmail.com>
Date: Sat, 12 Nov 2011 13:37:53 +0200
Subject: [PATCH] Fix build without MACVTAP.
Qemu will be the first driver to make use of a typed string in the
next round of additions. Separate out the trivial addition.
* src/qemu/qemu_driver.c (qemudSupportsFeature): Advertise feature.
(qemuDomainGetBlkioParameters, qemuDomainGetMemoryParameters)
(qemuGetSchedulerParametersFlags, qemudDomainBlockStatsFlags):
Allow typed strings flag where trivially supported.
This reverts commit ef1065cf5ac; see also this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=751900
In qemu 0.15.1 and earlier, during migration to file, the
qemu_savevm_state_begin and qemu_savevm_state_iterate methods
will both process as much migration data as possible until either
1. The file descriptor returns EAGAIN
2. The bandwidth rate limit is reached
If we set the rate limit to ULONG_MAX, test 2 never becomes true. We're
passing a plain file descriptor to QEMU and POSIX does not support EAGAIN on
regular files / block devices, so test 1 never becomes true either.
In the 'virsh save --bypass-cache' case, we pass a pipe instead of a
regular fd, but using a pipe adds I/O overhead, so always passing a
pipe just so qemu can see EAGAIN doesn't seem nice.
The ultimate fix needs to come from qemu - background migration must
respect asynchronous abort requests, or else periodically return
control to the main handling loop without an EAGAIN and without
waiting to hit an insanely large amount of data. But until a
version of qemu is fixed to support "unlimited" data rates while
still allowing cancellation, the best we can do is avoid the
automatic use of unlimited rates from within libvirt (users can
still explicitly change the migration rates, if they are aware that
they are giving up the ability to cancel a job).
Reverting the lone use of QEMU_DOMAIN_FILE_MIG_BANDWIDTH_MAX is
the simplest patch; this slows migration back down to a default
32M/sec cap, but also ensures that the main qemu processing loop
will still be responsive to cancellation requests. Hopefully
upstream qemu will provide us a means of safely using unlimited
speed, including a runtime probe of that capability.
* src/qemu/qemu_migration.c (qemuMigrationToFile): Revert attempt
to use unlimited migration bandwidth when migrating to file.
Signed-off-by: Daniel Veillard <veillard@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>