An upcoming patch has a use for a tap device to be created that
doesn't need to be actually put into the "up" state, and keeping it
"down" keeps the output of ifconfig from being unnecessarily cluttered
(ifconfig won't show down interfaces unless you add "-a").
bridge.[ch]: add "up" as an arg to brAddTap()
uml_conf.c, qemu_command.c: add "up" (set to "true") to brAddTap() call.
This is in response to:
https://bugzilla.redhat.com/show_bug.cgi?id=629662
Explanation
qemu's virtio-net-pci driver allows setting the algorithm used for tx
packets to either "bh" or "timer". This is done by adding ",tx=bh" or
",tx=timer" to the "-device virtio-net-pci" commandline option.
'bh' stands for 'bottom half'; when this is set, packet tx is all done
in an iothread in the bottom half of the driver. (In libvirt, this
option is called the more descriptive "iothread".)
'timer' means that tx work is done in qemu, and if there is more tx
data than can be sent at the present time, a timer is set before qemu
moves on to do other things; when the timer fires, another attempt is
made to send more data. (libvirt retains the name "timer" for this
option.)
The resulting difference, according to the qemu developer who added
the option is:
bh makes tx more asynchronous and reduces latency, but potentially
causes more processor bandwidth contention since the cpu doing the
tx isn't necessarily the cpu where the guest generated the
packets.
Solution
This patch provides a libvirt domain xml knob to change the option on
the qemu commandline, by adding a new attribute "txmode" to the
<driver> element that can be placed inside any <interface> element in
a domain definition. It's use would be something like this:
<interface ...>
...
<model type='virtio'/>
<driver txmode='iothread'/>
...
</interface>
I chose to put this setting as an attribute to <driver> rather than as
a sub-element to <tune> because it is specific to the virtio-net
driver, not something that is generally usable by all network drivers.
(note that this is the same placement as the "driver name=..."
attribute used to choose kernel vs. userland backend for the
virtio-net driver.)
Actually adding the tx=xxx option to the qemu commandline is only done
if the version of qemu being used advertises it in the output of
qemu -device virtio-net-pci,?
If a particular txmode is requested in the XML, and the option isn't
listed in that help output, an UNSUPPORTED_CONFIG error is logged, and
the domain fails to start.
When the <driver> element (and its "name" attribute) was added to the
domain XML's interface element, a "backend" enum was simply added to
the toplevel of the virDomainNetDef struct.
Ignoring the naming inconsistency ("name" vs. "backend"), this is fine
when there's only a single item contained in the driver element of the
XML, but doesn't scale well as we add more attributes that apply to
the backend of the virtio-net driver, or add attributes applicable to
other drivers.
This patch changes virDomainNetDef in two ways:
1) Rename the item in the struct from "backend" to "name", so that
it's the same in the XML and in the struct, hopefully avoiding
confusion for someone unfamiliar with the function of the
attribute.
2) Create a "driver" union within virDomainNetDef, and a "virtio"
struct in that struct, which contains the "name" enum value.
3) Move around the virDomainNetParse and virDomainNetFormat functions
to allow for simple plugin of new attributes without disturbing
existing code. (you'll note that this results in a seemingly
redundant if() in the format function, but that will no longer be
the case as soon as a 2nd attribute is added).
In the future, new attributes for the virtio driver backend can be
added to the "virtio" struct, and any other network device backend that
needs an attribute will have its own struct added to the "driver"
union.
The introduction of the v3 migration protocol, along with
support for migration cookies, will significantly expand
the size of the migration code. Move it all to a separate
file to make it more manageable
The functions are not moved 100%. The API entry points
remain in the main QEMU driver, but once the public
virDomainPtr is resolved to the internal virDomainObjPtr,
all following code is moved.
This will allow the new v3 API entry points to call into the
same shared internal migration functions
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add
qemuDomainFormatXML helper method
* src/qemu/qemu_driver.c: Remove all migration code
* src/qemu/qemu_migration.c, src/qemu/qemu_migration.h: Add
all migration code.
Move the qemudStartVMDaemon and qemudShutdownVMDaemon
methods into a separate file, renaming them to
qemuProcessStart, qemuProcessStop. All helper methods
called by these are also moved & renamed to match
* src/Makefile.am: Add qemu_process.c/.h
* src/qemu/qemu_command.c: Add qemuDomainAssignPCIAddresses
* src/qemu/qemu_command.h: Add VNC port min/max
* src/qemu/qemu_domain.c, src/qemu/qemu_domain.h: Add
domain event queue helpers
* src/qemu/qemu_driver.c, src/qemu/qemu_driver.h: Remove
all QEMU process startup/shutdown functions
* src/qemu/qemu_process.c, src/qemu/qemu_process.h: Add
all QEMU process startup/shutdown functions
The name convention of device mapper disk is different, and 'parted'
can't be used to delete a device mapper disk partition. e.g.
Name Path
-----------------------------------------
3600a0b80005ad1d7000093604cae912fp1 /dev/mapper/3600a0b80005ad1d7000093604cae912fp1
Error: Expecting a partition number.
This patch introduces 'dmsetup' to fix it.
Changes:
- New function "virIsDevMapperDevice" in "src/utils/utils.c"
- remove "is_dm_device" in "src/storage/parthelper.c", use
"virIsDevMapperDevice" instead.
- Requires "device-mapper" for 'with-storage-disk" in "libvirt.spec.in"
- Check "dmsetup" in 'configure.ac' for "with-storage-disk"
- Changes on "src/Makefile.am" to link against libdevmapper
- New entry for "virIsDevMapperDevice" in "src/libvirt_private.syms"
Changes from v1 to v3:
- s/virIsDeviceMapperDevice/virIsDevMapperDevice/g
- replace "virRun" with "virCommand"
- sort the list of util functions in "libvirt_private.syms"
- ATTRIBUTE_NONNULL(1) for virIsDevMapperDevice declaration.
e.g.
Name Path
-----------------------------------------
3600a0b80005ad1d7000093604cae912fp1 /dev/mapper/3600a0b80005ad1d7000093604cae912fp1
Vol /dev/mapper/3600a0b80005ad1d7000093604cae912fp1 deleted
Name Path
-----------------------------------------
"qemudDomainSaveFlag" goto wrong label "endjob", which will cause
error when security manager trying to restore label (regression).
As it's more reasonable to check if vm is shutoff immediately, and
return right away if it is, remove the checking in "qemudDomainSaveFlag",
and add checking in "qemudDomainSave".
* src/qemu/qemu_driver.c
* src/util/cgroup.c (virCgroupSetValueStr, virCgroupGetValueStr)
(virCgroupRemoveRecursively): VIR_DEBUG can clobber errno.
(virCgroupRemove): Use VIR_DEBUG rather than DEBUG.
The code expected that host CPU architecture matches the architecture on
which libvirt runs. This is normally true but not in tests, where host
CPU is faked to produce consistent results.
clang had 5 reports against virCommand; three were false positives
(a NULL deref in ProcessIO solved by sa_assert, and two uninitialized
memory operations solved by adding an initializer), but two were real.
* src/util/command.c (virCommandProcessIO): Fix real bug of
possible NULL dereference. Teach clang that buf is never NULL.
(virCommandRun): Teach clang that infd is only ever accessed when
initialized.
The processWatchdogEvent fix is real, although it can only trigger
on OOM, since bad things happen if doCoreDump is called with a NULL
pathname argument. The other fixes silence clang, but aren't a real
bug because virReportErrorHelper tolerates a NULL format string even
though *printf does not.
* src/qemu/qemu_driver.c (processWatchdogEvent): Exit on OOM.
(qemuDomainIsActive, qemuDomainIsPersistent, qemuDomainIsUpdated):
Provide valid message.
The SCSI storage backend leaks a string containing the pathname
for each block device it discovers
* src/storage/storage_backend_scsi.c: Free the device name
When creating the virDomain::snapshots hash table, virGetDomain
wasn't checking if the creation was successful. This would then
lead to failures in the vir*DomainSnapshot functions. Better to
report this error early and make virGetDomain fail if the
snapshots hash couldn't be created.
* src/datatypes.c: report failure to make a hash table
A couple of allocation were not calling virReportOOMError on allocation
errors
* src/util/hash.c: add the needed call in virHashCreate and
virHashAddOrUpdateEntry
"virStorageBackendCreateVols":
"names->next" serves as condition expression for "do...while",
however, "names" was shifted before, it then results in one less
loop, and thus, one less volume will be created for mpath pool,
the patch is to fix it.
* src/storage/storage_backend_mpath.c
clang complained that STREQ(group->controllers[i].mountPoint,...) was
a NULL dereference when i==VIR_CGROUP_CONTROLLER_CPUSET, because it
assumes the worst about virCgroupPathOfController. Marking the
argument const doesn't yet have an effect, per this clang bug:
http://llvm.org/bugs/show_bug.cgi?id=7758
So, we use sa_assert, which was designed to shut up false positives
from tools like clang.
* src/util/cgroup.c (virCgroupMakeGroup): Teach clang that there
is no NULL dereference.
This patch reorders the connlimit and comment match extensions relative to the state match (-m state); connlimit being most useful if found after a -m state --state NEW and not before it.
When formatting XML for smartcard device with mode=host, libvirt
generates invalid XML if the device has address info associated:
<smartcard mode='host' <address type='ccid' controller='0' slot='1'/>
Commit 9962e406c664ed5521f5aca500c860a331cb3979 introduced a
problem where if the VM failed to startup, it would not be
correctly cleaned up. Amongst other things the SELinux
security label would not be removed, which prevents the VM
from ever starting again.
The virDomainIsActive() check at the start of qemudShutdownVMDaemon
checks for vm->def->id not being -1. By moving the assignment of the
VM id to the start of qemudStartVMDaemon, we can ensure cleanup will
occur on failure
* src/qemu/qemu_driver.c: Move initialization of 'vm->def->id'
so that qemudShutdownVMDaemon() will process the shutdown
When attaching a device that already exists, xend driver updates
the device with "device_configure", it causes problems (e.g. for
disk device, 'device_configure' only can be used to update device
like CDROM), on the other hand, we provide additional API
(virDomainUpdateDevice) to update device, this fix is to raise up
errors instead of updating the existed device which is not CDROM
device.
Changes from v1 to v2:
- allow to update CDROM
* src/xen/xend_internal.c
Building the 0.8.8 release candidate on cygwin produced this compiler
warning, which is indicative of catastrophic failure on any attempt to
print an error message with errno turned to a string:
CC strerror_r.lo
strerror_r.c: In function 'rpl_strerror_r':
strerror_r.c:67: warning: assignment makes integer from pointer without a cast
This has been fixed in gnulib.
* .gnulib: Update to latest, for strerror_r fix.
* src/util/memory.c (includes): Satisfy 'make syntax-check'.
Suspending a VM which contains shell meta characters doesn't work with
libvirt-0.8.7:
/var/log/libvirt/qemu/andreas_231-ne\ doch\ nicht.log:
sh: -c: line 0: syntax error near unexpected token `doch'
sh: -c: line 0: `cat | { dd bs=4096 seek=1 if=/dev/null && dd bs=1048576; }
Although target="andreas_231-ne doch nicht" contains shell meta
characters (here: blanks), they are not properly escaped by
src/qemu/qemu_monitor_{json,text}.c#qemuMonitor{JSON,Text}MigrateToFile()
First, the filename needs to be properly escaped for the shell, than
this command line has to be properly escaped for qemu again.
For this to work, remove the old qemuMonitorEscapeArg() wrapper, rename
qemuMonitorEscape() to it removing the handling for shell=TRUE, and
implement a new qemuMonitorEscapeShell() returning strings using single
quotes.
Using double quotes or escaping special shell characters with backslashes
would also be possible, but the set of special characters heavily
depends on the concrete shell (dsh, bash, zsh) and its setting (history
expansion, interactive use, ...)
Signed-off-by: Philipp Hahn <hahn@univention.de>
The logging functions are enhanced so that immediately prior to
the first log message being printed to any output channel, the
libvirt package version will be printed.
eg
$ LIBVIRT_DEBUG=1 virsh
18:13:28.013: 17536: info : libvirt version: 0.8.7
18:13:28.013: 17536: debug : virInitialize:361 : register drivers
...
The 'configure' script gains two new arguments which can be
used as
--with-packager="Fedora Project, x86-01.phx2.fedoraproject.org, 01-27-2011-18:00:10"
--with-packager-version="1.fc14"
to allow distros to append a custom string with package specific
data.
The RPM specfile is modified so that it appends the RPM version,
the build host, the build date and the packager name.
eg
$ LIBVIRT_DEBUG=1 virsh
18:14:52.086: 17551: info : libvirt version: 0.8.7, package: 1.fc13 (Fedora Project, x86-01.phx2.fedoraproject.org, 01-27-2011-18:00:10)
18:14:52.086: 17551: debug : virInitialize:361 : register drivers
Thus when distro packagers receive bug reports they can clearly
see what version was in use, even if the bug reporter mistakenly
or intentionally lies about version/builds
* src/util/logging.c: Output version data prior to first log message
* libvirt.spec.in: Include RPM release, date, hostname & packager
* configure.ac: Add --with-packager & --with-packager-version args
QEMUD_CMD_FLAG_PCI_MULTIBUS should be set in the function
qemuCapsExtractVersionInfo()
The flag QEMUD_CMD_FLAG_PCI_MULTIBUS is used in the function
qemuBuildDeviceAddressStr(). All callers get qemuCmdFlags
by the function qemuCapsExtractVersionInfo() except that
testCompareXMLToArgvFiles() in qemuxml2argvtest.c.
So we should set QEMUD_CMD_FLAG_PCI_MULTIBUS in the function
qemuCapsExtractVersionInfo() instead of qemuBuildCommandLine()
because the function qemuBuildCommandLine() does not be called
when we attach a pci device.
tests: set QEMUD_CMD_FLAG_PCI_MULTIBUS in testCompareXMLToArgvFiles()
set QEMUD_CMD_FLAG_PCI_MULTIBUS before calling qemuBuildCommandLine()
as the flags is not set by qemuCapsExtractVersionInfo().
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Quite a few hosts don't have cgroups mounted and so see warnings
from libvirt logged, which then cause bug reports, etc. Reduce
the log level to INFO so they're not visible by default
* src/qemu/qemu_driver.c: Reduce log level for cgroups
When run non-root the nwfilter driver logs error messages about
being unable to find iptables/ebtables commands (they are in
/sbin which isn't in $PATH). The nwfilter driver can't ever work
as non-root, so simply skip it entirely thus avoiding the error
messages
* src/conf/nwfilter_conf.h, src/nwfilter/nwfilter_driver.c,
src/nwfilter/nwfilter_gentech_driver.c,
src/nwfilter/nwfilter_gentech_driver.h: Pass 'bool privileged'
flag down to final driver impl
* src/nwfilter/nwfilter_ebiptables_driver.c: Skip initialization
if not privileged
A typo s/spice/vnc/ caused parsing of the spice 'auth' data
to write into the wrong part of the struct, blowing away
other unrelated data.
* src/conf/domain_conf.c: s/vnc/spice/ in parsing spice auth
Most of te VIR_INFO calls in the udev driver are only relevant
to developers so can switch to VIR_DEBUG. Failure to initialize
libpciaccess though is a fatal error
* src/node_device/node_device_udev.c: Adjust log levels
When probing machine types if the QEMU binary does not exist
we get a hard to diagnose error, due to the execve() in the
child failing
error: internal error Child process exited with status 1.
Add an explicit check so that we get
error: Cannot find QEMU binary /usr/libexec/qem3u-kvm: No such file or directory
* src/qemu/qemu_capabilities.c: Check for QEMU binary
To make it easier to investigate problems with async event
delivery, add two more debugging lines
* daemon/remote.c: Debug when an event is queued for dispatch
* src/remote/remote_driver.c: Debug when an event is received
for processing
When built as modules, the connection drivers live
in $LIBDIR/libvirt/drivers. Now we add lock manager
drivers, we need to distinguish. So move the existing
modules to 'connection-driver'
* src/Makefile.am: Move module install dir
* src/driver.c: Move module search dir
Some functionality run in virExec hooks may do I/O which
can trigger SIGPIPE. Renable SIGPIPE blocking around the
hook function
* src/util/util.c: Block SIGPIPE around hooks
The Linux kernel headers don't have a value for SCSI type 12,
but HAL source code shows this to be a 'raid'. Add workaround
for this type. Lower log level for unknown types since
this is not a fatal error condition. Include the device sysfs
path in the log output to allow identification of which device
has problems.
* src/node_device/node_device_udev.c: Add SCSI RAID type
libpciaccess has many bugs in its pci_system_init/cleanup
functions that makes calling them multiple times unwise.
eg it will double close() FDs, and leak other FDs.
* src/node_device/node_device_udev.c: Only initialize
libpciaccess once
Until now, user namespaces have not done much, but (for that
reason) have been innocuous to glob in with other CLONE_
flags. Upcoming userns development, however, will make tasks
cloned with CLONE_NEWUSER far more restricted. In particular,
for some time they will be unable to access files with anything
other than the world access perms.
This patch assumes that noone really needs the user namespaces
to be enabled. If that is wrong, then we can try a more
baroque patch where we create a file owned by a test userid with
700 perms and, if we can't access it after setuid'ing to that
userid, then return 0. Otherwise, assume we are using an
older, 'harmless' user namespace implementation.
Comments appreciated. Is it ok to do this?
Signed-off-by: Serge Hallyn <serge.hallyn@canonical.com>