virNetworkAssignDef was allocating a new network object, initing and
grabbing its lock, then potentially freeing it without unlocking or
destroying the lock. In practice 1) this will probably never happen,
and 2) even if it did, the lock implementation used on most (all?)
platforms doesn't actually hold any resources for an initialized or
held lock, but it still bothered me, so I moved the realloc that could
lead to this bad situation earlier in the function, and now the mutex
isn't inited or locked until we are assured of complete success.
These two objects were previously always parsed as a part of an IpDef,
but we will now need to be able to parse them on their own for
virNetworkUpdate(). Split the parsing functions out, with no
functional changes.
Simply returns the object list. Supports to filter the secrets
by its storage location, and whether it's private or not.
src/secret/secret_driver.c: Implement listAllSecrets
src/conf/node_device_conf.h:
* New macro VIR_CONNECT_LIST_NODE_DEVICES_FILTERS_CAP
* Declare virNodeDeviceList
src/conf/node_device_conf.c:
* New helpers virNodeDeviceCapMatch, virNodeDeviceMatch.
virNodeDeviceCapMatch looks up the list of all the caps the device
support, to see if the device support the cap type.
* Implement virNodeDeviceList
src/libvirt_private.syms:
* Export virNodeDeviceList
* Export virNodeDevCapTypeFromString
New options is added to support EOI (End of Interrupt) exposure for
guests. As it makes sense only when APIC is enabled, I added this into
the <apic> element in <features> because this should be tri-state
option (cannot be handled as standalone feature).
The 'def->target.addr' hasn't been initialized in virDomainChrDefNew() and
its value is always '0xffffffff', in addition, the following test scenario
hasn't also include 'address' element in channel XML block, so the branch
'if (addrStr == NULL)' is hit in virDomainChrDefParseTargetXML(), the
programming jumps to 'error' label to release relevant resources, and the
statement 'if (VIR_ALLOC(def->target.addr) < 0)' hasn't been executed then
the virDomainChrDefFree() will free 'def->target.addr'(0xffffffff) via
VIR_FREE(), which results in libvirt crash, to use valgrind can also
find a 'Invalid free() / delete / delete[]' error. This patch just adjusts
codes order to initialize 'def->target.addr' firstly.
With this patch, libvirt hasn't crash and can get a expected error message "
XML error: guestfwd channel does not define a target address".
How to reproduce?
1. define a guest with the following channel XML configuration
$ cat foo.xml
<snip>
<channel type='pty'>
<target type='guestfwd'/>
</channel>
</snip>
$ virsh define foo.xml
2. actual result
error: Failed to define domain from /tmp/foo.xml
error: End of file while reading data: Input/output error
error: Failed to reconnect to the hypervisor
GDB debugger information:
<snip>
Breakpoint 1, virDomainChrDefFree (def=0x7f8ab000ec70) at conf/domain_conf.c:1264
...ignore
1264 {
(gdb) p def->target
$2 = {port = -1, addr = 0xffffffff, name = 0xffffffff <Address 0xffffffff out of bounds>}
</snip>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=856489
Signed-off-by: Alex Jia <ajia@redhat.com>
If no private data needs to be maintained, it can be useful
to create virDomainObjPtr instances without having a virCapsPtr
instance around. Adapt the virDomainObjNew() function to allow
for a NULL caps
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=795929http://git.qemu.org/?p=qemu.git;a=commitdiff;h=6af165892cf900291046f1d25f95416f379504c2
This patch define and parse the input XML of USB redirection filter.
<devices>
...
<redirdev bus='usb' type='spicevmc'>
<address type='usb' bus='0' port='4'/>
</redirdev>
<redirfilter>
<usbdev class='0x08' vendor='0x1234' product='0xbeef' \
version='2.00' allow='yes'/>
<usbdev allow='no'/>
</redirfilter>
...
</devices>
There is no 1:1 mapping between ports and redirected devices and
qemu and spicy client couldn't decide into which usbredir ports
the client can 'plug' redirected devices. So it make sense to apply
all of filter rules global to all existing usb redirection devices.
class attribute is USB Class codes. version is bcdDevice value
of USB device. vendor and product is USB vendorId and productId.
-1 can be used to allow any value for a field. Except allow attribute
the other four are optional, default value is -1.
I got an off-list report about a bad diagnostic:
Target network card mac 52:54:00:49:07:ccdoes not match source 52:54:00:49:07:b8
True to form, I've added a syntax check rule to prevent it
from recurring, and found several other offenders.
* cfg.mk (sc_require_whitespace_in_translation): New rule.
* src/conf/domain_conf.c (virDomainNetDefCheckABIStability): Add
space.
* src/esx/esx_util.c (esxUtil_ParseUri): Likewise.
* src/qemu/qemu_command.c (qemuCollectPCIAddress): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSetMetadata)
(qemuDomainGetMetadata): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainChangeNetBridge): Likewise.
* src/rpc/virnettlscontext.c
(virNetTLSContextCheckCertDNWhitelist): Likewise.
* src/vmware/vmware_driver.c (vmwareDomainResume): Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainGetXMLDesc, vboxAttachDrives):
Avoid false negatives.
* tools/virsh-domain.c (info_save_image_dumpxml): Reword.
Based on a report by Luwen Su.
Historically, the first <console> element is treated as the
alias of a <serial> device. In the virDomainDeviceInfoIterate,
This situation is not considered. It still handles the first <console>
element as another devices, which means that for console[0] with
serial targetType, it calls callback function another time.
It will cause the problem of address conflicts when assigning
spapr-vio address for serial device on pSeries guest.
For pSeries guest, the serial configuration in the xml file
is as the following:
<serial type='pty'>
<target port='0'/>
<address type='spapr-vio'/>
</serial>
Console configuration is default, the dumped xml file is as the following:
<serial type='pty'>
<source path='/dev/pts/5'/>
<target port='0'/>
<alias name='serial0'/>
<address type='spapr-vio' reg='0x30000000'/>
</serial>
<console type='pty' tty='/dev/pts/5'>
<source path='/dev/pts/5'/>
<target type='serial' port='0'/>
<alias name='serial0'/>
<address type='spapr-vio' reg='0x30000000'/>
</console>
It shows that the <console> device is the alias of serial device.
So its address is the same as the serial device. When detecting
the conflicts in the qemuAssignSpaprVIOAddress the first console
and the serial device conflicts because virDomainDeviceInfoIterate()
still handle these as two different devices, and in the qemuAssignSpaprVIOAddress(),
it will compare these two devices' addressed. If they have same address,
it will report address conflict error.
So this patch is to handle the first console which targetType is serial
as the alias of serial device to avoid address conflicts error reported.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
src/conf/network_conf.c: Add virNetworkMatch to filter the networks;
and virNetworkList to iterate over all the networks with the filter.
src/conf/network_conf.h: Declare virNetworkList and define the macros
for filters.
src/libvirt_private.syms: Export virNetworkList.
GNOME Boxes sometimes stops getting domain events from libvirtd, even
after restarting it. Further investigation in libvirtd shows that
events are properly queued with virDomainEventStateQueue, but the
timer virDomainEventTimer which flushes the events and sends them to
the clients never gets called. Looking at the event queue in gdb
shows that it's non-empty and that its size increases with each new
events.
virDomainEventTimer is set up in virDomainEventStateRegister[ID]
when going from 0 client connecte to 1 client connected, but is
initially disabled. The timer is removed in
virDomainEventStateRegister[ID] when the last client is disconnected
(going from 1 client connected to 0).
This timer (which handles sending the events to the clients) is
enabled in virDomainEventStateQueue when queueing an event on an
empty queue (queue containing 0 events). It's disabled in
virDomainEventStateFlush after flushing the queue (ie removing all
the elements from it). This way, no extra work is done when the queue
is empty, and when the next event comes up, the timer will get
reenabled because the queue will go from 0 event to 1 event, which
triggers enabling the timer.
However, with this Boxes bug, we have a client connected (Boxes), a
non-empty queue (there are events waiting to be sent), but a disabled
timer, so something went wrong.
When Boxes connects (it's the only client connecting to the libvirtd
instance I used for debugging), the event timer is not set as expected
(state->timer == -1 when virDomainEventStateRegisterID is called),
but at the same time the event queue is not empty. In other words,
we had no clients connected, but pending events. This also explains
why the timer never gets enabled as this is only done when an event
is queued on an empty queue.
I think this can happen if an event gets queued using
virDomainEventStateQueue and the client disconnection happens before
the event timer virDomainEventTimer gets a chance to run and flush
the event. In this situation, virDomainEventStateDeregister[ID] will
get called with a non-empty event queue, the timer will be destroyed
if this was the only client connected. Then, when other clients connect
at a later time, they will never get notified about domain events as
the event timer will never get enabled because the timer is only
enabled if the event queue is empty when virDomainEventStateRegister[ID]
gets called, which will is no longer the case.
To avoid this issue, this commit makes sure to remove all events from
the event queue when the last client in unregistered. As there is
no longer anyone interested in receiving these events, these events
are stale so there is no need to keep them around. A client connecting
later will have no interest in getting events that happened before it
got connected.
src/conf/storage_conf.c: Add virStoragePoolMatch to filter the
pools; Add virStoragePoolList to iterate over the pool objects
with filter.
src/conf/storage_conf.h: Declare virStoragePoolMatch,
virStoragePoolList, and the macros for filters.
src/libvirt_private.syms: Export helper virStoragePoolList.
After discussion with DB we decided to rename the new iolimit
element as it creates the impression it would be there to
limit (i.e. throttle) I/O instead of specifying immutable
characteristics of a block device.
This is also backed by the fact that the term I/O Limits has
vanished from newer storage admin documentation.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
There is a new <pm/> element implemented that can control what ACPI
sleeping states will be advertised by BIOS and allowed to be switched
to by libvirt. The default keeps defaults on hypervisor, otherwise
forces chosen setting.
The documentation of the pm element is added as well.
Introducing a new iolimits element allowing to override certain
properties of a guest block device like the physical and logical
block size.
This can be useful for platforms with 'non-standard' disk formats
like S390 DASD with its 4K block size.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
To avoid backward compatibility issues, this patch suppresses
auto-generated DAC labels from XML. This change affects commands such as
dumpxml and save.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
With this patch libvirt tries to assign a model to a single seclabel
when model is missing. Libvirt will look up at host's capabilities and
assign the first model to seclabel.
This patch fixes:
1. The problem with existing guests that have a seclabel defined in its XML.
2. A XML parse error when a guest is restored.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
virDomainVcpuPinAdd does a realloc on vcpupin_list if the new vcpu pin
definition doesn't fit into the array. The list is an array of pointers
but the function definition didn't support returning the changed pointer
to the caller if it was realloced. This caused segfaults if realloc
would change the base pointer.
virDomainVcpuPinDefCopy when the control flow reaches out of memory
cleanup code, the flow would end in a infinite loop as the loop variable
wasn't decremented.
Also a dereference of NULL pointers was possible if allocation of the
Vcpu pinning definiton structure failed.
When checking for seclabels without security models, def->nseclabels is
already set to n. In the case of an error def->seclabels is freed but
nseclabels is left untouched. This leads to a segmentation fault when
def is freed in virDomainDefParseXML.
The name 'virDomainDiskSnapshot' didn't fit in with our normal
conventions of using a prefix hinting that it is related to a
virDomainSnapshotPtr. Also, a future patch will reuse the
enum for declaring where the VM memory is stored.
* src/conf/snapshot_conf.h (virDomainDiskSnapshot): Rename...
(virDomainSnapshotLocation): ...to this.
(_virDomainSnapshotDiskDef): Update clients.
* src/conf/domain_conf.h (_virDomainDiskDef): Likewise.
* src/libvirt_private.syms (domain_conf.h): Likewise.
* src/conf/domain_conf.c (virDomainDiskDefParseXML)
(virDomainDiskDefFormat): Likewise.
* src/conf/snapshot_conf.c: (virDomainSnapshotDiskDefParseXML)
(virDomainSnapshotAlignDisks, virDomainSnapshotDefFormat):
Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotDiskPrepare)
(qemuDomainSnapshotCreateSingleDiskActive)
(qemuDomainSnapshotCreateDiskActive, qemuDomainSnapshotCreateXML):
Likewise.
This has several benefits:
1. Future snapshot-related code has a definite place to go (and I
_will_ be adding some)
2. Snapshot errors now use the VIR_FROM_DOMAIN_SNAPSHOT error
classification, which has been underutilized (previously only in
libvirt.c)
* src/conf/domain_conf.h, domain_conf.c: Split...
* src/conf/snapshot_conf.h, snapshot_conf.c: ...into new files.
* src/Makefile.am (DOMAIN_CONF_SOURCES): Build new files.
* po/POTFILES.in: Mark new file for translation.
* src/vbox/vbox_tmpl.c: Update caller.
* src/esx/esx_driver.c: Likewise.
* src/qemu/qemu_command.c: Likewise.
* src/qemu/qemu_domain.h: Likewise.
We were failing to react to allocation failure when initializing
a snapshot object list. Changing things to store a pointer
instead of a complete object adds one more possible point of
allocation failure, but at the same time, will make it easier to
react to failure now, as well as making it easier for a future
patch to split all virDomainSnapshotPtr handling into a separate
file, as I continue to add even more snapshot code.
Luckily, there was only one client outside of domain_conf.c that
was actually peeking inside the object, and a new wrapper function
was easy.
* src/conf/domain_conf.h (_virDomainObj): Use a pointer.
(virDomainSnapshotObjListInit): Rename.
(virDomainSnapshotObjListFree, virDomainSnapshotForEach): New
declarations.
(_virDomainSnapshotObjList): Move definitions...
* src/conf/domain_conf.c: ...here.
(virDomainSnapshotObjListInit, virDomainSnapshotObjListDeinit):
Rename...
(virDomainSnapshotObjListNew, virDomainSnapshotObjListFree): ...to
these.
(virDomainSnapshotForEach): New function.
(virDomainObjDispose, virDomainListPopulate): Adjust callers.
* src/qemu/qemu_domain.c (qemuDomainSnapshotDiscard)
(qemuDomainSnapshotDiscardAllMetadata): Likewise.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
* src/qemu/qemu_driver.c (qemuDomainSnapshotLoad)
(qemuDomainUndefineFlags, qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotListNames, qemuDomainSnapshotNum)
(qemuDomainListAllSnapshots)
(qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren)
(qemuDomainSnapshotListAllChildren)
(qemuDomainSnapshotLookupByName, qemuDomainSnapshotGetParent)
(qemuDomainSnapshotGetXMLDesc, qemuDomainSnapshotIsCurrent)
(qemuDomainSnapshotHasMetadata, qemuDomainRevertToSnapshot)
(qemuDomainSnapshotDelete): Likewise.
* src/libvirt_private.syms (domain_conf.h): Export new function.
This patch introduces support of setting emulator's period and
quota to limit cpu bandwidth when the vm starts. Also updates
XML Schema for new entries and docs.
Introduce 2 APIs to support emulator threads pin.
1) virDomainEmulatorPinAdd: setup emulator threads pin with a given cpumap string.
2) virDomainEmulatorPinDel: remove all emulator threads pin.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
This patch adds a new xml element <emulatorpin>, which is a sibling
to the existing <vcpupin> element under the <cputune>, to pin emulator
threads to specified physical CPUs.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
A hypervisor may allow to override the disk geometry of drives.
Qemu, as an example with cyls=,heads=,secs=[,trans=].
This patch extends the domain config to allow the specification of
disk geometry with libvirt.
Signed-off-by: J.B. Joret <jb@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Port allocations for SPICE and VNC behave almost the same (with
default ports), but there is some mess in the code. This patch clears
these inconsistencies and makes sure the same behavior will be used
when ports for remote displays are changed.
Changes:
- hard-coded number 5900 removed (handled elsewhere like with VNC)
- reservedVNCPorts renamed to reservedRemotePorts (it's not just for
VNC anymore)
- QEMU_VNC_PORT_{MIN,MAX} renamed to QEMU_REMOTE_PORT_{MIN,MAX}
- port allocation unified for VNC and SPICE
This patch updates the domain and capability XML parser and formatter to
support more than one "seclabel" element for each domain and device. The
RNG schema and the tests related to this are also updated by this patch.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
This patch updates the structures that store information about each
domain and each hypervisor to support multiple security labels and
drivers. It also updates all the remaining code to use the new fields.
Signed-off-by: Marcelo Cerri <mhcerri@linux.vnet.ibm.com>
This function is needed by the network driver in a later commit.
It is useful in functions like networkNotifyActualDevice and
networkReleaseActualDevice
This patch introduces the new forward mode='hostdev' along with
attribute managed. Includes updates to the network RNG and new xml
parser/formatter code.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Move the functions the parse/format, and validate PCI addresses to
their own file so they can be conveniently used in other places
besides device_conf.c
Refactoring existing code without causing any functional changes to
prepare for new code.
This patch makes the code reusable.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Change device type of a virtio channel from/to spicevmc is not a user
visible change. However, spicevmc channels use different default target
name than other virtio channels. To maintain ABI stability during this
change target name must be explicitly specified (and equal) in both
configurations.
The following config elements now support a <vlan> subelements:
within a domain: <interface>, and the <actual> subelement of <interface>
within a network: the toplevel, as well as any <portgroup>
Each vlan element must have one or more <tag id='n'/> subelements. If
there is more than one tag, it is assumed that vlan trunking is being
requested. If trunking is required with only a single tag, the
attribute "trunk='yes'" should be added to the toplevel <vlan>
element.
Some examples:
<interface type='hostdev'/>
<vlan>
<tag id='42'/>
</vlan>
<mac address='52:54:00:12:34:56'/>
...
</interface>
<network>
<name>vlan-net</name>
<vlan trunk='yes'>
<tag id='30'/>
</vlan>
<virtualport type='openvswitch'/>
</network>
<interface type='network'/>
<source network='vlan-net'/>
...
</interface>
<network>
<name>trunk-vlan</name>
<vlan>
<tag id='42'/>
<tag id='43'/>
</vlan>
...
</network>
<network>
<name>multi</name>
...
<portgroup name='production'/>
<vlan>
<tag id='42'/>
</vlan>
</portgroup>
<portgroup name='test'/>
<vlan>
<tag id='666'/>
</vlan>
</portgroup>
</network>
<interface type='network'/>
<source network='multi' portgroup='test'/>
...
</interface>
IMPORTANT NOTE: As of this patch there is no backend support for the
vlan element for *any* network device type. When support is added in
later patches, it will only be for those select network types that
support setting up a vlan on the host side, without the guest's
involvement. (For example, it will be possible to configure a vlan for
a guest connected to an openvswitch bridge, but it won't be possible
to do that for one that is connected to a standard Linux host bridge.)
Each interface has a single pointer to a filterref object. That
filterref can itself point to multiple other filterrefs, but at the
toplevel there is only one.
The parser had previously just silently overwritten earlier filterrefs
when a new one was encountered, so the interface was left with
whichever was the last filterref in the xml, ignoring all the
others. This patch logs an error when it sees more than one filterref.
Just as each physical device used by a network has a connections
counter, now each network has a connections counter which is
incremented once for each guest interface that connects using this
network.
The count is output in the live network XML, like this:
<network connections='20'>
...
</network>
It is read-only, and for informational purposes only - it isn't used
internally anywhere by libvirt.
It may be useful for management applications to know which physical
network devices are in use by guests. This information is already
available in the network objects, but wasn't output in the XML. This
patch outputs it when the INACTIVE flag isn't set (and if it's non-0).
I want to include this count in the xml output of networks, but
calling it "connections" in the XML sounds better than "usageCount", and it
would be better if the name in the XML matched the variable name.
In a few places, usageCount was being initialized to 0, but this is
unnecessary, because VIR_ALLOC_N zero-fills everything anyway.
This array was originally defined using the existing
virNetworkForwardIfDef, but that struct has a UsageCount field that
isn't used in the case of PFs. This patch just copies that struct and
removes UsageCount. It ends up being a struct with a single field, but
I left it as a struct in case we need to add other fields to it in the
future.
Until now, all attributes in a <virtualport> parameter list that were
acceptable for a particular type, were also required. There were no
optional attributes.
One of the aims of supporting <virtualport> in libvirt's virtual
networks and portgroups is to allow specifying the group-wide
parameters in the network's virtualport, and merge that with the
interface's virtualport, which will have the instance-specific info
(i.e. the interfaceid or instanceid).
Additionally, the guest's interface XML shouldn't need to know what
type of network connection will be used prior to runtime - it could be
openvswitch, 802.1Qbh, 802.1Qbg, or none of the above - but should
still be able to specify instance-specific info just in case it turns
out to be applicable.
Finally, up to now, the parser for virtualport has always generated a
random instanceid/interfaceid when appropriate, making it impossible
to leave it blank (which is what's required for virtualports within a
network/portprofile definition).
This patch modifies the parser and formatter of the <virtualport>
element in the following ways:
* because most of the attributes in a virNetDevVPortProfile are fixed
size binary data with no reserved values, there is no way to embed a
"this value wasn't specified" sentinel into the existing data. To
solve this problem, the new *_specified fields in the
virNetDevVPortProfile object that were added in a previous patch of
this series are now set when the corresponding attribute is present
during the parse.
* allow parsing/formatting a <virtualport> that has no type set. In
this case, all fields are settable, but all are also optional.
* add a GENERATE_MISSING_DEFAULTS flag to the parser - if this flag is
set and an instanceid/interfaceid is expected but not provided, a
random one will be generated. This was previously the default
behavior, but is now done only for virtualports inside an
<interface> definition, not for those in <network> or <portgroup>.
* add a REQUIRE_ALL_ATTRIBUTES flag to the parser - if this flag is
set the parser will call the new
virNetDevVPortProfileCheckComplete() functions at the end of the
parser to check for any missing attributes (based on type), and
return failure if anything is missing. This used to be default
behavior. Now it is only used for the virtualport defined inside an
interface's <actual> element (by the time you've figured out the
contents of <actual>, you should have all the necessary data to fill
in the entire virtualport)
* add a REQUIRE_TYPE flag to the parser - if this flag is set, the
parser will return an error if the virtualport has no type
attribute. This also was previously the default behavior, but isn't
needed in the case of the virtualport for a type='network' interface
(i.e. the exact type isn't yet known), or the virtualport of a
portgroup (i.e. the portgroup just has modifiers for the network's
virtualport, which *does* require a type) - in those cases, the
check will be done at domain startup, once the final virtualport is
assembled (this is handled in the next patch).
This function has several calls to increase the buffer indent by 6,
then decrease it again, then increase, then decrease. Additionally,
there were several printfs that had 6 spaces at the beginning of the
line.
virDomainActualNetDefFormat, which is called by virDomainNetDefFormat,
had similar ugliness.
This patch changes both functions to just increase the indent at the
beginning, decrease it at (well, just before*) the end, and remove all
of the occurences of 6/8 spaces at the beginning of lines.
*The indent had to be reset before the end of the function because
virDomainDeviceInfoFormat assumes a 0 indent and is called from many
other places, and I didn't want to do an overhaul of every caller of
that function. A separate patch to switch all of domain_conf.c would
be a useful exercise, but my current goal is unrelated to that, so
I'll leave it for another day.
There was an error: label that simply did "return ret", but ret was
defaulted to -1, and was never used other than setting it manually to
0 just before a non-error return. Aside from this, some of the error
return paths used "goto error" and others used "return ret".
This patch removes ret and the error: label, and makes all error
returns just consistently do "return -1".
virtPortProfile is now used by 4 different types of network devices
(NETWORK, BRIDGE, DIRECT, and HOSTDEV), and it's getting cumbersome to
replicate so much code in 4 different places just because each type
has the virtPortProfile in a slightly different place. This patch puts
a single virtPortProfile in a common place (outside the type-specific
union) in both virDomainNetDef and virDomainActualNetDef, and adjusts
the parse and format code (and the few other places where it is used)
accordingly.
Note that when a <virtualport> element is found, the parse functions
verify that the interface is of a type that supports one, otherwise an
error is generated (CONFIG_UNSUPPORTED in the case of <interface>, and
INTERNAL in the case of <actual>, since the contents of <actual> are
always generated by libvirt itself).
virNetDevVPortProfile has (had) a type field that can be set to one of
several values, and a union of several structs, one for each
type. When a domain's interface object is of type "network", the
domain config may not know beforehand which type of virtualport is
going to be provided in the actual device handed down from the network
driver at runtime, but may want to set some values in the virtualport
that may or may not be used, depending on the type. To support this
usage, this patch replaces the union of structs with toplevel fields
in the struct, making it possible for all of the fields to be set at
the same time.
As the consensus in:
https://www.redhat.com/archives/libvir-list/2012-July/msg01692.html,
this patch is to destroy conf/virdomainlist.[ch], folding the
helpers into conf/domain_conf.[ch].
* src/Makefile.am:
- Various indention fixes incidentally
- Add macro DATATYPES_SOURCES (datatypes.[ch])
- Link datatypes.[ch] for libvirt_lxc
* src/conf/domain_conf.c:
- Move all the stuffs from virdomainlist.c into it
- Use virUnrefDomain and virUnrefDomainSnapshot instead of
virDomainFree and virDomainSnapshotFree, which are defined
in libvirt.c, and we don't want to link to it.
- Remove "if" before "free" the object, as virObjectUnref
is in the list "useless_free_options".
* src/conf/domain_conf.h:
- Move all the stuffs from virdomainlist.h into it
- s/LIST_FILTER/LIST_DOMAINS_FILTER/
* src/libxl/libxl_driver.c:
- s/LIST_FILTER/LIST_DOMAINS_FILTER/
- no (include "virdomainlist.h")
* src/libxl/libxl_driver.c: Likewise
* src/lxc/lxc_driver.c: Likewise
* src/openvz/openvz_driver.c: Likewise
* src/parallels/parallels_driver.c: Likewise
* src/qemu/qemu_driver.c: Likewise
* src/test/test_driver.c: Likewise
* src/uml/uml_driver.c: Likewise
* src/vbox/vbox_tmpl.c: Likewise
* src/vmware/vmware_driver.c: Likewise
* tools/virsh-domain-monitor.c: Likewise
* tools/virsh.c: Likewise
The meat of this patch is just moving the calls to
virNWFilterRegisterCallbackDriver from each hypervisor's "register"
function into its "initialize" function. The rest is just code
movement to allow that, and a new virNWFilterUnRegisterCallbackDriver
function to undo what the register function does.
The long explanation:
There is an array in nwfilter called callbackDrvArray that has
pointers to a table of functions for each hypervisor driver that are
called by nwfilter. One of those function pointers is to a function
that will lock the hypervisor driver. Entries are added to the table
by calling each driver's "register" function, which happens quite
early in libvirtd's startup.
Sometime later, each driver's "initialize" function is called. This
function allocates a driver object and stores a pointer to it in a
static variable that was previously initialized to NULL. (and here's
the important part...) If the "initialize" function fails, the driver
object is freed, and that pointer set back to NULL (but the entry in
nwfilter's callbackDrvArray is still there).
When the "lock the driver" function mentioned above is called, it
assumes that the driver was successfully loaded, so it blindly tries
to call virMutexLock on "driver->lock".
BUT, if the initialize never happened, or if it failed, "driver" is
NULL. And it just happens that "lock" is always the first field in
driver so it is also NULL.
Boom.
To fix this, the call to virNWFilterRegisterCallbackDriver for each
driver shouldn't be called until the end of its (*already guaranteed
successful*) "initialize" function, not during its "register" function
(which is currently the case). This implies that there should also be
a virNWFilterUnregisterCallbackDriver() function that is called in a
driver's "shutdown" function (although in practice, that function is
currently never called).
Otherwise, in locations like virobject.c where PROBE is used,
for certain configure options, the compiler warns:
util/virobject.c:110:1: error: 'intptr_t' undeclared (first use in this function)
As long as we are making this header always available, we can
clean up several other files.
* src/internal.h (includes): Pull in <stdint.h>.
* src/conf/nwfilter_conf.h: Rely on internal.h.
* src/storage/storage_backend.c: Likewise.
* src/storage/storage_backend.h: Likewise.
* src/util/cgroup.c: Likewise.
* src/util/sexpr.h: Likewise.
* src/util/virhashcode.h: Likewise.
* src/util/virnetdevvportprofile.h: Likewise.
* src/util/virnetlink.h: Likewise.
* src/util/virrandom.h: Likewise.
* src/vbox/vbox_driver.c: Likewise.
* src/xenapi/xenapi_driver.c: Likewise.
* src/xenapi/xenapi_utils.c: Likewise.
* src/xenapi/xenapi_utils.h: Likewise.
* src/xenxs/xenxs_private.h: Likewise.
* tests/storagebackendsheepdogtest.c: Likewise.
An ESX server has one or more PhysicalNics that represent the actual
hardware NICs. Those can be listed via the interface driver.
A libvirt virtual network is mapped to a HostVirtualSwitch. On the
physical side a HostVirtualSwitch can be connected to PhysicalNics.
On the virtual side a HostVirtualSwitch has HostPortGroups that are
mapped to libvirt virtual network's portgroups. Typically there is
HostPortGroups named 'VM Network' that is used to connect virtual
machines to a HostVirtualSwitch. A second HostPortGroup typically
named 'Management Network' is used to connect the hypervisor itself
to the HostVirtualSwitch. This one is not mapped to a libvirt virtual
network's portgroup. There can be more HostPortGroups than those
typical two on a HostVirtualSwitch.
+---------------+-------------------+
...---| | | +-------------+
| HostPortGroup | |---| PhysicalNic |
| VM Network | | | vmnic0 |
...---| | | +-------------+
+---------------+ HostVirtualSwitch |
| vSwitch0 |
+---------------+ |
| HostPortGroup | |
...---| Management | |
| Network | |
+---------------+-------------------+
The virtual counterparts of the PhysicalNic is the HostVirtualNic for
the hypervisor and the VirtualEthernetCard for the virtual machines
that are grouped into HostPortGroups.
+---------------------+ +---------------+---...
| VirtualEthernetCard |---| |
+---------------------+ | HostPortGroup |
+---------------------+ | VM Network |
| VirtualEthernetCard |---| |
+---------------------+ +---------------+
|
+---------------+
+---------------------+ | HostPortGroup |
| HostVirtualNic |---| Management |
+---------------------+ | Network |
+---------------+---...
The currently implemented network driver can list, define and undefine
HostVirtualSwitches including HostPortGroups for virtual machines.
Existing HostVirtualSwitches cannot be edited yet. This will be added
in a followup patch.
Switch virDomainObjPtr to use the virObject APIs for reference
counting. The main change is that virObjectUnref does not return
the reference count, merely a bool indicating whether the object
still has any refs left. Checking the return value is also not
mandatory.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This converts the following public API datatypes to use the
virObject infrastructure:
virConnectPtr
virDomainPtr
virDomainSnapshotPtr
virInterfacePtr
virNetworkPtr
virNodeDevicePtr
virNWFilterPtr
virSecretPtr
virStreamPtr
virStorageVolPtr
virStoragePoolPtr
The code is significantly simplified, since the mutex in the
virConnectPtr object now only needs to be held when accessing
the per-connection virError object instance. All other operations
are completely lock free.
* src/datatypes.c, src/datatypes.h, src/libvirt.c: Convert
public datatypes to use virObject
* src/conf/domain_event.c, src/phyp/phyp_driver.c,
src/qemu/qemu_command.c, src/qemu/qemu_migration.c,
src/qemu/qemu_process.c, src/storage/storage_driver.c,
src/vbox/vbox_tmpl.c, src/xen/xend_internal.c,
tests/qemuxml2argvtest.c, tests/qemuxmlnstest.c,
tests/sexpr2xmltest.c, tests/xmconfigtest.c: Convert
to use virObjectUnref/virObjectRef
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Commit ba226d334a tried to fix crash of
the daemon when a domain with an open console was destroyed. The fix was
wrong as it tried to remove the callback also when the stream was
aborted, where at that point the fd stream driver was already freed and
removed.
This patch clears the callbacks with a helper right before the hash is
freed, so that it doesn't interfere with other codepaths where the
stream object is freed.
* src/conf/domain_conf.c:
- Add virDomainControllerFind to find controller device by type
and index.
- Add virDomainControllerRemove to remove the controller device
from maintained controler list.
* src/conf/domain_conf.h:
- Declare the two new helpers.
* src/libvirt_private.syms:
- Expose private symbols for the two new helpers.
* src/qemu/qemu_driver.c:
- Support attach/detach controller device persistently
* src/qemu/qemu_hotplug.c:
- Use the two helpers to simplify the codes.
The access, birth, modification and change times are added to
storage volumes and corresponding xml representations. This
shows up in the XML in this format:
<timestamps>
<atime>1341933637.027319099</atime>
<mtime>1341933637.027319099</mtime>
</timestamps>
Signed-off-by: Eric Blake <eblake@redhat.com>
capability.rng: Guest features can be in any order.
nodedev.rng: Added <driver> element, <capability> phys_function and
virt_functions for PCI devices.
storagepool.rng: Owner or group ID can be -1.
schema tests: New capabilities and nodedev files; changed owner and
group to -1 in pool-dir.xml.
storage_conf: Print uid_t and gid_t as signed to storage pool XML.
This patch adds helpers that validate domain's device configuration.
This will be needed later on to verify devices being hot-plugged to
guests. If the guest has no USB bus, then it's not valid to plug a USB
device to that guest.
Libvirt adds a USB controller to the guest even if the user does not
specify any in the XML. This is due to back-compat reasons.
To allow disabling USB for a guest this patch adds a new USB controller
type "none" that disables USB support for the guest.
Parallels Cloud Server is a cloud-ready virtualization
solution that allows users to simultaneously run multiple virtual
machines and containers on the same physical server.
More information can be found here: http://www.parallels.com/products/pcs/
Also beta version of Parallels Cloud Server can be downloaded there.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Any time we have a string with no % passed through gettext, a
translator can inject a % to cause a stack overread. When there
is nothing to format, it's easier to ask for a string that cannot
be used as a formatter, by using a trivial "%s" format instead.
In the past, we have used --disable-nls to catch some of the
offenders, but that doesn't get run very often, and many more
uses have crept in. Syntax check to the rescue!
The syntax check can catch uses such as
virReportError(code,
_("split "
"string"));
by using a sed script to fold context lines into one pattern
space before checking for a string without %.
This patch is just mechanical insertion of %s; there are probably
several messages touched by this patch where we would be better
off giving the user more information than a fixed string.
* cfg.mk (sc_prohibit_diagnostic_without_format): New rule.
* src/datatypes.c (virUnrefConnect, virGetDomain)
(virUnrefDomain, virGetNetwork, virUnrefNetwork, virGetInterface)
(virUnrefInterface, virGetStoragePool, virUnrefStoragePool)
(virGetStorageVol, virUnrefStorageVol, virGetNodeDevice)
(virGetSecret, virUnrefSecret, virGetNWFilter, virUnrefNWFilter)
(virGetDomainSnapshot, virUnrefDomainSnapshot): Add %s wrapper.
* src/lxc/lxc_driver.c (lxcDomainSetBlkioParameters)
(lxcDomainGetBlkioParameters): Likewise.
* src/conf/domain_conf.c (virSecurityDeviceLabelDefParseXML)
(virDomainDiskDefParseXML, virDomainGraphicsDefParseXML):
Likewise.
* src/conf/network_conf.c (virNetworkDNSHostsDefParseXML)
(virNetworkDefParseXML): Likewise.
* src/conf/nwfilter_conf.c (virNWFilterIsValidChainName):
Likewise.
* src/conf/nwfilter_params.c (virNWFilterVarValueCreateSimple)
(virNWFilterVarAccessParse): Likewise.
* src/libvirt.c (virDomainSave, virDomainSaveFlags)
(virDomainRestore, virDomainRestoreFlags)
(virDomainSaveImageGetXMLDesc, virDomainSaveImageDefineXML)
(virDomainCoreDump, virDomainGetXMLDesc)
(virDomainMigrateVersion1, virDomainMigrateVersion2)
(virDomainMigrateVersion3, virDomainMigrate, virDomainMigrate2)
(virStreamSendAll, virStreamRecvAll)
(virDomainSnapshotGetXMLDesc): Likewise.
* src/nwfilter/nwfilter_dhcpsnoop.c (virNWFilterSnoopReqLeaseDel)
(virNWFilterDHCPSnoopReq): Likewise.
* src/openvz/openvz_driver.c (openvzUpdateDevice): Likewise.
* src/openvz/openvz_util.c (openvzKBPerPages): Likewise.
* src/qemu/qemu_cgroup.c (qemuSetupCgroup): Likewise.
* src/qemu/qemu_command.c (qemuBuildHubDevStr, qemuBuildChrChardevStr)
(qemuBuildCommandLine): Likewise.
* src/qemu/qemu_driver.c (qemuDomainGetPercpuStats): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Likewise.
* src/rpc/virnetsaslcontext.c (virNetSASLSessionGetIdentity):
Likewise.
* src/rpc/virnetsocket.c (virNetSocketNewConnectUNIX)
(virNetSocketSendFD, virNetSocketRecvFD): Likewise.
* src/storage/storage_backend_disk.c
(virStorageBackendDiskBuildPool): Likewise.
* src/storage/storage_backend_fs.c
(virStorageBackendFileSystemProbe)
(virStorageBackendFileSystemBuild): Likewise.
* src/storage/storage_backend_rbd.c
(virStorageBackendRBDOpenRADOSConn): Likewise.
* src/storage/storage_driver.c (storageVolumeResize): Likewise.
* src/test/test_driver.c (testInterfaceChangeBegin)
(testInterfaceChangeCommit, testInterfaceChangeRollback):
Likewise.
* src/vbox/vbox_tmpl.c (vboxListAllDomains): Likewise.
* src/xenxs/xen_sxpr.c (xenFormatSxprDisk, xenFormatSxpr):
Likewise.
* src/xenxs/xen_xm.c (xenXMConfigGetUUID, xenFormatXMDisk)
(xenFormatXM): Likewise.
Per the FSF address could be changed from time to time, and GNU
recommends the following now: (http://www.gnu.org/licenses/gpl-howto.html)
You should have received a copy of the GNU General Public License
along with Foobar. If not, see <http://www.gnu.org/licenses/>.
This patch removes the explicit FSF address, and uses above instead
(of course, with inserting 'Lesser' before 'General').
Except a bunch of files for security driver, all others are changed
automatically, the copyright for securify files are not complete,
that's why to do it manually:
src/security/security_selinux.h
src/security/security_driver.h
src/security/security_selinux.c
src/security/security_apparmor.h
src/security/security_apparmor.c
src/security/security_driver.c
This patch brings support to manage sheepdog pools and volumes to libvirt.
It uses the "collie" command-line utility that comes with sheepdog for that.
A sheepdog pool in libvirt maps to a sheepdog cluster.
It needs a host and port to connect to, which in most cases
is just going to be the default of localhost on port 7000.
A sheepdog volume in libvirt maps to a sheepdog vdi.
To create one specify the pool, a name and the capacity.
Volumes can also be resized later.
In the volume XML the vdi name has to be put into the <target><path>.
To use the volume as a disk source for virtual machines specify
the vdi name as "name" attribute of the <source>.
The host and port information from the pool are specified inside the host tag.
<disk type='network'>
...
<source protocol="sheepdog" name="vdi_name">
<host name="localhost" port="7000"/>
</source>
</disk>
To work right this patch parses the output of collie,
so it relies on the raw output option. There recently was a bug which caused
size information to be reported wrong. This is fixed upstream already and
will be in the next release.
Signed-off-by: Sebastian Wiedenroth <wiedi@frubar.net>
Introduce new members in the virMacAddr 'class'
- virMacAddrSet: set virMacAddr from a virMacAddr
- virMacAddrSetRaw: setting virMacAddr from raw 6 byte MAC address buffer
- virMacAddrGetRaw: writing virMacAddr into raw 6 byte MAC address buffer
- virMacAddrCmp: comparing two virMacAddr
- virMacAddrCmpRaw: comparing a virMacAddr with a raw 6 byte MAC address buffer
then replace raw MAC addresses by replacing
- 'unsigned char *' with virMacAddrPtr
- 'unsigned char ... [VIR_MAC_BUFLEN]' with virMacAddr
and introduce usage of above functions where necessary.
When the guest changes its memory balloon applications may want
to know what the new value is, without having to periodically
poll on XML / domain info. Introduce a "balloon change" event
to let apps see this
* include/libvirt/libvirt.h.in: Define the
virConnectDomainEventBalloonChangeCallback callback
and VIR_DOMAIN_EVENT_ID_BALLOON_CHANGE constant
* python/libvirt-override-virConnect.py,
python/libvirt-override.c: Wire up helpers for new event
* daemon/remote.c: Helper for serializing balloon event
* examples/domain-events/events-c/event-test.c,
examples/domain-events/events-python/event-test.py: Add
example of balloon event usage
* src/conf/domain_event.c, src/conf/domain_event.h: Handling
of balloon events
* src/remote/remote_driver.c: Add handler of balloon events
* src/remote/remote_protocol.x: Define wire protocol for
balloon events
* src/remote_protocol-structs: Likewise.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Per the typical use of libvirt is to fork the qemu process with
qemu:qemu. Setting the pool permission mode as 0700 by default
will prevent the guest start with permission reason.
Define macro for the default pool and vol permission modes
incidentally.
The s390(x) architecture doesn't feature a PCI bus. For the purpose of
supporting virtio devices a virtual bus called virtio-s390 is used.
A new address type VIR_DOMAIN_DEVICE_ADDRESS_TYPE_VIRTIO_S390 is used to
distinguish the virtio devices on s390 from PCI-based virtio devices.
V3 Change: updated QEMU_CAPS_VIRTIO_S390 to fit upstream.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reported by Jason Helfman as a build-breaker on FreeBSD.
* src/conf/domain_conf.c (virDomainFSDefParseXML): Use POSIX
spelling.
* src/openvz/openvz_conf.c (openvzReadFSConf): Likewise.
Below patch fixes this coverity report:
/libvirt/src/conf/nwfilter_conf.c:382:
leaked_storage: Variable "varAccess" going out of scope leaks the storage it points to.
If the user specified invalid protocol type in a network's SRV record
the error path ended up in freeing uninitialized pointers causing a
daemon crash.
*network_conf.c: virNetworkDNSSrvDefParseXML(): initialize local
variables
virConnectDomainEventRegisterAny() takes a domain as an argument.
So it should be possible to register the same event (be it
VIR_DOMAIN_EVENT_ID_LIFECYCLE for example) for two different domains.
That is, we need to take domain into account when searching for
duplicate event being already registered.
Currently you can configure LXC to bind a host directory to
a guest directory, but not to bind a guest directory to a
guest directory. While the guest container init could do
this itself, allowing it in the libvirt XML means a stricter
SELinux policy can be written
Introduce a new syntax for filesystems to allow use of a RAM
filesystem
<filesystem type='ram'>
<source usage='10' units='MiB'/>
<target dir='/mnt'/>
</filesystem>
The usage units default to KiB to limit consumption of host memory.
* docs/formatdomain.html.in: Document new syntax
* docs/schemas/domaincommon.rng: Add new attributes
* src/conf/domain_conf.c: Parsing/formatting of RAM filesystems
* src/lxc/lxc_container.c: Mounting of RAM filesystems
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Wraps the conversion from 'char *name' to virDomainSnapshotPtr in
a reusable manner.
* src/conf/virdomainlist.h (virDomainListSnapshots): New declaration.
* src/conf/virdomainlist.c (virDomainListSnapshots): Implement it.
* src/libvirt_private.syms (virdomainlist.h): Export it.
It turns out that one-bit filtering makes it hard to select the inverse
set, so it is easier to provide filtering groups. For back-compat,
omitting all bits within a group means the group is not used for
filtering, and by definition of a group (each snapshot matches exactly
one bit within the group, and the set of bits in the group covers all
snapshots), selecting all bits also makes the group useless.
Unfortunately, virDomainSnapshotListChildren defined the bit
VIR_DOMAIN_SNAPSHOT_LIST_DESCENDANTS as an expansion rather than a
filter, so we cannot make it part of a filter group, so that bit
(and its counterpart VIR_DOMAIN_SNAPSHOT_LIST_ROOTS for
virDomainSnapshotList) remains a single control bit.
* include/libvirt/libvirt.h.in (virDomainSnapshotListFlags): Add a
couple more flags.
* src/libvirt.c (virDomainSnapshotNum)
(virDomainSnapshotNumChildren): Document them.
(virDomainSnapshotListNames, virDomainSnapshotListChildrenNames):
Likewise, and add thread-safety caveats.
* src/conf/virdomainlist.h (VIR_DOMAIN_SNAPSHOT_FILTERS_*): New
convenience macros.
* src/conf/domain_conf.c (virDomainSnapshotObjListCopyNames)
(virDomainSnapshotObjListCount): Support the new flags.
Until now, it was possible to crash libvirtd when defining domain with
channel device with missing source element.
When creating new virDomainChrDef, target.port is set to -1, but
unfortunately it is an union with addresses that virDomainChrDefFree
tries to free in case the deviceType is channel. Having the port set
to -1 is intended, however the cleanest way to get around the problems
with the crash seems to be renumbering the VIR_DOMAIN_CHR_CHANNEL_
target types to cover new NONE type (with value 0) being the default
(no target type yet).
Another case where we can do the same amount of work with fewer
lines of redundant code, which will make adding new filters easier.
* src/conf/domain_conf.c (virDomainSnapshotNameData): Adjust
struct.
(virDomainSnapshotObjListCount): Delete, now taken care of...
(virDomainSnapshotObjListCopyNames): ...here.
(virDomainSnapshotObjListGetNames): Adjust caller to handle
counting.
(virDomainSnapshotObjListNum): Simplify.
Now that domain listing is a thin wrapper around child listing,
it's easier to have a common entry point. This restores the
hashForEach optimization lost in the previous patch when there
are no snapshots being filtered out of the entire list.
* src/conf/domain_conf.h (virDomainSnapshotObjListGetNames)
(virDomainSnapshotObjListNum): Add parameter.
(virDomainSnapshotObjListGetNamesFrom)
(virDomainSnapshotObjListNumFrom): Delete.
* src/libvirt_private.syms (domain_conf.h): Drop deleted functions.
* src/conf/domain_conf.c (virDomainSnapshotObjListGetNames):
Merge, and (re)add an optimization.
* src/qemu/qemu_driver.c (qemuDomainUndefineFlags)
(qemuDomainSnapshotListNames, qemuDomainSnapshotNum)
(qemuDomainSnapshotListChildrenNames)
(qemuDomainSnapshotNumChildren): Update callers.
* src/qemu/qemu_migration.c (qemuMigrationIsAllowed): Likewise.
* src/conf/virdomainlist.c (virDomainListPopulate): Likewise.
This idea was first suggested by Daniel Veillard here:
https://www.redhat.com/archives/libvir-list/2011-October/msg00353.html
Now that I am about to add more complexity to snapshot listing, it
makes sense to avoid code duplication and special casing for domain
listing (all snapshots) vs. snapshot listing (descendants); adding
a metaroot reduces the number of code lines by having the domain
listing turn into a descendant listing of the metaroot.
Note that this has one minor pessimization - if we are going to list
ALL snapshots without filtering, then virHashForeach is more efficient
than recursing through the child relationships; restoring that minor
optimization will occur in the next patch.
* src/conf/domain_conf.h (_virDomainSnapshotObj)
(_virDomainSnapshotObjList): Repurpose some fields.
(virDomainSnapshotDropParent): Drop unused parameter.
* src/conf/domain_conf.c (virDomainSnapshotObjListGetNames)
(virDomainSnapshotObjListCount): Simplify.
(virDomainSnapshotFindByName, virDomainSnapshotSetRelations)
(virDomainSnapshotDropParent): Match new field semantics.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML)
(qemuDomainSnapshotReparentChildren, qemuDomainSnapshotDelete):
Adjust clients.
This patch adds common code to list domains in fashion used by
virListAllDomains with all currently supported flags. The header file
also contains macros that group filters together that are used to
shorten filter conditions.
This patch stores existence of the image in the object. At start of the
daemon the state is checked and then updated in key moments in domain
lifecycle.
The goal of this patch is to prepare for support for multiple IP
addresses per interface in the DHCP snooping code.
Move the code for the IP address map that maps interface names to
IP addresses into their own file. Rename the functions on the way
but otherwise leave the code as-is. Initialize this new layer
separately before dependent layers (iplearning, dhcpsnooping)
and shut it down after them.
This patch adds DHCP snooping support to libvirt. The learning method for
IP addresses is specified by setting the "CTRL_IP_LEARNING" variable to one of
"any" [default] (existing IP learning code), "none" (static only addresses)
or "dhcp" (DHCP snooping).
Active leases are saved in a lease file and reloaded on restart or HUP.
The following interface XML activates and uses the DHCP snooping:
<interface type='bridge'>
<source bridge='virbr0'/>
<filterref filter='clean-traffic'>
<parameter name='CTRL_IP_LEARNING' value='dhcp'/>
</filterref>
</interface>
All filters containing the variable 'IP' are automatically adjusted when
the VM receives an IP address via DHCP. However, multiple IP addresses per
interface are silently ignored in this patch, thus only supporting one IP
address per interface. Multiple IP address support is added in a later
patch in this series.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
and use it for virDomainParseMemory. This allows to parse arbitrary
scaled value, not only memory related values as needed for the
filesystem limits code following later in this series.
This patch adds support for a new storage backend with RBD support.
RBD is the RADOS Block Device and is part of the Ceph distributed storage
system.
It comes in two flavours: Qemu-RBD and Kernel RBD, this storage backend only
supports Qemu-RBD, thus limiting the use of this storage driver to Qemu only.
To function this backend relies on librbd and librados being present on the
local system.
The backend also supports Cephx authentication for safe authentication with
the Ceph cluster.
For storing credentials it uses the built-in secret mechanism of libvirt.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When the last reference to a virConnectPtr is released by
libvirtd, it was possible for a deadlock to occur in the
virDomainEventState functions. The virDomainEventStatePtr
holds a reference on virConnectPtr for each registered
callback. When removing a callback, the virUnrefConnect
function is run. If this causes the last reference on the
virConnectPtr to be released, then virReleaseConnect can
be run, which in turns calls qemudClose. This function has
a call to virDomainEventStateDeregisterConn which is intended
to remove all callbacks associated with the virConnectPtr
instance. This will try to grab a lock on virDomainEventState
but this lock is already held. Deadlock ensues
Thread 1 (Thread 0x7fcbb526a840 (LWP 23185)):
Since each callback associated with a virConnectPtr holds a
reference on virConnectPtr, it is impossible for the qemudClose
method to be invoked while any callbacks are still registered.
Thus the call to virDomainEventStateDeregisterConn must in fact
be a no-op. Thus it is possible to just remove all trace of
virDomainEventStateDeregisterConn and avoid the deadlock.
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Delete virDomainEventStateDeregisterConn
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
src/qemu/qemu_driver.c, src/uml/uml_driver.c: Remove
calls to virDomainEventStateDeregisterConn
This patch adds support for the recent ipset iptables extension
to libvirt's nwfilter subsystem. Ipset allows to maintain 'sets'
of IP addresses, ports and other packet parameters and allows for
faster lookup (in the order of O(1) vs. O(n)) and rule evaluation
to achieve higher throughput than what can be achieved with
individual iptables rules.
On the command line iptables supports ipset using
iptables ... -m set --match-set <ipset name> <flags> -j ...
where 'ipset name' is the name of a previously created ipset and
flags is a comma-separated list of up to 6 flags. Flags use 'src' and 'dst'
for selecting IP addresses, ports etc. from the source or
destination part of a packet. So a concrete example may look like this:
iptables -A INPUT -m set --match-set test src,src -j ACCEPT
Since ipset management is quite complex, the idea was to leave ipset
management outside of libvirt but still allow users to reference an ipset.
The user would have to make sure the ipset is available once the VM is
started so that the iptables rule(s) referencing the ipset can be created.
Using XML to describe an ipset in an nwfilter rule would then look as
follows:
<rule action='accept' direction='in'>
<all ipset='test' ipsetflags='src,src'/>
</rule>
The two parameters on the command line are also the two distinct XML attributes
'ipset' and 'ipsetflags'.
FYI: Here is the man page for ipset:
https://ipset.netfilter.org/ipset.man.html
Regards,
Stefan
The uhci1, uhci2, uhci3 companion controllers for ehci1 must
have a master start port set. Since this value is predictable
we should set it automatically if the app does not supply it
The virDomainDeviceInfoIsSet API was only checking if an
address or alias was set in the struct. Thus if only a
rom bar setting / filename, boot index, or USB master
value was set, they could be accidentally dropped when
formatting XML
Detected by valgrind. Leaks are introduced in commit 122fa379.
src/conf/storage_conf.c: fix memory leaks.
How to reproduce?
$ make && make -C tests check TESTS=storagepoolxml2xmltest
$ cd tests && valgrind -v --leak-check=full ./storagepoolxml2xmltest
actual result:
==28571== LEAK SUMMARY:
==28571== definitely lost: 40 bytes in 5 blocks
==28571== indirectly lost: 0 bytes in 0 blocks
==28571== possibly lost: 0 bytes in 0 blocks
==28571== still reachable: 1,054 bytes in 21 blocks
==28571== suppressed: 0 bytes in 0 blocks
Signed-off-by: Alex Jia <ajia@redhat.com>
No useful error was being reported when an invalid character device
target type is specified in the domainXML. E.g.
...
<console type="pty">
<source path="/dev/pts/2"/>
<target type="kvm" port="0"/>
</console>
...
resulted in
error: Failed to define domain from x.xml
error: An error occurred, but the cause is unknown
With this small patch, the error is more helpful
error: Failed to define domain from x.xml
error: XML error: unknown target type 'kvm' specified for character device
<vcpu> is not an optional node. The value for its 'placement'
actually always defaults to 'static' in the underlying codes.
(Even no 'cpuset' and 'placement' is specified, the domain
process will be pinned to all the available pCPUs).
Though numad will manage the memory allocation of task dynamically,
it wants management application (libvirt) to pre-set the memory
policy according to the advisory nodeset returned from querying numad,
(just like pre-bind CPU nodeset for domain process), and thus the
performance could benefit much more from it.
This patch introduces new XML tag 'placement', value 'auto' indicates
whether to set the memory policy with the advisory nodeset from numad,
and its value defaults to the value of <vcpu> placement, or 'static'
if 'nodeset' is specified. Example of the new XML tag's usage:
<numatune>
<memory placement='auto' mode='interleave'/>
</numatune>
Just like what current "numatune" does, the 'auto' numa memory policy
setting uses libnuma's API too.
If <vcpu> "placement" is "auto", and <numatune> is not specified
explicitly, a default <numatume> will be added with "placement"
set as "auto", and "mode" set as "strict".
The following XML can now fully drive numad:
1) <vcpu> placement is 'auto', no <numatune> is specified.
<vcpu placement='auto'>10</vcpu>
2) <vcpu> placement is 'auto', no 'placement' is specified for
<numatune>.
<vcpu placement='auto'>10</vcpu>
<numatune>
<memory mode='interleave'/>
</numatune>
And it's also able to control the CPU placement and memory policy
independently. e.g.
1) <vcpu> placement is 'auto', and <numatune> placement is 'static'
<vcpu placement='auto'>10</vcpu>
<numatune>
<memory mode='strict' nodeset='0-10,^7'/>
</numatune>
2) <vcpu> placement is 'static', and <numatune> placement is 'auto'
<vcpu placement='static' cpuset='0-24,^12'>10</vcpu>
<numatune>
<memory mode='interleave' placement='auto'/>
</numatume>
A follow up patch will change the XML formatting codes to always output
'placement' for <vcpu>, even it's 'static'.
qemu's behavior in this case is to change the spice server behavior to
require secure connection to any channel not otherwise specified as
being in plaintext mode. libvirt doesn't currently allow requesting this
(via plaintext-channel=<channel name>).
RHBZ: 819499
Signed-off-by: Alon Levy <alevy@redhat.com>
The previous storage patch missed an instance affected by the struct
member rename. It also had some botched whitespace detected by
'make check'.
* src/storage/storage_backend_iscsi.c
(virStorageBackendISCSIFindPoolSources): Adjust to new struct.
* src/conf/storage_conf.c (virStoragePoolSourceFormat): Fix
indentation.
It doesn't break out the "for" loop even if duplicate pool is
found, and thus the "matchpool" could be overriden as NULL again
if there is different pool afterwards.
To address the problem in libvirt-user list:
https://www.redhat.com/archives/libvirt-users/2012-April/msg00150.html
The current storage pools for NFS and iSCSI only require one host to
connect to. Future storage pools like RBD and Sheepdog will require
multiple hosts.
This patch allows multiple source hosts and rewrites the current
storage drivers.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
More bug extermination in the category of:
Error: CHECKED_RETURN:
/libvirt/src/conf/network_conf.c:595:
check_return: Calling function "virAsprintf" without checking return value (as is done elsewhere 515 out of 543 times).
/libvirt/src/qemu/qemu_process.c:2780:
unchecked_value: No check of the return value of "virAsprintf(&msg, "was paused (%s)", virDomainPausedReasonTypeToString(reason))".
/libvirt/tests/commandtest.c:809:
check_return: Calling function "setsid" without checking return value (as is done elsewhere 4 out of 5 times).
/libvirt/tests/commandtest.c:830:
unchecked_value: No check of the return value of "virTestGetDebug()".
/libvirt/tests/commandtest.c:831:
check_return: Calling function "virTestGetVerbose" without checking return value (as is done elsewhere 41 out of 42 times).
/libvirt/tests/commandtest.c:833:
check_return: Calling function "virInitialize" without checking return value (as is done elsewhere 18 out of 21 times).
One note about the error in commandtest line 809: setsid() seems to fail when running the test -- could be removed ?
This patch addresses the following coverity findings:
/libvirt/src/conf/nwfilter_params.c:390:
var_assigned: Assigning: "varValue" = null return value from "virHashLookup".
/libvirt/src/conf/nwfilter_params.c:392:
dereference: Dereferencing a pointer that might be null "varValue" when calling "virNWFilterVarValueGetNthValue".
/libvirt/src/conf/nwfilter_params.c:399:
dereference: Dereferencing a pointer that might be null "tmp" when calling "virNWFilterVarValueGetNthValue".
This patch addresses the following coverity findings:
/libvirt/src/conf/nwfilter_params.c:157:
deref_parm: Directly dereferencing parameter "val".
/libvirt/src/conf/nwfilter_params.c:473:
negative_returns: Using variable "iterIndex" as an index to array "res->iter".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:2891:
unchecked_value: No check of the return value of "virAsprintf(&protostr, "-d 01:80:c2:00:00:00 ")".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:2894:
unchecked_value: No check of the return value of "virAsprintf(&protostr, "-p 0x%04x ", l3_protocols[protoidx].attr)".
/libvirt/src/nwfilter/nwfilter_ebiptables_driver.c:3590:
var_deref_op: Dereferencing null variable "inst".
In order to track a block copy job across libvirtd restarts, we
need to save internal XML that tracks the name of the file
holding the mirror. Displaying this name in dumpxml might also
be useful to the user, even if we don't yet have a way to (re-)
start a domain with mirroring enabled up front. This is done
with a new <mirror> sub-element to <disk>, as in:
<disk type='file' device='disk'>
<driver name='qemu' type='raw'/>
<source file='/var/lib/libvirt/images/original.img'/>
<mirror file='/var/lib/libvirt/images/copy.img' format='qcow2' ready='yes'/>
...
</disk>
For now, the element is output-only, in live domains; it is ignored
when defining a domain or hot-plugging a disk (since those contexts
use VIR_DOMAIN_XML_INACTIVE in parsing). The 'ready' attribute appears
when libvirt knows that the job has changed from the initial pulling
phase over to the mirroring phase, although absence of the attribute
is not a sure indicator of the current phase. If we come up with a way
to make qemu start with mirroring enabled, we can relax the xml
restriction, and allow <mirror> (but not attribute 'ready') on input.
Testing active-only XML meant tweaking the testsuite slightly, but it
was worth it.
* docs/schemas/domaincommon.rng (diskspec): Add diskMirror.
* docs/formatdomain.html.in (elementsDisks): Document it.
* src/conf/domain_conf.h (_virDomainDiskDef): New members.
* src/conf/domain_conf.c (virDomainDiskDefFree): Clean them.
(virDomainDiskDefParseXML): Parse them, but only internally.
(virDomainDiskDefFormat): Output them.
* tests/qemuxml2argvdata/qemuxml2argv-disk-mirror.xml: New test file.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-mirror.xml: Likewise.
* tests/qemuxml2xmltest.c (testInfo): Alter members.
(testCompareXMLToXMLHelper): Allow more test control.
(mymain): Run new test.
I almost copied-and-pasted some redundant () into my new code,
and figured a general cleanup prereq patch would be better instead.
No semantic change.
* src/conf/domain_conf.c (virDomainLeaseDefParseXML)
(virDomainDiskDefParseXML, virDomainFSDefParseXML)
(virDomainActualNetDefParseXML, virDomainNetDefParseXML)
(virDomainGraphicsDefParseXML, virDomainVideoAccelDefParseXML)
(virDomainVideoDefParseXML, virDomainHostdevFind)
(virDomainControllerInsertPreAlloced, virDomainDefParseXML)
(virDomainObjParseXML, virDomainCpuSetFormat)
(virDomainCpuSetParse, virDomainDiskDefFormat)
(virDomainActualNetDefFormat, virDomainNetDefFormat)
(virDomainTimerDefFormat, virDomainGraphicsListenDefFormat)
(virDomainDefFormatInternal, virDomainNetGetActualHostdev)
(virDomainNetGetActualBandwidth, virDomainGraphicsGetListen):
Reduce extra ().
https://bugzilla.redhat.com/show_bug.cgi?id=617711 reported that
even with my recent patched to allow <memory unit='G'>1</memory>,
people can still get away with trying <memory>1G</memory> and
silently get <memory unit='KiB'>1</memory> instead. While
virt-xml-validate catches the error, our C parser did not.
Not to mention that it's always fun to fix bugs while reducing
lines of code. :)
* src/conf/domain_conf.c (virDomainParseMemory): Check for parse error.
(virDomainDefParseXML): Avoid strtoll.
* src/conf/storage_conf.c (virStorageDefParsePerms): Likewise.
* src/util/xml.c (virXPathLongBase, virXPathULongBase)
(virXPathULongLong, virXPathLongLong): Likewise.
Fix the support for trusted DHCP server in the ebtables code's
hard-coded function applying DHCP only filtering rules:
Rather than using a char * use the more flexible
virNWFilterVarValuePtr that contains the trusted DHCP server(s)
IP address. Process all entries.
Since all callers so far provided NULL as parameter, no changes
are necessary in any other code.
The below patch fixes the following memory leak.
==20624== 24 bytes in 2 blocks are definitely lost in loss record 532 of 1,867
==20624== at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==20624== by 0x38EC27FC01: strdup (strdup.c:43)
==20624== by 0x4EB6BA3: virDomainChrSourceDefCopy (domain_conf.c:1122)
==20624== by 0x495D76: qemuProcessFindCharDevicePTYs (qemu_process.c:1497)
==20624== by 0x498321: qemuProcessWaitForMonitor (qemu_process.c:1258)
==20624== by 0x49B5F9: qemuProcessStart (qemu_process.c:3652)
==20624== by 0x468B5C: qemuDomainObjStart (qemu_driver.c:4753)
==20624== by 0x469171: qemuDomainStartWithFlags (qemu_driver.c:4810)
==20624== by 0x4F21735: virDomainCreate (libvirt.c:8153)
==20624== by 0x4302BF: remoteDispatchDomainCreateHelper (remote_dispatch.h:852)
==20624== by 0x4F72C14: virNetServerProgramDispatch (virnetserverprogram.c:416)
==20624== by 0x4F6D690: virNetServerHandleJob (virnetserver.c:164)
==20624== by 0x4E8F43D: virThreadPoolWorker (threadpool.c:144)
==20624== by 0x4E8EAB5: virThreadHelper (threads-pthread.c:161)
==20624== by 0x38EC606CCA: start_thread (pthread_create.c:301)
==20624== by 0x38EC2E0C2C: clone (clone.S:115)
So that a domain xml which doesn't have "placement" specified, but
"cpuset" is specified, could be parsed. And in this case, the
"placement" mode will be set as "static".
As explained in previous patch, numad will balance the affinity
dynamically, so reflecting the cpuset from numad at the first
time doesn't make much case, and may just could cause confusion.
Although it should be harmless to do:
disk = disk = def->disks[i]
some not-so-wise compilers may fool around.
Besides, such assignment is useless here.
Detected by valgrind. Leaks are introduced in commit b22eaa7.
* src/conf/domain_conf.c (virDomainDiskDefParseXML): fix memory leaks.
How to reproduce?
% make && make -C tests check TESTS=qemuxml2argvtest
% cd tests && valgrind -v --leak-check=full ./qemuxml2argvtest
actual result:
==2143== 12 bytes in 2 blocks are definitely lost in loss record 74 of 179
==2143== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2143== by 0x39D90A67DD: xmlStrndup (xmlstring.c:45)
==2143== by 0x4F5EC0: virDomainDiskDefParseXML (domain_conf.c:3438)
==2143== by 0x502F00: virDomainDefParseXML (domain_conf.c:8304)
==2143== by 0x505FE3: virDomainDefParseNode (domain_conf.c:9080)
==2143== by 0x5069AE: virDomainDefParse (domain_conf.c:9030)
==2143== by 0x41CBF4: testCompareXMLToArgvHelper (qemuxml2argvtest.c:105)
==2143== by 0x41E5DD: virtTestRun (testutils.c:145)
==2143== by 0x416FA3: mymain (qemuxml2argvtest.c:399)
==2143== by 0x41DCB7: virtTestMain (testutils.c:700)
==2143== by 0x39CF01ECDC: (below main) (libc-start.c:226)
Signed-off-by: Alex Jia <ajia@redhat.com>
Since Xen 3.1 the clock=variable semantic is supported. In addition to
qemu/kvm Xen also knows about a variant where the offset is relative to
'localtime' instead of 'utc'.
Extends the libvirt structure with a flag 'basis' to specify, if the
offset is relative to 'localtime' or 'utc'.
Extends the libvirt structure with a flag 'reset' to force the reset
behaviour of 'localtime' and 'utc'; this is needed for backward
compatibility with previous versions of libvirt, since they report
incorrect XML.
Adapt the only user 'qemu' to the new name.
Extend the RelaxNG schema accordingly.
Document the new 'basis' attribute in the HTML documentation.
Adapt test for the new attribute.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Commit 1b1402b introduced a regression. Since older libvirt versions
would silently round memory up (until the previous patch), but populated
current memory based on querying the guest, it was possible to have
dumpxml show cur > max by the amount of the rounding. For example, if
a user requested 1048570 KiB memory (just shy of 1GiB), the qemu
driver would actually run with 1048576 KiB, and libvirt 0.9.10 would
output a current that was 6KiB larger than the maximum. Situations
where this could have an impact include, but are not limited to,
migration from old to new libvirt, managedsave in old libvirt and
start in new libvirt, snapshot creation in old libvirt and revert in
new libvirt - without this patch, the new libvirt would reject the
VM because of the rounding discrepancy.
Fix things by adding a fuzz factor, and silently clamp current down to
maximum in that case, rather than failing to reparse XML for an existing
VM. From a practical standpoint, this has no user impact: 'virsh
dumpxml' will continue to query the running guest rather than rely on
the incoming xml, which will see the currect current value, and even if
clamping down occurs during parsing, it will be by at most the fuzz
factor of a megabyte alignment, and rounded back up when passed back to
the hypervisor.
Meanwhile, we continue to reject cur > max if the difference is beyond
the fuzz factor of nearest megabyte. But this is not a real change in
behavior, since with 0.9.10, even though the parser allowed it, later
in the processing stream we would reject it at the qemu layer; so
rejecting it in the parser just moves error detection to a nicer place.
* src/conf/domain_conf.c (virDomainDefParseXML): Don't reject
existing XML.
Based on a report by Zhou Peng.
Regression introduced when we changed types in commit 3e2c3d8f6.
We've done this sort of cleanup before (see commit c685993d7).
* src/conf/storage_conf.c (virStoragePoolDefFormat)
(virStorageVolTargetDefFormat): Cast gid_t and uid_t.
* src/conf/domain_conf.c (virDomainChannelDefCheckABIStability): avoid
crashing libvirtd due to derefing a NULL pointer.
For details, please see bug:
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=808371
Signed-off-by: Alex Jia <ajia@redhat.com>
libvirt documentation for channels with type 'spicevmc' says that the
'target' child node has:
"an optional attribute name controls how the guest will have access
to the channel, and defaults to name='com.redhat.spice.0'."
However, this default value is never set in libvirt code base,
there's only a check in qemu_command.c to error out if the name
attribute doesn't have the expected value (if it's set).
This commit sets a default target name for spicevmc channels during
the domain configuration parsing so that the code agrees with the
documentation.
Pass argv to the init binary of LXC, using a new <initarg> element.
* docs/formatdomain.html.in: Document <os> usage for containers
* docs/schemas/domaincommon.rng: Add <initarg> element
* src/conf/domain_conf.c, src/conf/domain_conf.h: parsing and
formatting of <initarg>
* src/lxc/lxc_container.c: Setup LXC argv
* tests/Makefile.am, tests/lxcxml2xmldata/lxc-systemd.xml,
tests/lxcxml2xmltest.c, tests/testutilslxc.c,
tests/testutilslxc.h: Test parsing/formatting of LXC related
XML parts
Return statements with parameter enclosed in parentheses were modified
and parentheses were removed. The whole change was scripted, here is how:
List of files was obtained using this command:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' | \
grep -e '\.[ch]$' -e '\.py$'
Found files were modified with this command:
sed -i -e \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
Then checked for nonsense.
The whole command looks like this:
git grep -l -e '\<return\s*([^()]*\(([^()]*)[^()]*\)*)\s*;' | \
grep -e '\.[ch]$' -e '\.py$' | xargs sed -i -e \
's_^\(.*\<return\)\s*(\(\([^()]*([^()]*)[^()]*\)*\))\s*\(;.*$\)_\1 \2\4_' \
-e 's_^\(.*\<return\)\s*(\([^()]*\))\s*\(;.*$\)_\1 \2\3_'
This introduces a new domain state pmsuspended to represent
the domain which has been suspended by guest power management,
e.g. (entered itno s3 state). Because a "running" state could
be confused in this case, one will see the guest is paused
actually while playing. And state "paused" is for the domain
which was paused by virDomainSuspend.
This patch introduces a new event type for the QMP event
SUSPEND:
VIR_DOMAIN_EVENT_ID_PMSUSPEND
The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:
typedef void
(*virConnectDomainEventSuspendCallback)(virConnectPtr conn,
virDomainPtr dom,
int reason,
void *opaque);
"reason" is unused currently, always passes "0".
This patch introduces a new event type for the QMP event
WAKEUP:
VIR_DOMAIN_EVENT_ID_PMWAKEUP
The event doesn't take any data, but considering there might
be reason for wakeup in future, the callback definition is:
typedef void
(*virConnectDomainEventWakeupCallback)(virConnectPtr conn,
virDomainPtr dom,
int reason,
void *opaque);
"reason" is unused currently, always passes "0".
This patch introduces a new event type for the QMP event
DEVICE_TRAY_MOVED, which occurs when the tray of a removable
disk is moved (i.e opened or closed):
VIR_DOMAIN_EVENT_ID_TRAY_CHANGE
The event's data includes the device alias and the reason
for tray status' changing, which indicates why the tray
status was changed. Thus the callback definition for the event
is:
enum {
VIR_DOMAIN_EVENT_TRAY_CHANGE_OPEN = 0,
VIR_DOMAIN_EVENT_TRAY_CHANGE_CLOSE,
\#ifdef VIR_ENUM_SENTINELS
VIR_DOMAIN_EVENT_TRAY_CHANGE_LAST
\#endif
} virDomainEventTrayChangeReason;
typedef void
(*virConnectDomainEventTrayChangeCallback)(virConnectPtr conn,
virDomainPtr dom,
const char *devAlias,
int reason,
void *opaque);
A few times libvirt users manually setting mac addresses have
complained of a networking failure that ends up being due to a multicast
mac address being used for a guest interface. This patch prevents that
by logging an error and failing if a multicast mac address is
encountered in each of the three following cases:
1) domain xml <interface> mac address.
2) network xml bridge mac address.
3) network xml dhcp/host mac address.
There are several other places where a mac address can be input that
aren't controlled in this manner because failure to do so has no
consequences (e.g., if the address will be used to search through
existing interfaces for a match).
The RNG has been updated to add multiMacAddr and uniMacAddr along with
the existing macAddr, and macAddr was switched to uniMacAddr where
appropriate.
If an error was encountered parsing a dhcp host entry mac address or
name, parsing would continue and log a less descriptive error that
might make it more difficult to notice the true nature of the problem.
This patch returns immediately on logging the first error.
If no <interface> elements are included in an LXC guest XML
description, then the LXC guest will just see the host's
network interfaces. It is desirable to be able to hide the
host interfaces, without having to define any guest interfaces.
This patch introduces a new feature flag <privnet/> to allow
forcing of a private network namespace for LXC. In the future
I also anticipate that we will add <privuser/> to force a
private user ID namespace.
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add support
for <privnet/> feature. Auto-set <privnet> if any <interface>
devices are defined
* src/lxc/lxc_container.c: Honour request for private network
namespace
numad is an user-level daemon that monitors NUMA topology and
processes resource consumption to facilitate good NUMA resource
alignment of applications/virtual machines to improve performance
and minimize cost of remote memory latencies. It provides a
pre-placement advisory interface, so significant processes can
be pre-bound to nodes with sufficient available resources.
More details: http://fedoraproject.org/wiki/Features/numad
"numad -w ncpus:memory_amount" is the advisory interface numad
provides currently.
This patch add the support by introducing a new XML attribute
for <vcpu>. e.g.
<vcpu placement="auto">4</vcpu>
<vcpu placement="static" cpuset="1-10^6">4</vcpu>
The returned advisory nodeset from numad will be printed
in domain's dumped XML. e.g.
<vcpu placement="auto" cpuset="1-10^6">4</vcpu>
If placement is "auto", the number of vcpus and the current
memory amount specified in domain XML will be used for numad
command line (numad uses MB for memory amount):
numad -w $num_of_vcpus:$current_memory_amount / 1024
The advisory nodeset returned from numad will be used to set
domain process CPU affinity then. (e.g. qemuProcessInitCpuAffinity).
If the user specifies both CPU affinity policy (e.g.
(<vcpu cpuset="1-10,^7,^8">4</vcpu>) and placement == "auto"
the specified CPU affinity will be overridden.
Only QEMU/KVM drivers support it now.
See docs update in patch for more details.
Even though we say in documentation setting (tls-)port to -1 is legacy
compat style for enabling autoport, we're roughly doing this for VNC.
However, in case of SPICE auto enable autoport iff both port & tlsPort
are equal -1 as documentation says autoport plays with both.
When host-model and host-passthrouh CPU modes were introduced, qemu
driver was properly modify to update guest CPU definition during
migration so that we use the right CPU at the destination. However,
similar treatment is needed for (managed)save and snapshots since they
need to save the exact CPU so that a domain can be properly restored.
To avoid repetition of such situation, all places that need live XML
share the code which generates it.
As a side effect, this patch fixes error reporting from
qemuDomainSnapshotWriteMetadata().
virNetworkDNSHostsDefParseXML was calling VIR_ALLOC(def->hosts) if
def->hosts was NULL. This is a waste of time, though, since
VIR_REALLOC_N is called a few lines further down, prior to any use of
def->hosts. (initializing def->nhosts to 0 is also redundant, because
the newly allocated memory will always be cleared to all 0's anyway).
There are several functions in domain_conf.c that remove a device
object from the domain's list of that object type, but don't free the
object or return it to the caller to free. In many cases this isn't a
problem because the caller already had a pointer to the object and
frees it afterward, but in several cases the removed object was just
left floating around with no references to it.
In particular, the function qemuDomainDetachDeviceConfig() calls
functions to locate and remove net (virDomainNetRemoveByMac), disk
(virDomainDiskRemoveByName()), and lease (virDomainLeaseRemove())
devices, but neither it nor its caller qemuDomainModifyDeviceConfig()
ever obtain a pointer to the device being removed, much less free it.
This patch modifies the following "remove" functions to return a
pointer to the device object being removed from the domain device
arrays, to give the caller the option of freeing the device object
using that pointer if needed. In places where the object was
previously leaked, it is now freed:
virDomainDiskRemove
virDomainDiskRemoveByName
virDomainNetRemove
virDomainNetRemoveByMac
virDomainHostdevRemove
virDomainLeaseRemove
virDomainLeaseRemoveAt
The functions that had been leaking:
libxlDomainDetachConfig - leaked a virDomainDiskDef
qemuDomainDetachDeviceConfig - could leak a virDomainDiskDef,
a virDomainNetDef, or a
virDomainLeaseDef
qemuDomainDetachLease - leaked a virDomainLeaseDef
Some members are generated during XML parse (e.g. MAC address of
an interface); However, with current implementation, if we
are plugging a device both to persistent and live config,
we parse given XML twice: first time for live, second for config.
This is wrong then as the second time we are not guaranteed
to generate same values as we did for the first time.
To prevent that we need to create a copy of DeviceDefPtr;
This is done through format/parse process instead of writing
functions for deep copy as it is easier to maintain:
adding new field to any virDomain*DefPtr doesn't require change
of copying function.
Output is still in kibibytes, but input can now be in different
scales for ease of typing.
* src/conf/domain_conf.c (virDomainParseMemory): New helper.
(virDomainDefParseXML): Use it when parsing.
* docs/schemas/domaincommon.rng: Expand XML; rename memoryKBElement
to memoryElement and update callers.
* docs/formatdomain.html.in (elementsMemoryAllocation): Document
scaling.
* tests/qemuxml2argvdata/qemuxml2argv-memtune.xml: Adjust test.
* tests/qemuxml2xmltest.c: Likewise.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-memtune.xml: New file.
Using 'unsigned long' for memory values is risky on 32-bit platforms,
as a PAE guest can have more than 4GiB memory. Our API is
(unfortunately) locked at 'unsigned long' and a scale of 1024, but
the rest of our system should consistently use 64-bit values,
especially since the previous patch centralized overflow checking.
* src/conf/domain_conf.h (_virDomainDef): Always use 64-bit values
for memory. Change hugepage_backed to a bool.
* src/conf/domain_conf.c (virDomainDefParseXML)
(virDomainDefCheckABIStability, virDomainDefFormatInternal): Fix
clients.
* src/vmx/vmx.c (virVMXFormatConfig): Likewise.
* src/xenxs/xen_sxpr.c (xenParseSxpr, xenFormatSxpr): Likewise.
* src/xenxs/xen_xm.c (xenXMConfigGetULongLong): New function.
(xenXMConfigGetULong, xenXMConfigSetInt): Avoid truncation.
(xenParseXM, xenFormatXM): Fix clients.
* src/phyp/phyp_driver.c (phypBuildLpar): Likewise.
* src/openvz/openvz_driver.c (openvzDomainSetMemoryInternal):
Likewise.
* src/vbox/vbox_tmpl.c (vboxDomainDefineXML): Likewise.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Likewise.
* src/qemu/qemu_process.c (qemuProcessStart): Likewise.
* src/qemu/qemu_monitor.h (qemuMonitorGetBalloonInfo): Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONGetBalloonInfo):
Likewise.
* src/qemu/qemu_driver.c (qemudDomainGetInfo)
(qemuDomainGetXMLDesc): Likewise.
* src/uml/uml_conf.c (umlBuildCommandLine): Likewise.
The test domain allows <memory>0</memory>, but the RNG was stating
that memory had to be at least 4096000 bytes. Hypervisors should
enforce their own limits, rather than complicating the RNG.
Meanwhile, some copy and paste had introduced some fishy constructs
in various unit tests.
* docs/schemas/domaincommon.rng (memoryKB, memoryKBElement): Drop
limit that isn't enforced in code.
* src/conf/domain_conf.c (virDomainDefParseXML): Require current
<= maximum.
* tests/qemuxml2argvdata/*.xml: Fix offenders.
Disk manufacturers are fond of quoting sizes in powers of 10,
rather than powers of 2 (after all, 2.1 GB sounds larger than
2.0 GiB, even though the exact opposite is true). So, we might
as well follow coreutils' lead in supporting three types of
suffix: single letter ${u} (which we already had) and ${u}iB
for the power of 2, and ${u}B for power of 10.
Additionally, it is impossible to create a file with more than
2**63 bytes, since off_t is signed (if you have enough storage
to even create one 8EiB file, I'm jealous). This now reports
failure up front rather than down the road when the kernel
finally refuses an impossible size.
* docs/schemas/basictypes.rng (unit): Add suffixes.
* src/conf/storage_conf.c (virStorageSize): Use new function.
* docs/formatstorage.html.in: Document it.
* tests/storagevolxml2xmlin/vol-file-backing.xml: Test it.
* tests/storagevolxml2xmlin/vol-file.xml: Likewise.
Make it obvious to 'dumpxml' readers what unit we are using,
since our default of KiB for memory (1024) differs from qemu's
default of MiB; and differs from our use of bytes for storage.
Tests were updated via:
$ find tests/*data tests/*out -name '*.xml' | \
xargs sed -i 's/<\(memory\|currentMemory\|hard_limit\|soft_limit\|min_guarantee\|swap_hard_limit\)>/<\1 unit='"'KiB'>/"
$ find tests/*data tests/*out -name '*.xml' | \
xargs sed -i 's/<\(capacity\|allocation\|available\)>/<\1 unit='"'bytes'>/"
followed by a few fixes for the stragglers.
Note that with this patch, the RNG for <memory> still forbids
validation of anything except unit='KiB', since the code silently
ignores the attribute; a later patch will expand <memory> to allow
scaled input in the code and update the RNG to match.
* docs/schemas/basictypes.rng (unit): Add 'bytes'.
(scaledInteger): New define.
* docs/schemas/storagevol.rng (sizing): Use it.
* docs/schemas/storagepool.rng (sizing): Likewise.
* docs/schemas/domaincommon.rng (memoryKBElement): New define; use
for memory elements.
* src/conf/storage_conf.c (virStoragePoolDefFormat)
(virStorageVolDefFormat): Likewise.
* src/conf/domain_conf.h (_virDomainDef): Document unit used
internally.
* src/conf/storage_conf.h (_virStoragePoolDef, _virStorageVolDef):
Likewise.
* tests/*data/*.xml: Update all tests.
* tests/*out/*.xml: Likewise.
* tests/define-dev-segfault: Likewise.
* tests/openvzutilstest.c (testReadNetworkConf): Likewise.
* tests/qemuargv2xmltest.c (blankProblemElements): Likewise.
This patch makes sure that each network device ("interface") of
type='hostdev' appears on both the hostdevs list and the nets list of
the virDomainDef, and it modifies the qemu driver startup code so that
these devices will be presented to qemu on the commandline as hostdevs
rather than as network devices.
It does not add support for hotplug of these type of devices, or code
to honor the <mac address> or <virtualport> given in the config (both
of those will be done in separate patches).
Once each device is placed on both lists, much of what this patch does
is modify places in the code that traverse all the device lists so
that these hybrid devices are only acted on once - either along with
the other hostdevs, or along with the other network interfaces. (In
many cases, only one of the lists is traversed / a specific operation
is performed on only one type of device. In those instances, the code
can remain unchanged.)
There is one special case - when building the commandline, interfaces
are allowed to proceed all the way through
networkAllocateActualDevice() before deciding to skip the rest of
netdev-specific processing - this is so that (once we have support for
networks with pools of hostdev devices) we can get the actual device
allocated, then rely on the loop processing all hostdevs to generate
the correct commandline.
(NB: <interface type='hostdev'> is only supported for PCI network
devices that are SR-IOV Virtual Functions (VF). Standard PCI[e] and
USB devices, and even the Physical Functions (PF) of SR-IOV devices
can only be assigned to a guest using the more basic <hostdev> device
entry. This limitation is mostly due to the fact that non-SR-IOV
ethernet devices tend to lose mac address configuration whenever the
card is reset, which happens when a card is assigned to a guest;
SR-IOV VFs fortunately don't suffer the same problem.)
This is the new interface type that sets up an SR-IOV PCI network
device to be assigned to the guest with PCI passthrough after
initializing some network device-specific things from the config
(e.g. MAC address, virtualport profile parameters). Here is an example
of the syntax:
<interface type='hostdev' managed='yes'>
<source>
<address type='pci' domain='0' bus='0' slot='4' function='3'/>
</source>
<mac address='00:11:22:33:44:55'/>
<address type='pci' domain='0' bus='0' slot='7' function='0'/>
</interface>
This would assign the PCI card from bus 0 slot 4 function 3 on the
host, to bus 0 slot 7 function 0 on the guest, but would first set the
MAC address of the card to 00:11:22:33:44:55.
NB: The parser and formatter don't care if the PCI card being
specified is a standard single function network adapter, or a virtual
function (VF) of an SR-IOV capable network adapter, but the upcoming
code that implements the back end of this config will work *only* with
SR-IOV VFs. This is because modifying the mac address of a standard
network adapter prior to assigning it to a guest is pointless - part
of the device reset that occurs during that process will reset the MAC
address to the value programmed into the card's firmware.
Although it's not supported by any of libvirt's hypervisor drivers,
usb network hostdevs are also supported in the parser and formatter
for completeness and consistency. <source> syntax is identical to that
for plain <hostdev> devices, except that the <address> element should
have "type='usb'" added if bus/device are specified:
<interface type='hostdev'>
<source>
<address type='usb' bus='0' device='4'/>
</source>
<mac address='00:11:22:33:44:55'/>
</interface>
If the vendor/product form of usb specification is used, type='usb'
is implied:
<interface type='hostdev'>
<source>
<vendor id='0x0012'/>
<product id='0x24dd'/>
</source>
<mac address='00:11:22:33:44:55'/>
</interface>
Again, the upcoming patch to fill in the backend of this functionality
will log an error and fail with "Unsupported Config" if you actually
try to assign a USB network adapter to a guest using <interface
type='hostdev'> - just use a standard <hostdev> entry in that case
(and also for single-port PCI adapters).
Three new functions useful in other files:
virDomainHostdevInsert:
Add a new hostdev at the end of the array. This would more sensibly be
called virDomainHostdevAppend, but the existing functions for other
types of devices are called Insert.
virDomainHostdevRemove:
Eliminates one entry from the hostdevs array, but doesn't free it;
patterned after the code at the end of the two
qemuDomainDetachHostXXXDevice functions (and also other pre-existing
virDomainXXXRemove functions for other device types).
virDomainHostdevFind:
This function is patterned from the search loops at the top of
qemuDomainDetachHostPciDevice and qemuDomainDetachHostUsbDevice, and
will be used to re-factor those (and other detach-related) functions.
To shorten some new code that accesses the many fields within the
subsys struct of a hostdev, create a separate toplevel, typedefed
virDomainHostdevSubsys struct so that we can define temporary pointers
to the subsys part.
The parent can be any type of device. It defaults to type=none, and a
NULL pointer. The intent is that if a hostdevdef is contained in the
def for a higher level device (e.g. virDomainNetDef), hostdev->parent
will point to the higher level device, and type will be set to that
type of device. This way, during attach and detach of the device,
parent can be checked, and appropriate callouts made to do higher
level device initialization (e.g. setting MAC address).
Also, although these hostdevs with parents will be added to a domain's
hostdevs list, they will be treated slightly differently when
traversing the list, e.g. virDomainHostdefDefFree for a hostdev that
has a parent doesn't need to be called (and will be a NOP); it will
simply be removed from the list (since the parent device object is in
its own type-specific list, and will be freed from there).
In an upcoming patch, virDomainNetDef will acquire a
virDomainHostdevDef, and the <interface> XML will take on some of the
elements of a <hostdev>. To avoid duplicating the code for parsing and
formatting the <source> element (which will be nearly identical in
these two cases), this patch factors those parts out of the
HostdevDef's parse and format functions, and puts them into separate
helper functions that are now called by the HostdevDef
parser/formatter, and will soon be called by the NetDef
parser/formatter.
One change in behavior - previously virDomainHostdevDefParseXML() had
diverged from current common coding practice by logging an error and
failing if it found any subelements of <hostdev> other than those it
understood (standard libvirt practice is to ignore/discard unknown
elements and attributes during parse). The new helper function ignores
unknown elements, and thus so does the new
virDomainHostdevDefParseXML.
In order to allow for a virDomainHostdevDef that uses the
virDomainDeviceInfo of a "higher level" device (such as a
virDomainNetDef), this patch changes the virDomainDeviceInfo in the
HostdevDef into a virDomainDeviceInfoPtr. Rather than adding checks
all over the code to check for a null info, we just guarantee that it
is always valid. The new function virDomainHostdevDefAlloc() allocates
a virDomainDeviceInfo and plugs it in, and virDomainHostdevDefFree()
makes sure it is freed.
There were 4 places allocating virDomainHostdevDefs, all of them
parsers of one sort or another, and those have all had their
VIR_ALLOC(hostdev) changed to virDomainHostdevDefAlloc(). Other than
that, and the new functions, all the rest of the changes are just
mechanical removals of "&" or changing "." to "->".
There will be cases where the iterator callback will need to know the
type of the device whose info is being operated on, and possibly even
need to use some of the device's config. This patch adds a
virDomainDeviceDefPtr to the args of every callback, and fills it in
appropriately as the devices are iterated through.
This patch is only code movement + adding some forward definitions of
typedefs.
virDomainHostdevDef (not just a pointer to it, but an actual object)
will be needed in virDomainNetDef and virDomainActualNetDef, so it
must be relocated earlier in the file.
Likewise, virDomainDeviceDef will be needed in virDomainHostdevDef, so
it must be moved up even earlier. This, in turn, creates a forward
reference problem, but fortunately only with pointers to other device
types, so their typedefs can be moved up in the file, eliminating the
problem.
Not all device types were represented in virDomainDeviceType, so some
types of devices couldn't be represented in a virDomainDeviceDef
(which requires a different type of pointer in the union for each
different kind of device).
Since serial, parallel, channel, and console devices are all
virDomainChrDef, and the virDomainDeviceType is never used to produce
a string from the type (and only used in the other direction
internally to code, never to produce XML), I only added one "CHR"
type, which is associated with "virDomainChrDefPtr chr" in the union.
No thanks to 64-bit windows, with 64-bit pid_t, we have to avoid
constructs like 'int pid'. Our API in libvirt-qemu cannot be
changed without breaking ABI; but then again, libvirt-qemu can
only be used on systems that support UNIX sockets, which rules
out Windows (even if qemu could be compiled there) - so for all
points on the call chain that interact with this API decision,
we require a different variable name to make it clear that we
audited the use for safety.
Adding a syntax-check rule only solves half the battle; anywhere
that uses printf on a pid_t still needs to be converted, but that
will be a separate patch.
* cfg.mk (sc_correct_id_types): New syntax check.
* src/libvirt-qemu.c (virDomainQemuAttach): Document why we didn't
use pid_t for pid, and validate for overflow.
* include/libvirt/libvirt-qemu.h (virDomainQemuAttach): Tweak name
for syntax check.
* src/vmware/vmware_conf.c (vmwareExtractPid): Likewise.
* src/driver.h (virDrvDomainQemuAttach): Likewise.
* tools/virsh.c (cmdQemuAttach): Likewise.
* src/remote/qemu_protocol.x (qemu_domain_attach_args): Likewise.
* src/qemu_protocol-structs (qemu_domain_attach_args): Likewise.
* src/util/cgroup.c (virCgroupPidCode, virCgroupKillInternal):
Likewise.
* src/qemu/qemu_command.c(qemuParseProcFileStrings): Likewise.
(qemuParseCommandLinePid): Use pid_t for pid.
* daemon/libvirtd.c (daemonForkIntoBackground): Likewise.
* src/conf/domain_conf.h (_virDomainObj): Likewise.
* src/probes.d (rpc_socket_new): Likewise.
* src/qemu/qemu_command.h (qemuParseCommandLinePid): Likewise.
* src/qemu/qemu_driver.c (qemudGetProcessInfo, qemuDomainAttach):
Likewise.
* src/qemu/qemu_process.c (qemuProcessAttach): Likewise.
* src/qemu/qemu_process.h (qemuProcessAttach): Likewise.
* src/uml/uml_driver.c (umlGetProcessInfo): Likewise.
* src/util/virnetdev.h (virNetDevSetNamespace): Likewise.
* src/util/virnetdev.c (virNetDevSetNamespace): Likewise.
* tests/testutils.c (virtTestCaptureProgramOutput): Likewise.
* src/conf/storage_conf.h (_virStoragePerms): Use mode_t, uid_t,
and gid_t rather than int.
* src/security/security_dac.c (virSecurityDACSetOwnership): Likewise.
* src/conf/storage_conf.c (virStorageDefParsePerms): Avoid
compiler warning.
* src/conf/domain_conf.h: Add new member "target" to struct
_virDomainDeviceDriveAddress.
* src/conf/domain_conf.c: Parse and format "target"
* Lots of tests (.xml) in tests/domainsnapshotxml2xmlout,
tests/qemuxml2argvdata, tests/qemuxml2xmloutdata, and
tests/vmx2xmldata/ are modified for newly introduced
attribute "target" for address of "drive" type.
KVM will be able to use a PCI SCSI controller even on POWER. Let
the user specify the vSCSI controller by other means than a default.
After this patch, the QEMU driver will actually look at the model
and reject anything but auto, lsilogic and ibmvscsi.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Osier Yang <jyang@redhat.com>
This patch adds a set of functions used in creating console streams for
domains using PTYs and ensures mutually exclusive access to the PTYs.
If mutually exclusive access is not used, two clients may open the same
console, which results in corruption on both clients as both of them
race to read data from the PTY.
Two approaches are used to ensure this:
1) Internal data structure holding open PTYs.
This is used internally and enables the user to forcibly
terminate another console connection eg. when somebody leaves
the console open on another host.
2) UUCP style lock files:
This uses UUCP lock files according to the FHS
( http://www.pathname.com/fhs/pub/fhs-2.3.html#VARLOCKLOCKFILES )
to check if other programs (like minicom) are not using the pty
device of the console.
This feature is disabled by default and may be enabled using
configure parameter
--with-console-lock-files=/path/to/lock/file/directory
or --with-console-lock-files=auto (which tries to infer the
location from OS used (currently only linux).
On usual linux systems, normal users may not write to the
/var/lock directory containing the locks. This poses problems
while in session mode. If the current user has no access to the
lockfile directory, check for presence of the file is still
done, but no lock file is created. This does NOT result in an
error.
Previously we would have:
"os type 'hvm' & arch 'idontexist' combination is not supported"
Now we get
"No guest options available for arch 'idontexist'"
or if options available but guest OS type not applicable:
"No os type 'xen' available for arch 'x86_64'"
Bug introduced in commit 35abced. On an inactive domain,
$ virsh snapshot-create-as dom snap
$ virsh snapshot-create dom
$ virsh snapshot-create dom
$ virsh snapshot-delete --children dom snap
could crash libvirtd, due to a use-after-free that results
when the callback freed the current element in the iteration.
* src/conf/domain_conf.c (virDomainSnapshotForEachChild)
(virDomainSnapshotActOnDescendant): Allow iteration to delete
current child.
This patch allows libvirt to add interfaces to already
existing Open vSwitch bridges. The following syntax in
domain XML file can be used:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'/>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
or if libvirt should auto-generate the interfaceid use
following syntax:
<interface type='bridge'>
<mac address='52:54:00:d0:3f:f2'/>
<source bridge='ovsbr'/>
<virtualport type='openvswitch'>
</virtualport>
<address type='pci' domain='0x0000' bus='0x00'
slot='0x03' function='0x0'/>
</interface>
It is also possible to pass an optional profileid. To do that
use following syntax:
<interface type='bridge'>
<source bridge='ovsbr'/>
<mac address='00:55:1a:65:a2:8d'/>
<virtualport type='openvswitch'>
<parameters interfaceid='921a80cd-e6de-5a2e-db9c-ab27f15a6e1d'
profileid='test-profile'/>
</virtualport>
</interface>
To create Open vSwitch bridge install Open vSwitch and
run the following command:
ovs-vsctl add-br ovsbr
The auto-generated WWN comply with the new addressing schema of WWN:
<quote>
the first nibble is either hex 5 or 6 followed by a 3-byte vendor
identifier and 36 bits for a vendor-specified serial number.
</quote>
We choose hex 5 for the first nibble. And for the 3-bytes vendor ID,
we uses the OUI according to underlying hypervisor type, (invoking
virConnectGetType to get the virt type). e.g. If virConnectGetType
returns "QEMU", we use Qumranet's OUI (00:1A:4A), if returns
ESX|VMWARE, we use VMWARE's OUI (00:05:69). Currently it only
supports qemu|xen|libxl|xenapi|hyperv|esx|vmware drivers. The last
36 bits are auto-generated.
Some audit records generated by libvirt contain fields enclosed by single
quotes. Since those fields are inside the msg field, which is enclosed by
single quotes, these records generated by libvirt are not correctly parsed by
libauparse.
Some tools, such as virt-manager, prefers having the default USB
controller explicit in the XML document. This patch makes sure there
is one. With this patch, it is now possible to switch from USB1 to
USB2 from the release 0.9.1 of virt-manager.
Fix tests to pass with this change.
Security label type 'none' requires relabel to be set to 'no' so there's
no reason to output this extra attribute. Moreover, since relabel is
internally stored in a negative from (norelabel), the default value for
relabel would be 'yes' in case there is no <seclabel> element in domain
configuration. In case VIR_DOMAIN_SECLABEL_DEFAULT turns into
VIR_DOMAIN_SECLABEL_NONE, we would incorrectly output relabel='yes' for
seclabel type 'none'.
Commit b170eb99 introduced a bug: domains that had an explicit
<seclabel type='none'/> when started would not be reparsed if
libvirtd restarted. It turns out that our testsuite was not
exercising this because it never tried anything but inactive
parsing. Additionally, the live XML for such a domain failed
to re-validate. Applying just the tests/ portion of this patch
will expose the bugs that are fixed by the other two files.
* docs/schemas/domaincommon.rng (seclabel): Allow relabel under
type='none'.
* src/conf/domain_conf.c (virSecurityLabelDefParseXML): Per RNG,
presence of <seclabel> with no type implies dynamic. Don't
require sub-elements for type='none'.
* tests/qemuxml2xmltest.c (mymain): Add test.
* tests/qemuxml2argvtest.c (mymain): Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-none.xml: Add file.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-none.args: Add file.
Reported by Ansis Atteka.
This eliminates the warning message reported in:
https://bugzilla.redhat.com/show_bug.cgi?id=624447
It was caused by a failure to open an image file that is not
accessible by root (the uid libvirtd is running as) because it's on a
root-squash NFS share, owned by a different user, with permissions of
660 (or maybe 600).
The solution is to use virFileOpenAs() rather than open(). The
codepath that generates the error is during qemuSetupDiskCGroup(), but
the actual open() is in a lower-level generic function called from
many places (virDomainDiskDefForeachPath), so some other pieces of the
code were touched just to add dummy (or possibly useful) uid and gid
arguments.
Eliminating this warning message has the nice side effect that the
requested operation may even succeed (which in this case isn't
necessary, but shouldn't hurt anything either).
Our HACKING discourages use of malloc and free, for at least
a couple of years now. But we weren't enforcing it, until now :)
For now, I've exempted python and tests, and will clean those up
in subsequent patches. Examples should be permanently exempt,
since anyone copying our examples won't have use of our
internal-only memory.h via libvirt_util.la.
* cfg.mk (sc_prohibit_raw_allocation): New rule.
(exclude_file_name_regexp--sc_prohibit_raw_allocation): and
exemptions.
* src/cpu/cpu.c (cpuDataFree): Avoid false positive.
* src/conf/network_conf.c (virNetworkDNSSrvDefParseXML): Fix
offenders.
* src/libxl/libxl_conf.c (libxlMakeDomBuildInfo, libxlMakeVfb)
(libxlMakeDeviceModelInfo): Likewise.
* src/rpc/virnetmessage.c (virNetMessageSaveError): Likewise.
* tools/virsh.c (_vshMalloc, _vshCalloc): Likewise.
Detected by valgrind. Leak is introduced in commit 397e6a7.
* src/conf/domain_conf.c(virDomainDiskDefParseXML): fix memory leak.
How to reproduce?
% make -C tests check TESTS=qemuxml2argvtest
% cd tests && valgrind -v --leak-check=full ./qemuxml2argvtest
* Actual result:
==16352== 4 bytes in 1 blocks are definitely lost in loss record 12 of 147
==16352== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==16352== by 0x39D90A67DD: xmlStrndup (xmlstring.c:45)
==16352== by 0x4E83D5: virDomainDiskDefParseXML (domain_conf.c:2894)
==16352== by 0x4F542D: virDomainDefParseXML (domain_conf.c:7626)
==16352== by 0x4F8683: virDomainDefParseNode (domain_conf.c:8390)
==16352== by 0x4F904E: virDomainDefParse (domain_conf.c:8340)
==16352== by 0x41C626: testCompareXMLToArgvHelper (qemuxml2argvtest.c:105)
==16352== by 0x41DED1: virtTestRun (testutils.c:142)
==16352== by 0x418172: mymain (qemuxml2argvtest.c:486)
==16352== by 0x41D5C7: virtTestMain (testutils.c:697)
==16352== by 0x39CF01ECDC: (below main) (in /lib64/libc-2.12.so)
Signed-off-by: Alex Jia <ajia@redhat.com>
Curently security labels can be of type 'dynamic' or 'static'.
If no security label is given, then 'dynamic' is assumed. The
current code takes advantage of this default, and avoids even
saving <seclabel> elements with type='dynamic' to disk. This
means if you temporarily change security driver, the guests
can all still start.
With the introduction of sVirt to LXC though, there needs to be
a new default of 'none' to allow unconfined LXC containers.
This patch introduces two new security label types
- default: the host configuration decides whether to run the
guest with type 'none' or 'dynamic' at guest start
- none: the guest will run unconfined by security policy
The 'none' label type will obviously be undesirable for some
deployments, so a new qemu.conf option allows a host admin to
mandate confined guests. It is also possible to turn off default
confinement
security_default_confined = 1|0 (default == 1)
security_require_confined = 1|0 (default == 0)
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add new
seclabel types
* src/security/security_manager.c, src/security/security_manager.h:
Set default sec label types
* src/security/security_selinux.c: Handle 'none' seclabel type
* src/qemu/qemu.conf, src/qemu/qemu_conf.c, src/qemu/qemu_conf.h,
src/qemu/libvirtd_qemu.aug: New security config options
* src/qemu/qemu_driver.c: Tell security driver about default
config
This re-introduces parsing & formatting for per device seclabels.
There is a new virDomainDeviceSeclabelPtr struct and corresponding
APIs for parsing/formatting.
Revert parsing changes:
commit 302fe95ffa
Author: Eric Blake <eblake@redhat.com>
Date: Wed Jan 4 16:01:24 2012 -0700
seclabel: fix regression in libvirtd restart
commit b43432931a
Author: Eric Blake <eblake@redhat.com>
Date: Thu Dec 22 17:47:50 2011 -0700
seclabel: allow a seclabel override on a disk src
These two commits changed the sec label parsing code so that
the same code dealt with both the VM level sec label, and the
per device label. Unfortunately, as we add more options to the
VM level sec label, the logic required to use the same parsing
code for the per device label becomes unintelligible.
* src/conf/domain_conf.c: Remove support for parsing per
device sec labels
This patch adds a new element <title> to the domain XML. This attribute
can hold a short title defined by the user to ease the identification of
domains. The title may not contain newlines and should be reasonably short.
*docs/formatdomain.html.in
*docs/schemas/domaincommon.rng
- add schema grammar for the new element and documentation
*src/conf/domain_conf.c
*src/conf/domain_conf.h
- add field to hold the new attribute
- add code to parse and create XML with the new attribute
This patch adds a new attribute "rawio" to the "disk" element
of domain XML. Valid values of "rawio" attribute are "yes"
and "no".
rawio='yes' indicates the disk is desirous of CAP_SYS_RAWIO.
If you specify the following XML:
<disk type='block' device='lun' rawio='yes'>
...
</disk>
the domain will be granted CAP_SYS_RAWIO.
(of course, the domain have to be executed with root privilege)
NOTE:
- "rawio" attribute is only valid when device='lun'
- At the moment, any other disks you won't use rawio can use rawio.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
This patch addresses: https://bugzilla.redhat.com/show_bug.cgi?id=781562
Along with the "rombar" option that controls whether or not a boot rom
is made visible to the guest, qemu also has a "romfile" option that
allows specifying a binary file to present as the ROM BIOS of any
emulated or passthrough PCI device. This patch adds support for
specifying romfile to both passthrough PCI devices, and emulated
network devices that attach to the guest's PCI bus (just about
everything other than ne2k_isa).
One example of the usefulness of this option is described in the
bugzilla report: 82576 sriov network adapters don't provide a ROM BIOS
for the cards virtual functions (VF), but an image of such a ROM is
available, and with this ROM visible to the guest, it can PXE boot.
In libvirt's xml, the new option is configured like this:
<hostdev>
...
<rom file='/etc/fake/boot.bin'/>
...
</hostdev
(similarly for <interface>).
When support for the rombar option was added, it was only added for
PCI passthrough devices, configured with <hostdev>. The same option is
available for any network device that is attached to the guest's PCI
bus. This patch allows setting rombar for any PCI network device type.
After adding cases to test this to qemuxml2argv-hostdev-pci-rombar.*,
I decided to rename those files (to qemuxml2argv-pci-rom.*) to more
accurately reflect the additional tests, and also noticed that up to
now we've only been performing a domainschematest for that case, so I
added the "pci-rom" test to both qemuxml2argv and qemuxml2xml (and in
the process found some bugs whose fixes I squashed into previous
commits of this series).
Since these two items are now in the virDomainDeviceInfo struct, it
makes sense to parse/format them in the functions written to
parse/format that structure. Not all types of devices allow them, so
two internal flags are added to indicate when it is appropriate to do
so.
I was lucky - only one test case needed to be re-ordered!
To help consolidate the commonality between virDomainHostdevDef and
virDomainNetDef into as few members as possible (and because I
think it makes sense), this patch moves the rombar and bootIndex
members into the "info" member that is common to both (and to all the
other structs that use them).
It's a bit problematic that this gives rombar and bootIndex to many
device types that don't use them, but this is already the case for the
master and mastertype members of virDomainDeviceInfo, and is properly
commented as such in the definition.
Note that this opens the door to supporting rombar for other devices
that are attached to the guest PCI bus - virtio-blk-pci,
virtio-net-pci, various other network adapters - which which have that
capability in qemu, but previously had no support in libvirt.
Add kvmclock timer to documentation, schema and parsers. Keep the
platform timer first since it is kind of special, and alphabetize
the others when possible (i.e. when it does not change the ABI).
Reviewed-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Compare two filters' XML for equality and only rebuild/instantiate the new
filter if the new and current filters are found to be different. This
improves performance during an update of a filter with no obvious change
or the reloading of filters during a 'kill -SIGHUP'
Introduce a function that rebuilds all running VMs' filters. Call
this function when reloading the nwfilter driver.
This addresses a problem introduced by the 2nd patch that typically
causes no filters to be reinstantiate anymore upon driver reload
since their XML has not changed. Yet the current behavior is that
upon a SIGHUP all filters get reinstantiated.
Added a new field "vm-pid" to the VIRT_CONTROL audit record. This information
is useful to correlated another audit events to the events generated by
libvirt.
In preparation for the patch to include Murmurhash3, which
introduces a virhashcode.h and virhashcode.c files, rename
the existing hash.h and hash.c to virhash.h and virhash.c
respectively.
It's better to group all the metadata together. This is a
cosmetic output change; since the RNG allows interleave, it
doesn't matter where the user stuck it on input, and an XPath
query will find the same information when parsing the output.
* src/conf/domain_conf.c (virDomainDefFormatInternal): Output
metadata earlier.
* docs/formatdomain.html.in: Update documentation.
* tests/domainsnapshotxml2xmlout/metadata.xml: Update test.
* tests/qemuxml2xmloutdata/qemuxml2xmlout-metadata.xml: Likewise.
Applications can now insert custom nodes and hierarchies into domain
configuration XML. Although currently not enforced, applications are
required to use their own namespaces on every custom node they insert,
with only one top-level element per namespace.
When converting a linear enum to a string, we have checks in
place in the VIR_ENUM_IMPL macro to ensure that there is one
string for every value, which lets us quickly flag if a user
added a value but forgot to add a counterpart string. However,
this only works if we use the _LAST marker.
* cfg.mk (sc_require_enum_last_marker): New syntax check.
* src/conf/domain_conf.h (virDomainSnapshotState): Add new marker.
* src/conf/domain_conf.c (virDomainSnapshotState): Fix offender.
* src/qemu/qemu_monitor_json.c (qemuMonitorWatchdogAction)
(qemuMonitorIOErrorAction, qemuMonitorGraphicsAddressFamily):
Likewise.
* src/util/virtypedparam.c (virTypedParameter): Likewise.
This introduces new attribute wrpolicy with only supported
value as immediate. This will be an optional
attribute with no defaults. This helps specify whether
to skip the host page cache.
When wrpolicy is specified, meaning when wrpolicy=immediate
a writeback is explicitly initiated for the dirty pages in
the host page cache as part of the guest file write operation.
Usage:
<filesystem type='mount' accessmode='passthrough'>
<driver type='path' wrpolicy='immediate'/>
<source dir='/export/to/guest'/>
<target dir='mount_tag'/>
</filesystem>
Currently this only works with type='mount' for the QEMU/KVM driver.
Signed-off-by: Deepak C Shetty <deepakcs@linux.vnet.ibm.com>
There are several reasons for doing this:
- the CPU specification is out of libvirt's control so we cannot
guarantee stable guest ABI
- not every feature of a CPU may actually work as expected when
advertised directly to a guest
- migration between two machines with exactly the same CPU may work but
no guarantees can be made
- this mode is not supported and its use is at one's own risk
VIR_DOMAIN_XML_UPDATE_CPU flag for virDomainGetXMLDesc may be used to
get updated custom mode guest CPU definition in case it depends on host
CPU. This patch implements the same behavior for host-model and
host-passthrough CPU modes.
The mode can be either of "custom" (default), "host-model",
"host-passthrough". The semantics of each mode is described in the
following examples:
- guest CPU is a default model with specified topology:
<cpu>
<topology sockets='1' cores='2' threads='1'/>
</cpu>
- guest CPU matches selected model:
<cpu mode='custom' match='exact'>
<model>core2duo</model>
</cpu>
- guest CPU should be a copy of host CPU as advertised by capabilities
XML (this is a short cut for manually copying host CPU specification
from capabilities to domain XML):
<cpu mode='host-model'/>
In case a hypervisor does not support the exact host model, libvirt
automatically falls back to a closest supported CPU model and
removes/adds features to match host. This behavior can be disabled by
<cpu mode='host-model'>
<model fallback='forbid'/>
</cpu>
- the same as previous returned by virDomainGetXMLDesc with
VIR_DOMAIN_XML_UPDATE_CPU flag:
<cpu mode='host-model' match='exact'>
<model fallback='allow'>Penryn</model> --+
<vendor>Intel</vendor> |
<topology sockets='2' cores='4' threads='1'/> + copied from
<feature policy='require' name='dca'/> | capabilities XML
<feature policy='require' name='xtpr'/> |
... --+
</cpu>
- guest CPU should be exactly the same as host CPU even in the aspects
libvirt doesn't model (such domain cannot be migrated unless both
hosts contain exactly the same CPUs):
<cpu mode='host-passthrough'/>
- the same as previous returned by virDomainGetXMLDesc with
VIR_DOMAIN_XML_UPDATE_CPU flag:
<cpu mode='host-passthrough' match='minimal'>
<model>Penryn</model> --+ copied from caps
<vendor>Intel</vendor> | XML but doesn't
<topology sockets='2' cores='4' threads='1'/> | describe all
<feature policy='require' name='dca'/> | aspects of the
<feature policy='require' name='xtpr'/> | actual guest CPU
... --+
</cpu>
In case a hypervisor doesn't support the exact CPU model requested by a
domain XML, we automatically fallback to a closest CPU model the
hypervisor supports (and make sure we add/remove any additional features
if needed). This patch adds 'fallback' attribute to model element, which
can be used to disable this automatic fallback.
There are three address validation routines that do nothing:
virDomainDeviceDriveAddressIsValid()
virDomainDeviceUSBAddressIsValid()
virDomainDeviceVirtioSerialAddressIsValid()
Remove them, and replace their call sites with "1" which is what they
currently return. In some cases this means we can remove an entire
if block.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
KVM will be able to use a PCI SCSI controller even on POWER. Let
the user specify the vSCSI controller by other means than a default.
After this patch, the QEMU driver will actually look at the model
and reject anything but auto, lsilogic and ibmvscsi.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit d09f6ba5fe introduced a regression in event
registration. virDomainEventCallbackListAddID() will only return a positive
integer if the type of event being registered is VIR_DOMAIN_EVENT_ID_LIFECYCLE.
For other event types, 0 is always returned on success. This has the
unfortunate side effect of not enabling remote event callbacks because
remoteDomainEventRegisterAny() uses the return value from the local call to
determine if an event callback needs to be registered on the remote end.
Make sure virDomainEventCallbackListAddID() returns the callback count for the
eventID being registered.
Signed-off-by: Adam Litke <agl@us.ibm.com>
The new introduced optional attribute "copy_on_read</code> controls
whether to copy read backing file into the image file. The value can
be either "on" or "off". Copy-on-read avoids accessing the same backing
file sectors repeatedly and is useful when the backing file is over a
slow network. By default copy-on-read is off.
Earlier, when the number of vcpus was greater than the topology allowed,
libvirt didn't raise an error and continued, resulting in running qemu
with parameters making no sense. Even though qemu did not report any
error itself, the number of vcpus was set to maximum allowed by the
topology.
For some weird reason, i686-pc-mingw32-gcc version 4.6.1 at -O2 complained:
../../src/conf/nwfilter_params.c: In function 'virNWFilterVarCombIterCreate':
../../src/conf/nwfilter_params.c:346:23: error: 'minValue' may be used uninitialized in this function [-Werror=uninitialized]
../../src/conf/nwfilter_params.c:319:28: note: 'minValue' was declared here
../../src/conf/nwfilter_params.c:344:23: error: 'maxValue' may be used uninitialized in this function [-Werror=uninitialized]
../../src/conf/nwfilter_params.c:319:18: note: 'maxValue' was declared here
cc1: all warnings being treated as errors
even though all paths of the preceding switch statement either
assign the variables or return.
* src/conf/nwfilter_params.c (virNWFilterVarCombIterAddVariable):
Initialize variables.
Address side effect of accessing a variable via an index: Filters
accessing a variable where an element is accessed that is beyond the
size of the list (for example $TEST[10] and only 2 elements are available)
cannot instantiate that filter. Test for this and report proper error
to user.
This patch adds access to single elements of variables via index. Example:
<rule action='accept' direction='in' priority='500'>
<tcp srcipaddr='$ADDR[1]' srcportstart='$B[2]'/>
</rule>
This patch introduces the capability to use a different iterator per
variable.
The currently supported notation of variables in a filtering rule like
<rule action='accept' direction='out'>
<tcp srcipaddr='$A' srcportstart='$B'/>
</rule>
processes the two lists 'A' and 'B' in parallel. This means that A and B
must have the same number of 'N' elements and that 'N' rules will be
instantiated (assuming all tuples from A and B are unique).
In this patch we now introduce the assignment of variables to different
iterators. Therefore a rule like
<rule action='accept' direction='out'>
<tcp srcipaddr='$A[@1]' srcportstart='$B[@2]'/>
</rule>
will now create every combination of elements in A with elements in B since
A has been assigned to an iterator with Id '1' and B has been assigned to an
iterator with Id '2', thus processing their value independently.
The first rule has an equivalent notation of
<rule action='accept' direction='out'>
<tcp srcipaddr='$A[@0]' srcportstart='$B[@0]'/>
</rule>
In this patch we introduce testing whether the iterator points to a
unique set of entries that have not been seen before at one of the previous
iterations. The point is to eliminate duplicates and with that unnecessary
filtering rules by preventing identical filtering rules from being
instantiated.
Example with two lists:
list1 = [1,2,1]
list2 = [1,3,1]
The 1st iteration would take the 1st items of each list -> 1,1
The 2nd iteration would take the 2nd items of each list -> 2,3
The 3rd iteration would take the 3rd items of each list -> 1,1 but
skip them since this same pair has already been encountered in the 1st
iteration
Implementation-wise this is solved by taking the n-th element of list1 and
comparing it against elements 1..n-1. If no equivalent is found, then there
is no possibility of this being a duplicate. In case an equivalent element
is found at position i, then the n-th element in the 2nd list is compared
against the i-th element in the 2nd list and if that is not the same, then
this is a unique pair, otherwise it is not unique and we may need to do
the same comparison on the 3rd list.
In the past, generic SCSI commands issued from a guest to a virtio
disk were always passed through to the underlying disk by qemu, and
the kernel would also pass them on.
As a result of CVE-2011-4127 (see:
http://seclists.org/oss-sec/2011/q4/536), qemu now honors its
scsi=on|off device option for virtio-blk-pci (which enables/disables
passthrough of generic SCSI commands), and the kernel will only allow
the commands for physical devices (not for partitions or logical
volumes). The default behavior of qemu is still to allow sending
generic SCSI commands to physical disks that are presented to a guest
as virtio-blk-pci devices, but libvirt prefers to disable those
commands in the standard virtio block devices, enabling it only when
specifically requested (hopefully indicating that the requester
understands what they're asking for). For this purpose, a new libvirt
disk device type (device='lun') has been created.
device='lun' is identical to the default device='disk', except that:
1) It is only allowed if bus='virtio', type='block', and the qemu
version is "new enough" to support it ("new enough" == qemu 0.11 or
better), otherwise the domain will fail to start and a
CONFIG_UNSUPPORTED error will be logged).
2) The option "scsi=on" will be added to the -device arg to allow
SG_IO commands (if device !='lun', "scsi=off" will be added to the
-device arg so that SG_IO commands are specifically forbidden).
Guests which continue to use disk device='disk' (the default) will no
longer be able to use SG_IO commands on the disk; those that have
their disk device changed to device='lun' will still be able to use SG_IO
commands.
*docs/formatdomain.html.in - document the new device attribute value.
*docs/schemas/domaincommon.rng - allow it in the RNG
*tests/* - update the args of several existing tests to add scsi=off, and
add one new test that will test scsi=on.
*src/conf/domain_conf.c - update domain XML parser and formatter
*src/qemu/qemu_(command|driver|hotplug).c - treat
VIR_DOMAIN_DISK_DEVICE_LUN *almost* identically to
VIR_DOMAIN_DISK_DEVICE_DISK, except as indicated above.
Note that no support for this new device value was added to any
hypervisor drivers other than qemu, because it's unclear what it might
mean (if anything) to those drivers.
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=638633
Although scripts are not used by interfaces of type other than
"ethernet" in qemu, due to the fact that the parser stores the script
name in a union that is only valid when type is ethernet or bridge,
there is no way for anyone except the parser itself to catch the
problem of specifying an interface script for an inappropriate
interface type (by the time the parsed data gets back to the code that
called the parser, all evidence that a script was specified is
forgotten).
Since the parser itself should be agnostic to which type of interface
allows scripts (an example of why: a script specified for an interface
of type bridge is valid for xen domains, but not for qemu domains),
the solution here is to move the script out of the union(s) in the
DomainNetDef, always populate it when specified (regardless of
interface type), and let the driver decide whether or not it is
appropriate.
Currently the qemu, xen, libxml, and uml drivers recognize the script
parameter and do something with it (the uml driver only to report that
it isn't supported). Those drivers have been updated to log a
CONFIG_UNSUPPORTED error when a script is specified for an interface
type that's inappropriate for that particular hypervisor.
(NB: There was earlier discussion of solving this problem by adding a
VALIDATE flag to all libvirt APIs that accept XML, which would cause
the XML to be validated against the RNG files. One statement during
that discussion was that the RNG shouldn't contain hypervisor-specific
things, though, and a proper solution to this problem would require
that (again, because a script for an interface of type "bridge" is
accepted by xen, but not by qemu).
Commit b434329 has a logic bug: seclabel overrides don't set
def->type, but the default value is 0 (aka static). Restarting
libvirtd would thus reject the XML for any domain with an
override of <seclabel relabel='no'/> (which happens quite
easily if a disk image lives on NFS), with a message:
2012-01-04 22:29:40.949+0000: 6769: error : virSecurityLabelDefParseXMLHelper:2593 : XML error: security label is missing
Fix the logic to never read from an override's def->type, and
to allow a missing <label> subelement when relabel is no. There's
a lot of stupid double-negatives in the code (!norelabel) because
of the way that we want the zero-initialized defaults to behave.
* src/conf/domain_conf.c (virSecurityLabelDefParseXMLHelper): Use
type field from correct location.