1257 Commits

Author SHA1 Message Date
Jiri Denemark
1640fafc6a docs: Generate documentation for virTypedParams* APIs 2013-01-21 18:41:26 +01:00
Jiri Denemark
c72e327456 docs: event.c source file was renamed as virevent.c 2013-01-21 18:40:28 +01:00
Viktor Mihajlovski
8691551070 build: Fix RPM build errors related to libvirt-lxc API
Added missing entries to makefile and spec.

Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2013-01-18 09:48:17 -07:00
Claudio Bley
25feed14db docs: Add some style and color to the HTML documentation
Signed-off-by: Claudio Bley <cbley@av-test.de>
2013-01-18 15:36:38 +01:00
Claudio Bley
cb022b6bc7 docs: don't use <i> and <tt> HTML tags
Use of <tt> is discouraged in HTML 4.x and has finally been obsoleted
in HTML 5. Likewise for the <i> tag.

Using tables for layout is (widely) considered bad style, too.

Use definition lists, definition term and definition description
elements instead.

Signed-off-by: Claudio Bley <cbley@av-test.de>
2013-01-18 15:36:38 +01:00
Claudio Bley
458dd20da9 docs: Assign classes to documentation elements
In CSS the following class names are available:

* keyword     (keywords like "typedef", "struct")
* type        (types like "int", "void*")
* comment     (comments after members of enums or structs)
* directive   (preprocessor directives, #define)
* undisclosed (text saying that the API is not public)

Additionally, kill all of the left-over "programlisting" class
assignments. There are no CSS rules for them.

Signed-off-by: Claudio Bley <cbley@av-test.de>
2013-01-18 15:36:38 +01:00
Daniel P. Berrange
3d1596b048 Introduce an LXC specific public API & library
This patch introduces support for LXC specific public APIs. In
common with what was done for QEMU, this creates a libvirt_lxc.so
library and libvirt/libvirt-lxc.h header file.

The actual APIs are

  int virDomainLxcOpenNamespace(virDomainPtr domain,
                                int **fdlist,
                                unsigned int flags);

  int virDomainLxcEnterNamespace(virDomainPtr domain,
                                 unsigned int nfdlist,
                                 int *fdlist,
                                 unsigned int *noldfdlist,
                                 int **oldfdlist,
                                 unsigned int flags);

which provide a way to use the setns() system call to move the
calling process into the container's namespace. It is not
practical to write this in a generically applicable manner. The
nearest that we could get to such an API would be one which
allows passing a command + argv to be executed inside a
container. Even if we had such a generic API, this LXC specific
API is still useful, because it allows the caller to maintain
the current process context, in particular any I/O streams they
have open.

NB the virDomainLxcEnterNamespace() API is special in that it
runs client side, so does not involve the internal driver API.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-14 13:58:34 +00:00
Daniel P. Berrange
6f736c83e5 Convert HAVE_NUMACTL to WITH_NUMACTL
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-01-14 13:25:06 +00:00
Claudio Bley
bf1786b6d5 docs: restrict the set of characters for info keys
When parsing the top level comment of a file, apibuild.py used
to split on any ':' character of a line regarding the first part
as a key for a setting, e.g. "Summary". The second part would then
be assigned as the value for that key.

This means you could not use a ':' character inside those comments
without ill effects.

Now, a key must consist solely of alphanumeric characters, '_' or '.'.
2013-01-14 09:18:43 +01:00
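The key restriction described in the `docs: restrict the set of characters for info keys` entry above can be illustrated with a minimal sketch. The pattern and the function name `parse_info_line` are hypothetical and are not taken from apibuild.py; the only rule borrowed from the commit message is that a key consists solely of alphanumeric characters, '_' or '.'.

```python
import re

# Hypothetical pattern (not copied from apibuild.py): a key is one or more
# alphanumeric characters, '_' or '.', followed by ':' and the value.
INFO_KEY_RE = re.compile(r"^\s*([A-Za-z0-9_.]+)\s*:\s*(.*)$")


def parse_info_line(line):
    """Return (key, value) if the line looks like 'Key: value', else None."""
    match = INFO_KEY_RE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()
```

With this rule a line such as `Summary: error handling` still yields a key, while a sentence that merely contains a ':' after several words is no longer treated as a setting.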
Claudio Bley
833e1493ed docs: simplify code 2013-01-14 09:18:43 +01:00
Eric Blake
a2acdb3dd2 docs: mention git rename detection
I've noticed a number of people sending patches with file
renames not compressed, so we might as well document how to
set this up.  (Git won't do it by default, for back-compat
reasons)

* docs/hacking.html.in: Add git config tip.
* HACKING: Regenerate.
2013-01-11 10:30:49 -07:00
Eric Blake
ed4bbe6bc4 docs: add some more hacking tips
Based on a suggestion by John Ferlan:
https://www.redhat.com/archives/libvir-list/2013-January/msg00158.html

* docs/hacking.html.in: Add some commit message instructions.
Mention the ./run script.
* HACKING: Regenerate.
2013-01-11 10:30:49 -07:00
Laine Stump
7a4bf34b56 docs: fix typo in isa-serial additions
This was preventing make rpm from completing.
2013-01-10 14:26:11 -05:00
Guannan Ren
29d37818fb network: fix typos and docs 2013-01-10 21:46:22 +08:00
Guannan Ren
e3a04455fa qemu: add usb-serial support
Add an optional 'type' attribute to <target> element of serial port
device. There are two choices for its value, 'isa-serial' and
'usb-serial'. For backward compatibility, when the 'type' attribute is
missing, 'isa-serial' is chosen as before.

Libvirt XML sample

    <serial type='pty'>
      <target type='usb-serial' port='0'/>
      <address type='usb' bus='0' port='1'/>
    </serial>

qemu commandline:

qemu ${other_vm_args}              \
    -chardev pty,id=charserial0    \
    -device usb-serial,chardev=charserial0,id=serial0,bus=usb.0,port=1
2013-01-10 21:29:20 +08:00
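The backward-compatible default from the `qemu: add usb-serial support` entry above can be sketched as follows. This is an illustration using Python's standard XML parser, not libvirt's own parsing code; the function name `serial_target_type` is hypothetical.

```python
import xml.etree.ElementTree as ET


def serial_target_type(serial_xml):
    """Return the <target> type of a <serial> element.

    Falls back to 'isa-serial' when <target> or its 'type' attribute is
    missing, mirroring the default described in the commit message.
    """
    serial = ET.fromstring(serial_xml)
    target = serial.find("target")
    if target is None:
        return "isa-serial"
    return target.get("type", "isa-serial")
```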
Claudio Bley
3b54b2e345 docs: break longer text into paragraphs in HTML
Libvirt's HTML documentation is not as easy on the eyes as it could
be since long text has no visual breaks.

Take advantage of the formatting in documentation comments and wrap
each part separated by two consecutive \n into a HTML <p> element.
2013-01-09 08:01:28 +01:00
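The paragraph wrapping described in the `docs: break longer text into paragraphs in HTML` entry above might look roughly like the sketch below. The function name `wrap_paragraphs` and the HTML escaping step are assumptions for illustration; the actual change in the documentation generator may differ.

```python
from html import escape


def wrap_paragraphs(text):
    """Wrap each chunk separated by two consecutive newlines in <p>...</p>."""
    chunks = [chunk.strip() for chunk in text.split("\n\n")]
    # Empty chunks (e.g. from three or more newlines in a row) are skipped.
    return "\n".join("<p>%s</p>" % escape(chunk) for chunk in chunks if chunk)
```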
Claudio Bley
1e8b4b5810 docs: remove duplicate check in index.add 2013-01-08 11:45:47 +01:00
J.B. Joret
d760255d01 S390: Add SCLP console front end support
The SCLP console is the native console type for s390 and is preferred
over the virtio console as it doesn't require special drivers and
is more efficient. Recent versions of QEMU come with SCLP support
which is hereby enabled.

The new target types 'sclp' and 'sclplm' can be used to specify a
SCLP console. Adding documentation, domain schema and XML processing
support.

Signed-off-by: J.B. Joret <jb@linux.vnet.ibm.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
2013-01-08 11:37:52 +01:00
Osier Yang
b9c57e7b0d docs: Add docs and rng schema for new XML tag sgio
This introduces a new XML tag "sgio" for disks. Its valid values
are "filtered" and "unfiltered": "filtered" sets the disk's
unpriv_sgio to 0, and "unfiltered" sets it to 1, which allows
unprivileged SG_IO commands.
2013-01-07 21:37:24 +08:00
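The mapping stated in the `sgio` entry above is small enough to write down directly. The names below are hypothetical; only the two values and their unpriv_sgio settings come from the commit message.

```python
# "filtered" -> unpriv_sgio 0, "unfiltered" -> unpriv_sgio 1 (per the message).
SGIO_TO_UNPRIV = {"filtered": 0, "unfiltered": 1}


def unpriv_sgio_value(sgio):
    """Translate the XML 'sgio' value into the unpriv_sgio setting."""
    if sgio not in SGIO_TO_UNPRIV:
        raise ValueError("unknown sgio value: %r" % (sgio,))
    return SGIO_TO_UNPRIV[sgio]
```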
Daniel P. Berrange
f24404a324 Rename virterror.c virterror_internal.h to virerror.{c,h} 2012-12-21 11:19:50 +00:00
Jiri Denemark
4ed80c76c5 docs: Fix documentation for readonly element 2012-12-18 14:09:01 +01:00
Daniel P. Berrange
aae0fc2a92 Add support for <hostdev mode="capabilities">
The <hostdev> device type has long had a redundant "mode"
attribute, which has always been "subsys". This finally
introduces a new mode "capabilities", which will be used
by the LXC driver for device assignment. Since container
based virtualization uses a single kernel, the idea of
assigning physical PCI devices doesn't make sense. It is
still reasonable to assign USB devices, but for assigning
arbitrary nodes in /dev, the new 'capabilities' mode is
to be used.

The first capability support is 'storage', which is for
assignment of block devices. Functionally this is really
pretty similar to the <disk> support. The only difference
is the device node name is identical in both host and
container namespaces.

    <hostdev mode='capabilities' type='storage'>
      <source>
        <block>/dev/sdf1</block>
      </source>
    </hostdev>

The second capability support is 'misc', which is for
assignment of character devices. There is no existing
parallel to this. Again the device node is the same
inside & outside the container.

    <hostdev mode='capabilities' type='misc'>
      <source>
        <char>/dev/input/event3</char>
      </source>
    </hostdev>

The reason for keeping the char & storage devices
separate in the domain XML, is to mirror the split
in the node device XML. NB the node device XML does
not yet report character devices, but that's another
new patch to come

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-12-17 17:50:50 +00:00
Peter Krempa
c17b16d1be docs: Replace </br> with <br/> in docs/news.html.in 2012-12-17 11:02:23 +01:00
Guannan Ren
09938bb3b0 conf: add optional attribute primary to video <model> element
If there are multiple video devices,
primary = 'yes' marks this video device as the primary one.
The rest are secondary video devices. No more than one can be
marked as primary. If none of them has the primary attribute, the first
one will be the primary by default, as before.
The reason for this change is that qemu permits only one primary video
device, which can be of any type. For secondary video
devices, only qxl is allowed. The primary attribute removes the restriction
that the first device has to be the primary one.

We always put the primary video device into the first position of
video device structure array after parsing.
2012-12-17 14:01:20 +08:00
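The ordering rule from the `primary` video attribute entry above (at most one primary, and the primary is moved to the first position after parsing) can be sketched like this. Representing each video device as a dict is an assumption made for illustration; libvirt itself works on a C structure array.

```python
def order_video_devices(videos):
    """Return the devices with the primary one first.

    'videos' is a list of dicts; a device is primary if it has
    primary == 'yes'. With no explicit primary, the order is unchanged,
    so the first device acts as primary by default.
    """
    primaries = [i for i, video in enumerate(videos)
                 if video.get("primary") == "yes"]
    if len(primaries) > 1:
        raise ValueError("only one video device can be marked as primary")
    if not primaries:
        return list(videos)
    idx = primaries[0]
    return [videos[idx]] + videos[:idx] + videos[idx + 1:]
```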
Daniel Veillard
34ca568497 Release of libvirt-1.0.1
- configure.ac docs/news.html.in: update for the release
- po/*.po: updated from transifex
2012-12-17 11:36:37 +08:00
Eric Blake
9821f8f6cf docs: fix some typos in examples
As detected in https://bugzilla.redhat.com/show_bug.cgi?id=887187

* docs/formatdomain.html.in: Fix XML typos.
2012-12-14 08:28:57 -07:00
Michał Łomnicki
c86f53d5b2 docs: Fix location of libvirt.conf and auth.conf
For an unprivileged user libvirt.conf and auth.conf are looked up in
$XDG_CONFIG_HOME but the docs incorrectly state that it's $XDG_CONFIG_DIR.
2012-12-14 13:35:03 +01:00
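As an illustration of the lookup described in the `docs: Fix location of libvirt.conf and auth.conf` entry above, the sketch below resolves a per-user config path from $XDG_CONFIG_HOME. Two details are assumptions not stated in the commit message: the `libvirt` subdirectory, and the `~/.config` fallback that the XDG Base Directory specification defines for an unset variable.

```python
import os


def user_config_path(filename, environ=None):
    """Build the per-user path for a libvirt config file.

    Assumes the file lives in a 'libvirt' subdirectory of $XDG_CONFIG_HOME,
    with ~/.config used when the variable is unset or empty.
    """
    env = os.environ if environ is None else environ
    base = env.get("XDG_CONFIG_HOME") or os.path.join(
        os.path.expanduser("~"), ".config")
    return os.path.join(base, "libvirt", filename)
```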
Jiri Denemark
748f6dd0c3 docs: Document offline migration 2012-12-11 20:46:31 +01:00
Michal Privoznik
ec6474b245 bandwidth: add new 'floor' attribute
This is however supported only on domain interfaces with
type='network'. Moreover, target network needs to have at least
inbound QoS set. This is required by hierarchical traffic shaping.

From now on, the required attribute for <inbound/> is either 'average'
(old) or 'floor' (new). This new attribute can currently be used only for
interfaces of type network (<interface type='network'/>).
2012-12-11 18:35:12 +01:00
Gene Czarcinski
2d5cd1d724 network: add support for DHCPv6
The DHCPv6 support includes IPV6 dhcp-range and dhcp-host for one
IPv6 subnetwork on one interface.  This support will only work
if dnsmasq version >= 2.64; otherwise an error occurs if
dhcp-range or dhcp-host is specified for an IPv6 address.

Essentially, this change provides the same DHCP support for IPv6
that has been available for IPv4.

With dnsmasq >= 2.64, support for the RA service is also now provided
by dnsmasq (radvd is no longer used/started). (Although at least one
version of dnsmasq prior to 2.64 "supported" IPv6 Router
Advertisement, there were bugs (fixed in 2.64) that rendered it
unusable.)

Documentation and the network schema have been updated
to reflect the new support.
2012-12-11 05:49:45 -05:00
Osier Yang
b718ded39a qemu: Allow the user to specify vendor and product for disk
QEMU supports setting vendor and product strings for disk since
1.2.0 (only scsi-disk, scsi-hd, scsi-cd support it), this patch
exposes it with new XML elements <vendor> and <product> of disk
device.
2012-12-07 16:53:27 +08:00
Jim Fehlig
dfa1e1dd53 Convert libxl driver to Xen 4.2
Based on a patch originally authored by Daniel De Graaf

  http://lists.xen.org/archives/html/xen-devel/2012-05/msg00565.html

This patch converts the Xen libxl driver to support only Xen >= 4.2.
Support for Xen 4.1 libxl is dropped since that version of libxl is
designated 'technology preview' only and is incompatible with Xen 4.2
libxl.  Additionally, the default toolstack in Xen 4.1 is still xend,
for which libvirt has a stable, functional driver.
2012-12-06 16:15:54 -07:00
Gene Czarcinski
705e67d40b network: allow guest to guest IPv6 without gateway definition
This patch adds the capability for virtual guests to do IPv6
communication via a virtual network interface with no IPv6 (gateway)
addresses specified.  This capability has always been enabled by
default for IPv4, but disabled for IPv6 for security concerns, and
because it requires the ip6tables command to be operational (which
isn't the case on a system with the ipv6 module completely disabled).

This patch adds a new attribute "ipv6" at the toplevel of a <network>
object.  If ipv6='yes', the extra ip6tables rules required to permit
inter-guest communications are added when the network is started. If
it is 'no', or not present, those rules will not be added; thus the
default behavior doesn't change, so there should be no compatibility
issues with any existing installations.

Note that virtual guests cannot communicate with the virtualization
host via this interface, because the following kernel tunable has
been set:

   net.ipv6.conf.<bridge_interface_name>.disable_ipv6 = 1

This assures that the bridge interface will not have an IPv6
link-local (fe80::) address.

To control this behavior so that it is not enabled by default, the parameter
ipv6='yes' on the <network> statement has been added.

Documentation related to this patch has been updated.
The network schema has also been updated.
2012-12-05 14:58:32 -05:00
Harsh Prateek Bora
a2d2b80fbd Add Gluster protocol as supported network disk backend
This patch introduces the RNG schema and updates necessary data structures
to allow various hypervisors to make use of the Gluster protocol as one of the
supported network disk backends. The next patch will add support for using
this feature in Qemu, since it now supports the Gluster protocol as one of its
network based storage backends.

Two new optional attributes for <host> element are introduced - 'transport'
and 'socket'. Valid transport values are tcp, unix or rdma. If none specified,
tcp is assumed. If transport is unix, socket specifies path to unix socket.

This patch allows users to specify disks on gluster backends like this:

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='gluster' name='Volume1/image'>
        <host name='example.org' port='6000' transport='tcp'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='gluster' name='Volume2/image'>
        <host transport='unix' socket='/path/to/sock'/>
      </source>
      <target dev='vdb' bus='virtio'/>
    </disk>

Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
2012-11-27 10:19:22 +01:00
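The <host> attribute rules from the Gluster entry above (transport is tcp, unix or rdma, and tcp is assumed when none is given) can be sketched as below. The function name `parse_host` and the dict it returns are hypothetical; this is not libvirt's parser.

```python
import xml.etree.ElementTree as ET

VALID_TRANSPORTS = ("tcp", "unix", "rdma")


def parse_host(host_xml):
    """Parse a <host> element, defaulting 'transport' to 'tcp'."""
    host = ET.fromstring(host_xml)
    transport = host.get("transport", "tcp")
    if transport not in VALID_TRANSPORTS:
        raise ValueError("unsupported transport: %r" % (transport,))
    return {
        "name": host.get("name"),
        "port": host.get("port"),
        "transport": transport,
        # Only meaningful when transport is 'unix'.
        "socket": host.get("socket"),
    }
```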
Ján Tomko
e628dbfbef docs: Fix a few spaces
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2012-11-21 18:22:37 +01:00
Ján Tomko
08c1435f05 docs: boot order for host and redirected USB devices 2012-11-21 18:21:51 +01:00
Ján Tomko
a4c19459aa qemu: add bootindex for usb-host and usb-redir devices
Allow bootindex to be specified for redirected USB devices and host USB
devices.

Bug: https://bugzilla.redhat.com/show_bug.cgi?id=805414
2012-11-14 19:03:18 -07:00
Eric Blake
4201a7ea1c snapshot: new XML for external system checkpoint
Each <domainsnapshot> can now contain an optional <memory>
element that describes how the VM state was handled, similar
to disk snapshots.  The new element will always appear in
output; for back-compat, an input that lacks the element will
assume 'no' or 'internal' according to the domain state.

Along with this change, it is now possible to pass <disks> in
the XML for an offline snapshot; this also needs to be wired up
in a future patch, to make it possible to choose internal vs.
external on a per-disk basis for each disk in an offline domain.
At that point, using the --disk-only flag for an offline domain
will be able to work.

For some examples below, remember that qemu supports the
following snapshot actions:

qemu-img: offline external and internal disk
savevm: online internal VM and disk
migrate: online external VM
transaction: online external disk

=====
<domainsnapshot>
  <memory snapshot='no'/>
  ...
</domainsnapshot>

implies that there is no VM state saved (mandatory for
offline and disk-only snapshots, not possible otherwise);
using qemu-img for offline domains and transaction for online.

=====
<domainsnapshot>
  <memory snapshot='internal'/>
  ...
</domainsnapshot>

state is saved inside one of the disks (as in qemu's 'savevm'
system checkpoint implementation).  If needed in the future,
we can also add an attribute pointing out _which_ disk saved
the internal state; maybe disk='vda'.

=====
<domainsnapshot>
  <memory snapshot='external' file='/path/to/state'/>
  ...
</domainsnapshot>

This is not wired up yet, but future patches will allow this to
control a combination of 'virsh save /path/to/state' plus disk
snapshots from the same point in time.

=====

So for 1.0.1 (and later, as needed), I plan to implement this table
of combinations, with '*' designating new code and '+' designating
existing code reached through new combinations of xml and/or the
existing DISK_ONLY flag:

domain  memory  disk   disk-only | result
-----------------------------------------
offline omit    omit   any       | memory=no disk=int, via qemu-img
offline no      omit   any       |+memory=no disk=int, via qemu-img
offline omit/no no     any       | invalid combination (nothing to snapshot)
offline omit/no int    any       |+memory=no disk=int, via qemu-img
offline omit/no ext    any       |*memory=no disk=ext, via qemu-img
offline int/ext any    any       | invalid combination (no memory to save)
online  omit    omit   off       | memory=int disk=int, via savevm
online  omit    omit   on        | memory=no disk=default, via transaction
online  omit    no/ext off       | unsupported for now
online  omit    no     on        | invalid combination (nothing to snapshot)
online  omit    ext    on        | memory=no disk=ext, via transaction
online  omit    int    off       |+memory=int disk=int, via savevm
online  omit    int    on        | unsupported for now
online  no      omit   any       |+memory=no disk=default, via transaction
online  no      no     any       | invalid combination (nothing to snapshot)
online  no      int    any       | unsupported for now
online  no      ext    any       |+memory=no disk=ext, via transaction
online  int/ext any    on        | invalid combination (disk-only vs. memory)
online  int     omit   off       |+memory=int disk=int, via savevm
online  int     no/ext off       | unsupported for now
online  int     int    off       |+memory=int disk=int, via savevm
online  ext     omit   off       |*memory=ext disk=default, via migrate+trans
online  ext     no     off       |+memory=ext disk=no, via migrate
online  ext     int    off       | unsupported for now
online  ext     ext    off       |*memory=ext disk=ext, via migrate+transaction

* docs/schemas/domainsnapshot.rng (memory): New RNG element.
* docs/formatsnapshot.html.in: Document it.
* src/conf/snapshot_conf.h (virDomainSnapshotDef): New fields.
* src/conf/domain_conf.c (virDomainSnapshotDefFree)
(virDomainSnapshotDefParseString, virDomainSnapshotDefFormat):
Manage new fields.
* tests/domainsnapshotxml2xmltest.c: New test.
* tests/domainsnapshotxml2xmlin/*.xml: Update existing tests.
* tests/domainsnapshotxml2xmlout/*.xml: Likewise.
2012-11-02 09:56:23 -06:00
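For the snapshot entry above, the rows of the table where <memory> is omitted suggest the following default. This sketch covers only that default ('no' for offline or disk-only snapshots, 'internal' otherwise) and none of the other combinations; the function name is hypothetical.

```python
def default_memory_snapshot(domain_online, disk_only):
    """Pick the implied <memory snapshot='...'> value when it is omitted.

    Based on the table rows with memory 'omit': offline domains and
    disk-only snapshots save no VM state, other online snapshots use
    the internal (savevm) checkpoint.
    """
    if not domain_online or disk_only:
        return "no"
    return "internal"
```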
Daniel P. Berrange
a3e95abeb5 Document bracket whitespace rules & add syntax-check rule
This documents the following whitespace rules

      if(foo)   // Bad
      if (foo)  // Good

      int foo (int wizz)  // Bad
      int foo(int wizz)   // Good

      bar = foo (wizz);  // Bad
      bar = foo(wizz);   // Good

      typedef int (*foo) (int wizz);  // Bad
      typedef int (*foo)(int wizz);   // Good

      int foo( int wizz );  // Bad
      int foo(int wizz);    // Good

There is a syntax-check rule extension to validate all these rules.
Checking for 'function (...args...)' is quite difficult since it
needs to ignore valid usage with keywords like 'if (...test...)'
and while/for/switch. It must also ignore source comments and
quoted strings.

It is not possible to do this with a simple regex in the normal
syntax-check style. So a short Perl script is created instead
to analyse the source. In practice this works well enough. The
only thing it can't cope with is multi-line quoted strings of
the form

 "start of string\
more lines\
more line\
the end"

but this can and should be written as

 "start of string"
 "more lines"
 "more line"
 "the end"

with this simple change, the bracket checking script does not
have any false positives across libvirt source, provided it
is only run against .c files. It is not practical to run it
against .h files, since those use whitespace extensively to
get alignment (though this is somewhat inconsistent and could
arguably be fixed).

The only limitation is that it cannot detect a violation where
the first arg starts with a '*', eg

   foo(*wizz);

since this generates too many false positives on function
typedefs which can't be suppressed efficiently.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2012-11-02 14:00:32 +00:00
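To make the difficulty described in the bracket whitespace entry above concrete, here is a much simplified, single-line sketch of the 'function (...args...)' check. It is written in Python rather than Perl and is not the script added by the commit. Treating `return` as an allowed keyword is an assumption, and the check ignores multi-line comments and the whitespace-inside-brackets rule.

```python
import re

# Keywords that may legitimately be followed by " (". The commit message
# names if/while/for/switch; 'return' is added here as an assumption.
KEYWORDS = {"if", "while", "for", "switch", "return"}

STRING_RE = re.compile(r'"(?:\\.|[^"\\])*"')
COMMENT_RE = re.compile(r"//.*|/\*.*?\*/")
# identifier, whitespace, '(' not followed by '*' (function typedefs and
# first arguments starting with '*' are skipped, as in the stated limitation)
CALL_SPACE_RE = re.compile(r"\b([A-Za-z_]\w*)\s+\((?!\*)")


def has_space_before_paren(line):
    """Return True if the line has 'name (' where 'name' is not a keyword."""
    code = STRING_RE.sub('""', line)   # blank out quoted strings first
    code = COMMENT_RE.sub("", code)    # then drop single-line comments
    return any(match.group(1) not in KEYWORDS
               for match in CALL_SPACE_RE.finditer(code))
```

Like the real script, this sketch cannot flag `foo (*wizz);`, because it deliberately skips an opening bracket followed by '*'.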
Daniel Veillard
2b435c153e Release of libvirt-1.0.0
* configure.ac docs/news.html.in libvirt.spec.in: update for the new release
* po/*.po*: update from transifex, a lot of added support e.g. Indian
  languages, and regenerate
2012-11-02 12:08:11 +08:00
Michal Privoznik
9af1b30da3 sanlock: Introduce 'user' and 'group' conf variables
through which the user sets the permissions under which the sanlock
daemon runs, so that libvirt sets the same permissions on the
files exposed to it.
2012-10-30 10:12:10 +01:00
Philipp Hahn
7083cdc7bd documentation: HTML tag fix
Replace '%' by '&' for correct escaping of '>' in Domain specification.

Signed-off-by: Philipp Hahn <hahn@univention.de>
2012-10-26 09:53:41 -04:00
Matthias Bolte
1e7cd39511 esx: Update version checks for vSphere 5.1
Also remove warnings for upcoming versions. There haven't been any
compatibility problems with new ESX versions over the whole lifetime
of the ESX driver, so I don't expect any in the future.

Update documentation to mention vSphere 5.x support.
2012-10-24 19:50:28 +02:00
Cole Robinson
7146d41634 docs: Fix installation of internals/*.html
We were just installing them in the top level html directory, which
broke navigation and overwrote other pages.

https://bugzilla.redhat.com/show_bug.cgi?id=837825
2012-10-22 16:15:12 -04:00
Cole Robinson
fe772f24a6 daemon: Avoid 'Could not find keytab file' in syslog
On F17 at least, every time libvirtd starts we get this in syslog:

libvirtd: Could not find keytab file: /etc/libvirt/krb5.tab: No such file or directory

This comes from cyrus-sasl, and happens regardless of whether the
gssapi plugin is requested, which is what actually uses
/etc/libvirt/krb5.tab.

While cyrus-sasl shouldn't complain, we can easily make it shut up by
commenting out the keytab value by default.

Also update the keytab comment to the more modern one from qemu's
sasl config file.
2012-10-21 13:21:07 -04:00
Eric Blake
e2c41e4860 storage: match RNG to supported driver types
At one point, the code passed through arbitrary strings for file
formats, which supposedly lets qemu handle a new file type even
before libvirt has been taught to handle it.  However, to properly
label files, libvirt has to learn the file type anyway, so we
might as well make our life easier by only accepting file types
that we are prepared to handle.  This patch lets the RNG validation
ensure that only known strings are let through.

* docs/schemas/domaincommon.rng (driverFormat): Limit to list of
supported strings.
* docs/schemas/domainsnapshot.rng (driver): Likewise.
2012-10-19 17:35:09 -06:00
Peter Krempa
cc922fddc3 conf: Add support for HyperV Enlightenment features
Hypervisors are starting to support HyperV Enlightenment features that
improve behavior of guests running Microsoft Windows operating systems.

This patch adds support for the "relaxed" feature that improves timer
behavior and also establishes a framework to add these features in
future.
2012-10-18 12:22:50 +02:00
Eric Blake
819c8ce043 maint: prepare for next release number
Given Daniel's announcement[1], code targeting the next release will
be in 1.0.0, not 0.10.3.  Changed mechanically with:

for f in $(git grep -l '0\(.\)10\13\b') ; do
   sed -i -e 's/0\(.\)10\13/1\10\10/g' $f
done

[1]https://www.redhat.com/archives/libvir-list/2012-October/msg00403.html

* docs/formatdomain.html.in: Use 1.0.0 for next release.
* src/interface/interface_backend_udev.c: Likewise.
2012-10-16 08:09:01 -06:00
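The sed expression in the `maint: prepare for next release number` entry above captures the separator between the version components and reuses it in the replacement. A Python equivalent, offered as a sketch, is shown below. Note that Python would read `\13` as a reference to group 13, so the backreference is wrapped in a non-capturing group before the literal 3.

```python
import re

# '0', any separator (captured), '10', the same separator again, '3'
OLD_VERSION_RE = re.compile(r"0(.)10(?:\1)3")


def bump_version(text):
    """Rewrite 0<sep>10<sep>3 as 1<sep>0<sep>0, keeping the separator."""
    return OLD_VERSION_RE.sub(r"1\g<1>0\g<1>0", text)
```

Because the separator is captured, the same expression handles both dotted and underscore-separated spellings of the version.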
Osier Yang
f108944ae0 doc: Sort out the relationship between <vcpu>, <vcpupin>, and <emulatorpin>
These 3 elements conflict with each other in either the doc
or the underlying code.

Current problems:

Problem 1:

The doc shouldn't simply say "These settings are superseded
by CPU tuning. " for element <vcpu>, because apart from the tuning, <vcpu>
allows specifying the current and maximum vcpu numbers. In addition,
<vcpu> also allows specifying the placement as "auto", which binds
the domain process to the advisory nodeset from numad.

Problem 2:

Doc for <vcpu> says its "cpuset" specifies the physical CPUs
that the vcpus can be pinned to. But that's not true, as
it actually only pins the domain process to the specified physical
CPUs. So it's either a documentation bug or a code bug.

Problem 3:

Doc for <vcpupin> says it supersedes "cpuset" of <vcpu>, which is
not quite correct, as each <vcpupin> specifies the pinning policy
for only one vcpu. What about the ones which don't have
<vcpupin> specified? It says the vcpu will be pinned to all
available physical CPUs, but what's the meaning of attribute
"cpuset" of <vcpu> then?

Problem 4:

Doc for <emulatorpin> says it pins the emulator threads (domain
process in other contexts; perhaps another follow-up patch to
clean up the inconsistency is needed) to the physical CPUs
specified by its attribute "cpuset". This conflicts with
<vcpu>'s "cpuset". And actually in the underlying code,
the affinity for the domain process is set twice if both
"cpuset" for <vcpu> and <emulatorpin> are specified,
and <emulatorpin>'s pinning will override <vcpu>'s.

Problem 5:

When "placement" of <vcpu> is "auto" (i.e. numad is used to
get the advisory nodeset to which the domain process is
pinned), it will also be overridden by <emulatorpin>.

This patch is trying to sort out the conflicts or bugs by:

1) Don't say <vcpu> is superseded by <cputune>

2) Keep the semantics for "cpuset" of <vcpu> (i.e. still say it
   specifies the physical CPUs that the virtual CPUs can be pinned to).
   But modify it to mention that it also sets the pinning policy for the
   domain process, that the CPU placement of the domain process specified
   by "cpuset" of <vcpu> will be ignored if <emulatorpin> is specified, and
   similarly, that the CPU placement of a vcpu thread will be ignored
   if it has <vcpupin> specified; a vcpu which doesn't have
   <vcpupin> specified inherits "cpuset" of <vcpu>.

3) Don't say <vcpu> is superseded by <vcpupin>. If neither <vcpupin>
   nor "cpuset" of <vcpu> is specified, the vcpu will be pinned
   to all available pCPUs.

4) If neither <emulatorpin> nor "cpuset" of <vcpu> is specified,
   the domain process (emulator threads in the context) will be
   pinned to all available pCPUs.

5) If "placement" of <vcpu> is "auto", <emulatorpin> is not allowed.

6) hotplugged vcpus will also inherit "cpuset" of <vcpu>

Code changes according to the above document changes:

1) Inherit def->cpumask for each vcpu which doesn't have <vcpupin>
   specified, during parsing.

2) Pin each vcpu which doesn't have <vcpupin> specified to def->cpumask
   either by cgroup or sched_setaffinity(2), which is actually done
   by 1).

3) Error out if "placement" == "auto", and <emulatorpin> is specified.
   Otherwise, <emulatorpin> is honored, and "cpuset" of <vcpu> is
   ignored.

4) Setup cgroup for each hotplugged vcpu, and setup the pinning policy
   by either cgroup or sched_setaffinity(2).

5) Remove cgroup and <vcpupin> for each hot unplugged vcpu.

Patches follow (6 in total, not counting this patch)
2012-10-15 12:13:34 +08:00
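The precedence rules listed in the <vcpu>, <vcpupin> and <emulatorpin> entry above can be summarized in a short sketch. CPU sets are modelled as Python sets and all function and parameter names are hypothetical; the real logic lives in libvirt's C code and applies the pinning through cgroups or sched_setaffinity(2).

```python
def effective_vcpu_pinning(nvcpus, vcpu_cpuset, vcpupin, all_pcpus):
    """Map each vcpu index to the physical CPUs it is pinned to.

    A vcpu with its own <vcpupin> entry uses that; otherwise it inherits
    "cpuset" of <vcpu>; if neither is given, all pCPUs are used.
    """
    default = vcpu_cpuset if vcpu_cpuset is not None else all_pcpus
    return {vcpu: vcpupin.get(vcpu, default) for vcpu in range(nvcpus)}


def effective_emulator_pinning(vcpu_cpuset, emulatorpin, placement, all_pcpus):
    """Return the physical CPUs for the domain process (emulator threads)."""
    if placement == "auto" and emulatorpin is not None:
        raise ValueError('<emulatorpin> is not allowed with placement="auto"')
    if emulatorpin is not None:
        return emulatorpin
    if vcpu_cpuset is not None:
        return vcpu_cpuset
    return all_pcpus
```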
Jiri Denemark
f95560b3fe conf: Mark missing optional USB devices in domain XML
When the startupPolicy set for a USB device allows the device to be
missing, there was no way this could be detected from the domain XML. With
this patch, libvirt emits a new missing='yes' attribute for such devices
when the active domain XML is generated.
2012-10-12 10:55:32 +02:00