9174 Commits

Author SHA1 Message Date
Jiri Denemark
d8d4aa01d8 rpc: Fix client crash when server drops connection
Despite the comment stating virNetClientIncomingEvent handler should
never be called with either client->haveTheBuck or client->wantClose
set, there is a sequence of events that may lead to both booleans being
true when virNetClientIncomingEvent is called. However, when that
happens, we must not immediately close the socket as there are other
threads waiting for the buck and they would cause SIGSEGV once they are
woken up after the socket was closed. Another thing is we should clear
all remaining calls in the queue after closing the socket.

The situation that can lead to the crash involves three threads, one of
them running event loop and the other two calling libvirt APIs. The
event loop thread detects an event on client->sock and calls
virNetClientIncomingEvent handler. But before the handler gets a chance
to lock client, the other two threads (T1 and T2) start calling some
APIs. T1 gets the buck and detects EOF on client->sock while processing
its RPC call. Since T2 is waiting for its own call, T1 passes the buck
on to it and unlocks client. But before T2 gets the signal, the event
loop thread wakes up, does its job and closes client->sock. The crash
happens when T2 actually wakes up and tries to do its job using a closed
client->sock.
2013-03-27 09:00:38 +01:00
Jiri Denemark
a1fe02f0e9 log: Separate thread ID from timestemp in ring buffer
When we write a log message into a log, we separate thread ID from
timestamp using ": ". However, when storing the message into the ring
buffer, we omitted the separator, e.g.:

    2013-02-27 11:49:11.852+00003745: ...
2013-03-27 09:00:35 +01:00
Guannan Ren
a950f03e16 conf: fix a failure when detaching a usb device
#virsh detach-device $guest usb.xml
 error: Failed to detach device from usb2.xml
 error: operation failed: host usb device vendor=0x0951 \
 product=0x1625 not found

This regresstion is due to a typo in matching function. The first
argument is always the usb device that we are checking for. If the
usb xml file provided by user contains bus and device info, we try
to search it by them, otherwise, we use vendor and product info.

The bug occurred only when detaching a usb device with no bus and
device info provided in the usb xml file.
2013-03-27 10:38:08 +08:00
Guido Günther
ea2e31fa5b qemu: Don't set address type too early during virtio disk hotplug
f946462e14ac036357b7c11ce5c23f94a3ee4e49 changed behavior by settings
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI upfront. If we do so before invoking
qemuDomainPCIAddressEnsureAddr we merely try to set the PCI slot via
qemuDomainPCIAddressReserveSlot instead reserving a new address via
qemuDomainPCIAddressSetNextAddr which fails with

$ ~/run-tck-test domain/200-disk-hotplug.t
./scripts/domain/200-disk-hotplug.t .. # Creating a new transient domain
./scripts/domain/200-disk-hotplug.t .. 1/5 # Attaching the new disk /var/lib/jenkins/jobs/libvirt-tck-build/workspace/scratchdir/200-disk-hotplug/extra.img

 #   Failed test 'disk has been attached'
 #   at ./scripts/domain/200-disk-hotplug.t line 67.
 # died: Sys::Virt::Error (libvirt error code: 1, message: internal error unable to reserve PCI address 0:0:0.0
 # )
2013-03-26 18:54:41 +01:00
Michal Privoznik
ceb31795af qemu: Set migration FD blocking
Since we switched from direct host migration scheme to the one,
where we connect to the destination and then just pass a FD to a
qemu, we have uncovered a qemu bug. Qemu expects migration FD to
block. However, we are passing a nonblocking one which results in
cryptic error messages like:

  qemu: warning: error while loading state section id 2
  load of migration failed

The bug is already known to Qemu folks, but we should workaround
already released Qemus. Patch has been originally proposed by Stefan
Hajnoczi <stefanha@gmail.com>
2013-03-26 17:16:27 +01:00
Martin Kletzander
d8ed386c07 Fix virConnectOpen.*() name requirements
virConnectOpenAuth didn't require 'name' to be specified (VIR_DEBUG
used NULLSTR() for the output) and by default, if name == NULL, the
default connection uri is used.  This was not indicated in the
documentation and wasn't checked for in other API's VIR_DEBUG outputs.
2013-03-26 15:44:32 +01:00
Eric Blake
7524cd893e Revert "qemu: detect multi-head qxl via more than version check"
This reverts commit 5ac846e42e5b7e0475f6aa9cc1e0b0c8dac84d44.

After further discussions with Alon Levy, I learned the following:

The use of '-vga qxl' vs. '-device qxl-vga' is completely orthogonal
to whether ram_size can be exposed.  Downstream distros are interested
in backporting support for multi-head qxl, but this can be done in
one of two ways:
1. Support one head per PCI device.  If you do this, then it makes
sense to have full control over the PCI address of each device. For
full control, you need '-device qxl-vga' instead of '-vga qxl'.
2. Support multiple heads through a single PCI device.  If you do
this, then you need to allocate more RAM to that PCI device (enough
ram to cover the multiple screens).  Here, the device is hard-coded
to 0:0:2.0, both in qemu and libvirt code.

Apparently, backporting ram_size changes to allow multiple heads in
a single device is much easier than backporting multiple device
support.  Furthermore, the presence or absence of qxl-vga.surfaces
is no different than the presence or absence of qxl-vga.ram_size;
both properties can be applied regardless of whether you have one
PCI device (-vga qxl) or multiple (-device qxl-vga), so this property
is NOT a good witness of whether '-device qxl-vga' support has been
backported.

Downstream RHEL will NOT be using this patch; and worse, leaving this
patch in risks doing the wrong thing if compiling upstream libvirt
on RHEL, so the best course of action is to revert it.  That means
that libvirt will go back to only using '-device qxl-vga' for qemu
>= 1.2, but this is just fine because we know of no distros that plan
on backporting multiple PCI address support to any older version of
qemu.  Meanwhile, downstream can still use ram_size to pack multiple
heads through a single PCI device.
2013-03-25 08:38:35 -06:00
Osier Yang
f90af6914e util: Fix bug of managing vport
The string written to "vport_create" or "vport_delete" should
be "wwnn:wwpn", but not "wwpn:wwnn".
2013-03-25 21:18:14 +08:00
Osier Yang
9a3ff01d7f nodedev: Fix the improper logic when enumerating SRIOV VF
virPCIGetVirtualFunctions returns 0 even if there is no "virtfn"
entry under the device sysfs path.

And virPCIGetVirtualFunctions returns -1 when it fails to get
the PCI config space of one VF, however, with keeping the
the VFs already detected.

That's why udevProcessPCI and gather_pci_cap use logic like:

if (!virPCIGetVirtualFunctions(syspath,
                               &data->pci_dev.virtual_functions,
                               &data->pci_dev.num_virtual_functions) ||
    data->pci_dev.num_virtual_functions > 0)
    data->pci_dev.flags |= VIR_NODE_DEV_CAP_FLAG_PCI_VIRTUAL_FUNCTION;

to tag the PCI device with "virtual_function" cap.

However, this results in a VF will aslo get "virtual_function" cap.

This patch fixes it by:
  * Ignoring the VF which has failure of getting PCI config space
    (given that the successfully detected VFs are kept , it makes
    sense to not give up on the failure of one VF too) with a warning,
    so virPCIGetVirtualFunctions will not return -1 except out of memory.

  * Free the allocated *virtual_functions when out of memory

And thus the logic can be changed to:

    /* Out of memory */
    int ret = virPCIGetVirtualFunctions(syspath,
                                        &data->pci_dev.virtual_functions,
                                        &data->pci_dev.num_virtual_functions);

    if (ret < 0 )
        goto out;
    if (data->pci_dev.num_virtual_functions > 0)
        data->pci_dev.flags |= VIR_NODE_DEV_CAP_FLAG_PCI_VIRTUAL_FUNCTION;
2013-03-25 21:14:48 +08:00
Osier Yang
96d3086a4f nodedev: Abstract nodeDeviceVportCreateDelete as util function
This abstracts nodeDeviceVportCreateDelete as an util function
virManageVport, which can be further used by later storage patches
(to support persistent vHBA, I don't want to create the vHBA
using the public API, which is not good).
2013-03-25 20:46:05 +08:00
Osier Yang
448be8f706 nodedev: Dump max vports and vports in use for HBA's XML
This enrichs HBA's xml by dumping the number of max vports and
vports in use. Format is like:

  <capability type='vport_ops'>
    <max_vports>164</max_vports>
    <vports>5</vports>
  </capability>

* docs/formatnode.html.in: (Document the new XML)
* docs/schemas/nodedev.rng: (Add the schema)
* src/conf/node_device_conf.h: (New member for data.scsi_host)
* src/node_device/node_device_linux_sysfs.c: (Collect the value of
  max_vports and vports)
2013-03-25 20:46:05 +08:00
Osier Yang
4360a09844 nodedev: Refactor the helpers
This adds two util functions (virIsCapableFCHost and virIsCapableVport),
and rename helper check_fc_host_linux as detect_scsi_host_caps,
check_capable_vport_linux is removed, as it's abstracted to the util
function virIsCapableVport. detect_scsi_host_caps nows detect both
the fc_host and vport_ops capabilities. "stat(2)" is replaced with
"access(2)" for saving.

* src/util/virutil.h:
  - Declare virIsCapableFCHost and virIsCapableVport
* src/util/virutil.c:
  - Implement virIsCapableFCHost and virIsCapableVport
* src/node_device/node_device_linux_sysfs.c:
  - Remove check_capable_vport_linux
  - Rename check_fc_host_linux as detect_scsi_host_caps, and refactor
    it a bit to detect both fc_host and vport_os capabilities
* src/node_device/node_device_driver.h:
  - Change/remove the related declarations
* src/node_device/node_device_udev.c: (Use detect_scsi_host_caps)
* src/node_device/node_device_hal.c: (Likewise)
* src/node_device/node_device_driver.c (Likewise)
2013-03-25 20:46:05 +08:00
Osier Yang
d91f7dec46 nodedev: Use access instead of stat
The use of 'stat' in nodeDeviceVportCreateDelete is only to check
if the file exists or not, it's a bit overkill, and safe to replace
with the wrapper of access(2) (virFileExists).
2013-03-25 20:46:05 +08:00
Osier Yang
244ce462e2 util: Add one helper virReadFCHost to read the value of fc_host entry
"open_wwn_file" in node_device_linux_sysfs.c is redundant, on one
hand it duplicates work of virFileReadAll, on the other hand, it's
waste to use a function for it, as there is no other users of it.
So I don't see why the file opening work cannot be done in
"read_wwn_linux".

"read_wwn_linux" can be abstracted as an util function. As what all
it does is to read the sysfs entry.

So this patch removes "open_wwn_file", and abstract "read_wwn_linux"
as an util function "virReadFCHost" (a more general name, because
after changes, it can read each of the fc_host entry now).

* src/util/virutil.h: (Declare virReadFCHost)
* src/util/virutil.c: (Implement virReadFCHost)
* src/node_device/node_device_linux_sysfs.c: (Remove open_wwn_file,
  and read_wwn_linux)
src/node_device/node_device_driver.h: (Remove the declaration of
  read_wwn_linux, and the related macros)
src/libvirt_private.syms: (Export virReadFCHost)
2013-03-25 20:46:05 +08:00
Osier Yang
652a2ec630 nodedev: Introduce two new flags for listAll API
VIR_CONNECT_LIST_NODE_DEVICES_CAP_FC_HOST to filter the FC HBA,
and VIR_CONNECT_LIST_NODE_DEVICES_CAP_VPORTS to filter the FC HBA
which supports vport.
2013-03-25 20:46:05 +08:00
Osier Yang
ab4b000188 nodedev: Remove the unused enum
Guess it was created for the fc_host and vports_ops capabilities
purpose, but there is enum virNodeDevScsiHostCapFlags for them,
and enum virNodeDevHBACapType is unused, and actually both
VIR_ENUM_DECL and VIR_ENUM_IMPL use the wrong enum name
"virNodeDevHBACap".
2013-03-25 20:46:05 +08:00
Martin Kletzander
c9c87376f2 lxc: Prevent shutting down the host
When the container has the same '/dev' mount as host (no chroot),
calling domainShutdown(WithFlags) shouldn't shutdown the host it is
running on.
2013-03-23 11:07:57 +01:00
Daniel P. Berrange
8dbe85886c Ensure root filesystem is mounted if a file/block mount.
For a root filesystem with type=file or type=block, the LXC
container was forgetting to actually mount it, before doing
the pivot root step.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-22 17:27:01 +00:00
Daniel P. Berrange
7e1a7444c6 Mount temporary devpts on /var/lib/libvirt/lxc/$NAME.devpts
Currently the lxc controller sets up the devpts instance on
$rootfsdef->src, but this only works if $rootfsdef is using
type=mount. To support type=block or type=file for the root
filesystem, we must use /var/lib/libvirt/lxc/$NAME.devpts
for the temporary devpts mount in the controller
2013-03-22 17:27:01 +00:00
Daniel P. Berrange
05f664b12c Move FUSE mount to /var/lib/libvirt/lxc/$NAME.fuse
Instead of using /var/lib/libvirt/lxc/$NAME for the FUSE
filesystem, use /var/lib/libvirt/lxc/$NAME.fuse. This allows
room for other temporary mounts in the same directory
2013-03-22 17:27:01 +00:00
Daniel P. Berrange
d50cb2b115 Fix thread safety in LXC callback handling
Some of the LXC callbacks did not lock the virDomainObjPtr
instance. This caused transient errors like

error: Failed to start domain busy-mount
error: cannot rename file '/var/run/libvirt/lxc/busy-mount.xml.new' as '/var/run/libvirt/lxc/busy-mount.xml': No such file or directory

as 2 threads tried to update the status file concurrently

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-22 17:27:01 +00:00
Daniel P. Berrange
53cbfc2f10 Remove bogus filtering from virDomainGetRootFilesystem
The virDomainGetRootFilesystem was only returning filesystems
with type=mount. This is bogus - any type of filesystem is
valid as the root, if dst=/.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-22 17:27:01 +00:00
Jim Fehlig
5ba077dcd0 Fix parsing of bond interface XML
Noticed that parsing bond interface XML containing the miimon element
fails

  <interface type="bond" name="bond0">
    ...
    <bond mode="active-backup">
      <miimon freq="100" carrier="netif"/>
      ...
    </bond>
  </interface>

This configuration does not contain the optional updelay and downdelay
attributes, but parsing will fail due to returning the result of
virXPathULong (a -1 when the attribute doesn't exist) from
virInterfaceDefParseBond after examining the updelay attribute.

While fixing this bug, cleanup the function to use virXPathInt instead
of virXPathULong, and store the result directly instead of using a tmp
variable.  Using virXPathInt actually fixes a potential silent
truncation bug noted by Eric Blake.

Also, there is no cleanup in the error label.  Remove the label,
returning failure where failure occurs and success if the end of the
function is reached.
2013-03-22 09:20:08 -06:00
Ján Tomko
b8fec67cb5 util: fix virAllocVar's comment 2013-03-22 13:05:46 +01:00
Michal Privoznik
70bc623b58 viralloc: Export virAllocTest*
If users build with --enable-test-oom configure option,
they get this error saying, virAllocTest* functions are
not defined within tests/testutils.c.
2013-03-22 12:45:14 +01:00
Daniel P. Berrange
c5f28d0117 Fix free of uninitialized value in LXC numad setup
The 'nodeset' variable was never initialized, causing a later
VIR_FREE(nodeset) to free uninitialized memory.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-22 11:44:35 +00:00
Paolo Bonzini
9f7a9aee37 qemu: add support for LSI MegaRAID SAS1078 (aka megasas) SCSI controller
This does nothing more than adding the new device and capability.
The device is present since QEMU 1.2.0.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-22 12:11:14 +08:00
Paolo Bonzini
523207fe8c qemu: pass iscsi authorization credentials
A better way to do this would be to use a configuration file like

   [iscsi "target-name"]
   user = name
   password = pwd

and pass it via -readconfig.  This would remove the username and password
from the "ps" output.  For now, however, keep this solution.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-22 12:10:23 +08:00
Paolo Bonzini
6dca6d84ed domain: parse XML for iscsi authorization credentials
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-22 12:10:23 +08:00
Paolo Bonzini
adba070122 secret: add iscsi to possible usage types
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-22 12:10:23 +08:00
Paolo Bonzini
8110a8249d domain: make port optional for network disks
Only sheepdog actually required it in the code, and we can use 7000 as the
default---the same value that QEMU uses for the simple "sheepdog:VOLUME"
syntax.  With this change, the schema can be fixed to allow no port.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-22 12:10:23 +08:00
Paolo Bonzini
c820fbff9f qemu: support passthrough for iscsi disks
This enables usage of commands like persistent reservations.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-22 12:10:23 +08:00
Paolo Bonzini
1a308ee015 qemu: add support for libiscsi
libiscsi provides a userspace iSCSI initiator.

The main advantage over the kernel initiator is that it is very
easy to provide different initiator names for VMs on the same host.
Thus libiscsi supports usage of persistent reservations in the VM,
which otherwise would only be possible with NPIV.

libiscsi uses "iscsi" as the scheme, not "iscsi+tcp".  We can change
this in the tests (while remaining backwards-compatible manner, because
QEMU uses TCP as the default transport for both Gluster and NBD).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-22 12:10:22 +08:00
Peter Krempa
a584eaa5ff qemu: Un-mark volume as mirrored/copied if blockjob copy fails
When the blockjob fails for some reason an event is emitted but the disk
wasn't unmarked as being part of a active block copy operation.
2013-03-21 12:32:03 +01:00
Daniel P. Berrange
6e5ad18992 Fix initialization of virIdentityPtr thread locals
Some code mistakenly called virIdentityOnceInit directly
instead of virIdentityInitialize(). This meant that one-time
initializer was run many times with predictably bad results.
2013-03-21 10:58:15 +00:00
Michal Privoznik
cb86e9d39b qemu: s/VIR_ERR_NO_SUPPORT/VIR_ERR_OPERATION_UNSUPPORTED
The VIR_ERR_NO_SUPPORT error code is reserved for cases where an
API is not implemented in a driver. It definitely should not be
used when an API execution fails due to unsupported operation.
2013-03-21 09:26:15 +01:00
Daniel P. Berrange
e053561e38 Fix linkage of virt-aa-helper with numa library
The recent commit moved some of the use of libnuma out of the
driver code, and into src/util/. It did not, however, update
libvirt_util.la to link against libnuma. This caused linkage
failure with virt-aa-helper, since nothing else caused libnuma
to be pulled onto the linker command line.

The fix removes all reference to NUMACTL_LIBS/CFLAGS from the
various modules in src/Makefile.am and just adds them to the
libvirt_util.la module, which everything else depends on.

Technically a build-breaker fix, but wanted to wait for feedback
on this

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-21 09:13:22 +01:00
Osier Yang
65f61e4594 qemu: Add the new disk src into shared disk table when updating disk
We should record the new disk src in the shared disk table for
updating disk (CD-ROM or Floppy) API. Fortunately, we only allow
to update the disk source now, otherwise we might also want to
set the unpriv_sgio setting.
2013-03-21 12:20:36 +08:00
Paolo Bonzini
1d94891288 domain: add support for iscsi network disks
This plumbs in the XML description of iSCSI shares.  The next patches
will add support for the libiscsi userspace initiator.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-20 17:30:25 -06:00
Li Zhang
a67aebd699 Clean redundant code about VCPU string checking
Now that VCPU number are removed from qemu_monitor_text.c
(commit cc78d7ba), VCPU string checking also should be removed.

Report-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
2013-03-20 16:06:20 -06:00
Gao feng
8d19a9f578 cgroup: export virCgroupRemoveRecursively
We will use virCgroupRemoveRecursively to remove cgroup
directories in the coming patch.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-20 14:21:27 -06:00
Guido Günther
82eec793c7 Don't fail if SELinux is diabled
but libvirt is built with --with-selinux. In this case getpeercon
returns ENOPROTOOPT so don't return an error in that case but simply
don't set seccon.
2013-03-20 21:04:57 +01:00
Daniel P. Berrange
f07f9733cb Fix typos s/HAVE_SELINUX/WITH_SELINUX/
The virNetSocket & virIdentity classes accidentally got some
conditionals using HAVE_SELINUX instead of WITH_SELINUX.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2013-03-20 13:23:40 +00:00
Gao feng
4dceffadc9 LXC: add cpuset cgroup support for lxc
This patch adds cpuset cgroup support for LXC.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-20 19:37:16 +08:00
Gao feng
45e9d27ad8 NUMA: cleanup for numa related codes
Intend to reduce the redundant code,use virNumaSetupMemoryPolicy
to replace virLXCControllerSetupNUMAPolicy and
qemuProcessInitNumaMemoryPolicy.

This patch also moves the numa related codes to the
file virnuma.c and virnuma.h

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-20 19:37:00 +08:00
Olivia Yin
4755e863d1 fix TLS error with virNetServerClientCreateIdentity
Compilation error when WITH_GNUTLS is 0, introduced in commit d5e83ad.
2013-03-19 20:57:08 -06:00
Gao feng
c9759a7b63 LXC: allow uses advisory nodeset from querying numad
Allow lxc using the advisory nodeset from querying numad,
this means if user doesn't specify the numa nodes that
the lxc domain should assign to, libvirt will automatically
bind the lxc domain to the advisory nodeset which queried from
numad.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-19 20:03:29 -06:00
Gao feng
763edb5ebe rename qemuGetNumadAdvice to virNumaGetAutoPlacementAdvice
qemuGetNumadAdvice will be used by LXC driver, rename
it to virNumaGetAutoPlacementAdvice and move it to virnuma.c

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
2013-03-19 15:55:40 -06:00
Olivia Yin
26705e02c1 selinux: deal with dtb file 2013-03-19 15:48:59 -06:00
Olivia Yin
0b3509e245 qemu: add dtb option support
The "dtb" option sets the filename for the device tree.
If without this option support, "-dtb file" will be converted into
<qemu:commandline> in domain XML file.
For example, '-dtb /media/ram/test.dtb' will be converted into
  <qemu:commandline>
    <qemu:arg value='-dtb'/>
    <qemu:arg value='/media/ram/test.dtb'/>
  </qemu:commandline>

This is not very friendly.
This patchset add special <dtb> tag like <kernel> and <initrd>
which is easier for user to write domain XML file.
  <os>
    <type arch='ppc' machine='ppce500v2'>hvm</type>
    <kernel>/media/ram/uImage</kernel>
    <initrd>/media/ram/ramdisk</initrd>
    <dtb>/media/ram/test.dtb</dtb>
    <cmdline>root=/dev/ram rw console=ttyS0,115200</cmdline>
  </os>

Signed-off-by: Eric Blake <eblake@redhat.com>
2013-03-19 15:48:58 -06:00