I noticed when writing the backend functions for virNetworkUpdate that
I was repeating the same sequence of memmove, VIR_REALLOC, nXXX-- (and
messed up the args to memmove at least once), and had seen the same
sequence in a lot of other places, so I decided to write a few
utility functions/macros - see the .h file for full documentation.
The intent is to reduce the number of lines of code, but more
importantly to eliminate the need to check the element size and
element count arithmetic every time we need to do this (I *always*
make at least one mistake.)
VIR_INSERT_ELEMENT: insert one element at an arbitrary index within an
array of objects. The size of each object is determined
automatically by the macro using sizeof(*array). The new element's
contents are copied into the inserted space, then the original copy
of the contents is cleared to 0 (if everything else was
successful). Compile-time assignment and size compatibility between
the array and the new element is guaranteed (see explanation below
[*])
VIR_INSERT_ELEMENT_COPY: identical to VIR_INSERT_ELEMENT, except that
the original contents of newelem are not cleared to 0 (i.e. a copy
is made).
VIR_APPEND_ELEMENT: This is just a special case of VIR_INSERT_ELEMENT
that "inserts" one past the current last element.
VIR_APPEND_ELEMENT_COPY: identical to VIR_APPEND_ELEMENT, except that
the original contents of newelem are not cleared to 0 (i.e. a copy
is made).
VIR_DELETE_ELEMENT: delete one element at an arbitrary index within an
array of objects. It's assumed that the element being deleted is
already saved elsewhere (or cleared, if that's what is appropriate).
All five of these macros have an _INPLACE variant, which skips the
memory re-allocation of the array, assuming that the caller has
already done it (when inserting) or will do it later (when deleting).
Note that VIR_DELETE_ELEMENT* can return a failure, but only if an
invalid index is given (index + amount to delete is > current array
size), so in most cases you can safely ignore the return (that's why
the helper function virDeleteElementsN isn't declared with
ATTRIBUTE_RETURN_CHECK). A warning is logged if this ever happens,
since it is surely a coding error.
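For illustration, typical usage would look something like this (a hedged
sketch using a made-up virFooDef type; the .h file added by this patch is
the authoritative reference for argument order and exact semantics):

    typedef struct { int id; } virFooDef;

    static int
    fooListAppend(virFooDef **foos, size_t *nfoos, virFooDef *newfoo)
    {
        /* grows the array by one, copies *newfoo into the new slot and,
         * on success, clears *newfoo (use the _COPY variant to keep it) */
        if (VIR_APPEND_ELEMENT(*foos, *nfoos, *newfoo) < 0)
            return -1;
        return 0;
    }

    static void
    fooListRemove(virFooDef **foos, size_t *nfoos, size_t at)
    {
        /* shrinks the array by one; it can only fail on an out-of-range
         * index, which is why ignoring the return (here via gnulib's
         * ignore_value) is usually fine */
        ignore_value(VIR_DELETE_ELEMENT(*foos, at, *nfoos));
    }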
[*] One initial problem with the INSERT and APPEND macros was that,
due to both the array pointer and newelem pointer being cast to void*
when passing to virInsertElementsN(), any chance of type-checking was
lost. If we were going to move in newelem with a memmove anyway, we
would be no worse off for this. However, most current open-coded
insert/append operations use direct struct assignment to move the new
element into place (or just populate the new element directly) - thus
use of the new macros would open a possibility for new usage errors
that didn't exist before (e.g. accidentally sending &newelemptr rather
than newelemptr - I actually did this quite a lot in my test
conversions of existing code).
But thanks to Eric Blake's clever thinking, I was able to modify the
INSERT and APPEND macros so that they *do* check for both assignment
and size compatibility of *ptr (an element in the array) and newelem
(the element being copied into the new position of the array). This is
done via clever use of the C89-guaranteed fact that the sizeof()
operator must have *no* side effects (so an assignment inside sizeof()
is checked for validity, but not actually evaluated), and the fact
that virInsertElementsN has a "# of new elements" argument that we
want to always be 1.
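A standalone illustration of the trick (this is not the macro actually added
by the patch, just the idea behind it):

    /* The assignment inside sizeof() is type-checked by the compiler but
     * never executed, and the whole expression is the constant 1 -- which
     * is exactly the "# of new elements" argument mentioned above. */
    #define TYPECHECKED_ONE(array, newelem) \
        (sizeof((array)[0] = (newelem)) / sizeof((array)[0]))

With this, passing &newelemptr where newelemptr is expected makes the hidden
assignment ill-typed, so the compiler flags the mistake instead of the array
being silently corrupted at run time.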
When restarting CPUs after an external snapshot, the restarting function
was called without the appropriate async job type. As a result, a
new sync job wasn't created, which allowed races in the monitor.
A new VM will have default values for all parameters, such as the
number of CPUs, so we have to change its configuration to match the
XML definition given to parallelsDomainDefineXML.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Implement creation of new disks: if a new disk is found
in the configuration, find a volume by the disk path and
actually create a disk image by issuing a prlctl command.
If that finishes successfully, remove the file with the volume
definition.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Add a function for converting a bus from libvirt's numeric constant
to the name used in the parallels command-line tools.
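Roughly this kind of mapping (hedged sketch; the function name and the exact
set of buses and strings here are illustrative, not necessarily what the
patch uses):

    static const char *
    parallelsDiskBusName(int bus)
    {
        switch (bus) {
        case VIR_DOMAIN_DISK_BUS_IDE:
            return "ide";
        case VIR_DOMAIN_DISK_BUS_SATA:
            return "sata";
        case VIR_DOMAIN_DISK_BUS_SCSI:
            return "scsi";
        default:
            return NULL;   /* bus not supported by the prlctl tools */
        }
    }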
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Move the part which deletes an existing volume to a new function,
parallelsStorageVolumeDefRemove, so that we can use it later
in parallels_driver.c.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Read the disk image size from the XML description and fill in
virStorageVolDef.capacity and allocation (let's consider
allocation to be the same as capacity; calculating the real
allocation will be implemented later).
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Disk images in Parallels Cloud Server are stored in directories. Each
one has files with data and an XML description of the image, stored in
a file named DiskDescriptor.xml.
Since we have to support 'detached' images, which are not used by
any VM, the better way to collect info about volumes is to search for
directories with a DiskDescriptor.xml file in each VM directory.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
There are no storage pools in Parallels Cloud Server -
all VM data is stored in a single directory: config, snapshots and
memory dumps together with disk images.
Let's look through the list of VMs and create a storage pool for
each directory containing VMs.
So if you have 3 VMs: /var/parallels/vm-1.pvm,
/var/parallels/vm-2.pvm and /root/test.pvm - 2 storage pools
appear: -var-parallels and -root. XML descriptions of the pools
will be saved in /etc/libvirt/parallels-storage, so UUIDs will
not change between connections to libvirt.
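The pool names can be derived from the directory paths roughly like this
(hedged sketch matching the examples above; the actual helper in the patch
may differ):

    #include <stdlib.h>
    #include <string.h>

    /* "/var/parallels" -> "-var-parallels", "/root" -> "-root" */
    static char *
    poolNameFromPath(const char *dirpath)
    {
        char *name = strdup(dirpath);
        char *p;

        if (!name)
            return NULL;
        for (p = name; *p; p++) {
            if (*p == '/')
                *p = '-';
        }
        return name;
    }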
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
We don't support unprivileged users anymore, so remove the code which
selects the configuration directory depending on the user.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Allow changing some parameters of the hard disks: bus,
image and drive address.
Creating new disk devices and removing existing ones
require changes in the storage driver, so they will be
implemented later.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Offline migration transfers the inactive definition of a domain (which may
or may not be active). After successful completion, the domain remains
in its current state on source host and is defined but inactive on
destination host. It's a bit more clever than virDomainGetXMLDesc() on
source host followed by virDomainDefineXML() on destination host, as
offline migration will run pre-migration hook to update the domain XML
on destination host. Currently, copying non-shared storage is not
supported during offline migration.
Offline migration can be requested with a new migration flag called
VIR_MIGRATE_OFFLINE (which has to be combined with
VIR_MIGRATE_PERSIST_DEST flag).
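For example, a caller could request it through the existing migration API
roughly like this (hedged sketch; the flag names are the ones introduced
here):

    #include <libvirt/libvirt.h>

    static virDomainPtr
    migrateOffline(virDomainPtr dom, virConnectPtr dstconn)
    {
        /* VIR_MIGRATE_OFFLINE is only accepted in combination with
         * VIR_MIGRATE_PERSIST_DEST */
        unsigned long flags = VIR_MIGRATE_OFFLINE | VIR_MIGRATE_PERSIST_DEST;

        return virDomainMigrate(dom, dstconn, flags, NULL, NULL, 0);
    }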
This fixes a problem that showed up during testing of:
https://bugzilla.redhat.com/show_bug.cgi?id=881480
Due to a logic error in the function that gets the name of the bridge
an interface connects to, any time a bridge was specified directly
(type='bridge') rather than indirectly (type='network'), an error
would be logged (although the operation would then complete
successfully):
Network type 6 is not supported
The final virReportError() in the function
qemuDomainNetGetBridgeName() was apparently avoided in the past with a
"goto cleanup" at the end of each case, but the case of bridge somehow
no longer has that final goto cleanup.
The proper solution is anyway to not rely on goto's, but put the error
log inside an else {} clause, so that it's executed only if the type
is neither bridge nor network (in reality, this function should only
ever be called for those two types, that's why this is an internal
error).
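A standalone sketch of the corrected control flow (names simplified; the
real code lives in qemuDomainNetGetBridgeName and uses libvirt's internal
types and error reporting):

    #include <stdio.h>

    enum { NET_TYPE_NETWORK, NET_TYPE_BRIDGE, NET_TYPE_OTHER };

    static const char *
    getBridgeName(int type)
    {
        if (type == NET_TYPE_NETWORK) {
            return "virbr0";    /* looked up via the network definition */
        } else if (type == NET_TYPE_BRIDGE) {
            return "br0";       /* taken straight from the interface config */
        } else {
            /* only interface types with no bridge device end up here,
             * which callers should never pass - an internal error */
            fprintf(stderr, "interface type %d has no bridge device\n", type);
            return NULL;
        }
    }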
While making this change, the error message was also tuned to be more
correct (since it's not really the type of the network, but the type
of the interface, and it *is* otherwise supported, it's just that the
interface type in question doesn't *have* a bridge device associated
with it, or at least we don't know how to get it).
If a network interface model is not specified, libvirt will run
into an unchecked NULL pointer and dump core. On the other hand, if
the empty model is ignored, a PCI bus address would be generated,
which is not supported on S390.
Since the only valid network interface model for S390 is virtio,
we use this as the default value, which is the same as for QEMU.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Things are supposed to look like:
<machine canonical='pc-0.12'>pc</machine>
But the two values are currently swapped. This can cause many VMs to revert to having
machine type='pc' which will affect save/restore across qemu upgrades.
Add VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA flag to virStorageVolCreateXML
and virStorageVolCreateXMLFrom. This flag requests metadata
preallocation when creating/cloning qcow2 images, resulting in creating
a sparse file with qcow2 metadata. It has only slightly larger disk usage
compared to a new image with no allocation, but offers higher performance.
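A hedged example of requesting it through the public API (the volume XML
here is purely illustrative):

    #include <libvirt/libvirt.h>

    static virStorageVolPtr
    createPreallocatedQcow2(virStoragePoolPtr pool)
    {
        const char *xml =
            "<volume>"
            "  <name>guest.qcow2</name>"
            "  <capacity unit='G'>10</capacity>"
            "  <target><format type='qcow2'/></target>"
            "</volume>";

        return virStorageVolCreateXML(pool, xml,
                                      VIR_STORAGE_VOL_CREATE_PREALLOC_METADATA);
    }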
QEMU has supported setting vendor and product strings for a disk since
1.2.0 (only scsi-disk, scsi-hd and scsi-cd support it); this patch
exposes that with new XML elements <vendor> and <product> on the disk
device.
Based on a patch originally authored by Daniel De Graaf
http://lists.xen.org/archives/html/xen-devel/2012-05/msg00565.html
This patch converts the Xen libxl driver to support only Xen >= 4.2.
Support for Xen 4.1 libxl is dropped since that version of libxl is
designated 'technology preview' only and is incompatible with Xen 4.2
libxl. Additionally, the default toolstack in Xen 4.1 is still xend,
for which libvirt has a stable, functional driver.
virGetGroupIDByName is documented as returning 1 if the groupname
cannot be found. getgrnam_r is documented as returning:
« 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name
or gid was not found. »
and that:
« The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
It does not call "not found" an error, hence does not specify what
value errno might have in this situation. But that makes it impossible to
recognize errors. One might argue that according to POSIX errno should be
left unchanged if an entry is not found. Experiments on various UNIX-like
systems shows that lots of different values occur in this situation: 0,
ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetGroupIDByName returns an error when the return value of getgrnam_r
is non-0. However on my RHEL system, getgrnam_r returns ENOENT when the
requested group cannot be found, which then causes virGetGroupID not
to behave as documented (it returns an error instead of falling back
to parsing the passed-in value as a gid).
This commit makes virGetGroupIDByName only report an error when errno
is set to one of the values in the POSIX description of getgrnam_r
(which are the same as the ones described in the manpage on my system).
virGetUserIDByName is documented as returning 1 if the username
cannot be found. getpwnam_r is documented as returning:
« 0 or ENOENT or ESRCH or EBADF or EPERM or ... The given name
or uid was not found. »
and that:
« The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
It does not call "not found" an error, hence does not specify what
value errno might have in this situation. But that makes it impossible to
recognize errors. One might argue that according to POSIX errno should be
left unchanged if an entry is not found. Experiments on various UNIX-like
systems shows that lots of different values occur in this situation: 0,
ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM and probably others. »
virGetUserIDByName returns an error when the return value of getpwnam_r
is non-0. However on my RHEL system, getpwnam_r returns ENOENT when the
requested user cannot be found, which then causes virGetUserID not
to behave as documented (it returns an error instead of falling back
to parsing the passed-in value as a uid).
This commit makes virGetUserIDByName only report an error when errno
is set to one of the values in the POSIX description of getpwnam_r
(which are the same as the ones described in the manpage on my system).
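The distinction both of these changes draw can be sketched like this
(hedged, standalone code, not the actual libvirt helpers; the list of "real
error" values is my reading of the POSIX description of getpwnam_r):

    #include <errno.h>
    #include <pwd.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>

    /* returns 0 = found, 1 = not found, -1 = real error */
    static int
    lookupUid(const char *name, uid_t *uid, char *strbuf, size_t strbuflen)
    {
        struct passwd pwbuf;
        struct passwd *pw = NULL;
        int rc = getpwnam_r(name, &pwbuf, strbuf, strbuflen, &pw);

        if (pw) {
            *uid = pw->pw_uid;
            return 0;
        }
        /* Only treat the values POSIX actually lists for getpwnam_r as real
         * failures; anything else (0, ENOENT, ESRCH, ...) means "not found"
         * and the caller can fall back to parsing the name as a number. */
        if (rc == EINTR || rc == EIO || rc == EMFILE ||
            rc == ENFILE || rc == ERANGE) {
            fprintf(stderr, "getpwnam_r(%s) failed: %s\n", name, strerror(rc));
            return -1;
        }
        return 1;
    }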
If debugging is enabled, the debug messages are sent to stderr.
Moreover, if a command is set up to capture stderr, the messages
get mixed with the stdout output (assuming both outputs are stored
in the same variable). The resulting string then doesn't
necessarily start with the desired prefix. This bug
exposes itself when parsing dnsmasq output:
2012-12-06 11:18:11.445+0000: 18491: error :
dnsmasqCapsSetFromBuffer:664 : internal error cannot parse
/usr/sbin/dnsmasq version number in '2012-12-06
11:11:02.232+0000: 18492: debug : virFileClose:72 : Closed fd 22'
We can clearly see that the output of dnsmasq --version doesn't
start with the expected "Dnsmasq version " string but with libvirt debug
output.
If debugging is enabled, the virCommand subsystem catches debug
messages in the command output as well. In that case, we can't assume
the string corresponding to the command's stdout will start with a specific
prefix; the prefix can be pushed deeper into the string. This bug
shows itself when parsing dnsmasq output:
2012-12-06 11:18:11.445+0000: 18491: error :
dnsmasqCapsSetFromBuffer:664 : internal error cannot parse
/usr/sbin/dnsmasq version number in '2012-12-06 11:11:02.232+0000:
18492: debug : virFileClose:72 : Closed fd 22'
We can clearly see that the output of dnsmasq --version
doesn't start with the expected "Dnsmasq version " string but with libvirt
debug output.
This resolves: https://bugzilla.redhat.com/show_bug.cgi?id=767057
It was possible to define a network with <forward mode='bridge'> that
had both a bridge device and a forward device defined. These two are
mutually exclusive by definition (if you are using a bridge device,
then this is a host bridge, and if you have a forward dev defined,
this is using macvtap). It was also possible to put <ip>, <dns>, and
<domain> elements in this definition, although those aren't supported
by the current driver (though it's conceivable that some other
driver might support them).
The items that are invalid by definition are now checked in the XML
parser (since they will definitely *always* be wrong), and the others
are checked in networkValidate() in the network driver (since, as
mentioned, it's possible that some other network driver, or even this
one, could some day support setting those).
This patch adds the capability for virtual guests to do IPv6
communication via a virtual network interface with no IPv6 (gateway)
addresses specified. This capability has always been enabled by
default for IPv4, but disabled for IPv6 for security concerns, and
because it requires the ip6tables command to be operational (which
isn't the case on a system with the ipv6 module completely disabled).
This patch adds a new attribute "ipv6" at the toplevel of a <network>
object. If ipv6='yes', the extra ip6tables rules required to permit
inter-guest communications are added when the network is started. If
it is 'no', or not present, those rules will not be added; thus the
default behavior doesn't change, so there should be no compatibility
issues with any existing installations.
Note that virtual guests cannot communicate with the virtualization
host via this interface, because the following kernel tunable has
been set:
net.ipv6.conf.<bridge_interface_name>.disable_ipv6 = 1
This assures that the bridge interface will not have an IPv6
link-local (fe80::) address.
To control this behavior so that it is not enabled by default, the
parameter ipv6='yes' has been added to the <network> element.
Documentation related to this patch has been updated.
The network schema has also been updated.
https://bugzilla.redhat.com/show_bug.cgi?id=832302
It's odd to fall through to buildVol, and the existing file is
removed when buildVol fails. This checks whether the volume target
path already exists in createVol. The reason for not using an
error like "Volume already exists" is that there isn't a volume
maintained by libvirt for that path until an operation like
pool-refresh; using an error like that would just cause confusion.
https://bugzilla.redhat.com/show_bug.cgi?id=866524
Since the virConnect object is not locked as a whole when doing
virConnectDispose, a thread can get the lock and thus might
cause a race.
Detected by valgrind:
==23687== Invalid read of size 4
==23687== at 0x38BAA091EC: pthread_mutex_lock (pthread_mutex_lock.c:61)
==23687== by 0x3FBA919E36: remoteClientCloseFunc (remote_driver.c:337)
==23687== by 0x3FBA936BF2: virNetClientCloseLocked (virnetclient.c:688)
==23687== by 0x3FBA9390D8: virNetClientIncomingEvent (virnetclient.c:1859)
==23687== by 0x3FBA851AAE: virEventPollRunOnce (event_poll.c:485)
==23687== by 0x3FBA850846: virEventRunDefaultImpl (event.c:247)
==23687== by 0x40CD61: vshEventLoop (virsh.c:2128)
==23687== by 0x3FBA8626F8: virThreadHelper (threads-pthread.c:161)
==23687== by 0x38BAA077F0: start_thread (pthread_create.c:301)
==23687== by 0x33F68E570C: clone (clone.S:115)
==23687== Address 0x4ca94e0 is 144 bytes inside a block of size 312 free'd
==23687== at 0x4A0595D: free (vg_replace_malloc.c:366)
==23687== by 0x3FBA8588B8: virFree (memory.c:309)
==23687== by 0x3FBA86AAFC: virObjectUnref (virobject.c:145)
==23687== by 0x3FBA8EA767: virConnectClose (libvirt.c:1458)
==23687== by 0x40C8B8: vshDeinit (virsh.c:2584)
==23687== by 0x41071E: main (virsh.c:3022)
The above race is caused by the eventLoop thread trying to handle
the net client event by calling the callback set by:
virNetClientSetCloseCallback(priv->client,
remoteClientCloseFunc,
conn, NULL);
I.e. remoteClientCloseFunc, which locks/unlocks the virConnect object.
This patch fixes the bug by setting the callback to NULL in
doRemoteClose.
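The shape of the fix (hedged, not a verbatim excerpt from the patch): before
the connection can be disposed, doRemoteClose drops the callback again, e.g.:

    /* undo the registration quoted above so the event loop can no longer
     * call remoteClientCloseFunc() on a conn that is being freed */
    virNetClientSetCloseCallback(priv->client, NULL, NULL, NULL);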
The pciWrite32 function assembled the array of data to be written to the
fd with a bad offset on the last byte. This issue was probably caused by
a typo (14, 24).
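For illustration, the pattern in question looks roughly like this
(simplified, standalone; not the exact pciWrite32 code):

    #include <stdint.h>

    static void
    split32le(uint32_t val, uint8_t buf[4])
    {
        buf[0] = (val >>  0) & 0xff;
        buf[1] = (val >>  8) & 0xff;
        buf[2] = (val >> 16) & 0xff;
        buf[3] = (val >> 24) & 0xff;   /* shifting by 14 here, as the typo
                                        * did, corrupts the top byte */
    }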
Unmanaged PCI devices were only leaked if pciDeviceListAdd failed, but
managed devices were always leaked. Leaking a PCI device is likely to
leave its PCI config file descriptor open. This patch fixes
qemuReattachPciDevice to either free the PCI device or add it to the
inactivePciHostdevs list.
An attempt to attach a device that is already attached to a domain results
in the following error:
virsh # attach-device rhel6 pci2 --persistent
error: Failed to attach device from pci2
error: invalid argument: device is already in the domain configuration
The "invalid argument" error code looks wrong, we usually use "operation
invalid" when the action cannot be done in current state.
Only one error in qemu_monitor was already using the relatively
new OPERATION_UNSUPPORTED error, even though it is a better fit
for all of the messages related to options that are unsupported
due to the version of qemu in use rather than due to a user's
XML or .conf file choice. Suggested by Osier Yang.
* src/qemu/qemu_monitor.c (qemuMonitorSendFileHandle)
(qemuMonitorAddHostNetwork, qemuMonitorRemoveHostNetwork)
(qemuMonitorAttachDrive, qemuMonitorDiskSnapshot)
(qemuMonitorDriveMirror, qemuMonitorTransaction)
(qemuMonitorBlockCommit, qemuMonitorDrivePivot)
(qemuMonitorBlockJob, qemuMonitorSystemWakeup)
(qemuMonitorGetVersion, qemuMonitorGetMachines)
(qemuMonitorGetCPUDefinitions, qemuMonitorGetCommands)
(qemuMonitorGetEvents, qemuMonitorGetKVMState)
(qemuMonitorGetObjectTypes, qemuMonitorGetObjectProps)
(qemuMonitorGetTargetArch): Use better error category.
Without this patch, attempts to create a disk snapshot when qemu
is too old result in a cryptic message:
virsh # snapshot-create 23 --disk-only
error: operation failed: Failed to take snapshot: unknown command: 'snapshot_blkdev'
Now it reports:
virsh # snapshot-create 23 --disk-only
error: unsupported configuration: live disk snapshot not supported with this QEMU binary
All versions of qemu that support live disk snapshot also support
QMP (basically upstream qemu 1.1 and later, and backports to RHEL 6.2).
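The gate itself is a short capability check (hedged sketch; the helper and
flag names are taken from the ChangeLog below, but the call-site details are
approximate):

    if (!qemuCapsGet(priv->caps, QEMU_CAPS_DISK_SNAPSHOT)) {
        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                       _("live disk snapshot not supported with this QEMU binary"));
        goto cleanup;
    }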
* src/qemu/qemu_capabilities.h (QEMU_CAPS_DISK_SNAPSHOT): New
capability.
* src/qemu/qemu_capabilities.c (qemuCaps): Track it.
(qemuCapsProbeQMPCommands): Set it.
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateDiskActive): Use
it.
* src/qemu/qemu_monitor.c (qemuMonitorDiskSnapshot): Simplify.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskSnapshot):
Likewise.
* src/qemu/qemu_monitor_text.h (qemuMonitorTextDiskSnapshot):
Delete.
* src/qemu/qemu_monitor_text.c (qemuMonitorTextDiskSnapshot):
Likewise.
RHEL 6.3 uses dbus-devel-1.2.24, which lacks support for the
DBUS_TYPE_UNIX_FD define (contrast with Fedora 18 using 1.6.8).
But since it is an older dbus, it also lacks support for the shutdown
inhibitions provided by newer systemd.
Compilation failure introduced in commit 31330926.
* src/rpc/virnetserver.c (virNetServerAddShutdownInhibition):
Compile out if dbus is too old.
Implement the domainManagedSave, domainHasManagedSaveImage, and
domainManagedSaveRemove functions in the libvirt legacy xen driver.
domainHasManagedSaveImage checks for the managedsave image on the
filesystem every time. This is different from the qemu and libxl drivers,
which have a hasManagedSave flag in virDomainObjPtr that
is not used in the legacy xen driver. This flag could not be added to the
xen driver ptr either, because the driver ptr is released at the end of
every libvirt API call. Meanwhile, AFAIK, xen stores all such flags in
xen itself, not in the libvirt xen driver, so there is no need to add this
flag there.
Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
Use the freedesktop inhibition DBus service to prevent host
shutdown or session logout while any VMs are running.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently, to deal with auto-shutdown libvirtd must periodically
poll all stateful drivers. This sucks because it requires
acquiring both the driver lock and locks on every single virtual
machine. Instead pass an "inhibit" callback to virStateInitialize
which drivers can invoke whenever they want to inhibit shutdown
due to the existence of active VMs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The only important state that should prevent libvirtd shutdown
comes from running VMs. Networks, host devices, network filters
and storage pools are all long lived resources that have no
significant in-memory state. They should not block shutdown.
The patch adds the backend driver to support iSCSI format storage pools
and volumes for an ESX host. The mapping of ESX iSCSI specifics to Libvirt
is as follows:
1. ESX static iSCSI target <------> Libvirt Storage Pools
2. ESX iSCSI LUNs <------> Libvirt Storage Volumes.
The above understanding is based on http://libvirt.org/storage.html.
The operations supported on iSCSI pools include:
1. List storage pools & volumes.
2. Get XML descriptor operation on pools & volumes.
3. Lookup operation on pools & volumes by name, UUID and path (if applicable).
iSCSI pools do not support operations such as creating/removing pools
and volumes.