Commit id '3992ff14' added the prototype for networkGetActualType
with 1 parameter, but added 2 ATTRIBUTE_NONNULL's (assume from a
cut-n-paste), just remove (2).
Commit d77ffb6876 added not only reporting of the PCI header type, but
also parsing of that information. However, because there was no parsing
done for the other sub-PCI capabilities, if there was any other
capability then a valid header type name (like phys_function or
virt_functions) the parsing would fail. This prevented passing node
device XMLs that we generated into our own functions when dealing with,
e.g. with SRIOV cards.
Instead of reworking the whole parsing, just fix this one occurence and
remove a test for it for the time being. Future patches will deal with
the rest.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Starting with commit f8e712fe, if you start a domain that has an
<interface type='hostdev' (or that has <interface type='network'>
where the network is a pool of devices for hostdev assignment), when
you later try to add *another* interface (of any kind) with hotplug,
the function qemuAssignDeviceNetAlias() fails as soon as it sees a
"hostdevN" alias in the list of interfaces), causing the attach to
fail.
This is because (starting with f8e712fe) the device alias names are
assigned during the new function qemuProcessPrepareDomain(), which is
called *before* networkAllocateActualDevice() (which is called from
qemuProcessPrepareHost(), which is called from
qemuProcessLaunch()). Prior to that commit,
networkAllocateActualDevice() was called first.
The problem with this is that the alias for interfaces that are really
a hostdev (<interface type='hostdev'>) is of the form "hostdevN" (just
like other hostdevs), while other interfaces are "netN". But if you
don't know that the interface is going to be a hostdev at the time you
assign the alias name, you can't name it differently. (As far as I've
seen so far, the change in name by itself wouldn't have been a problem
(other than just an outwardly noticeable change in behavior) except
for the abovementioned failure to attach/detach new interfaces.
Rather than take the chance that there may be other not-yet-revealed
problems associated with changing the alias name, this patch changes
the way that aliases are assigned to restore the old behavior.
Old: In the past, assigning an alias to an interface was skipped if it
was seen that the interface was type='hostdev' - we knew that the
hostdev part of the interface was also in the list of hostdevs (that's
part of what happens in networkAllocateActualDevice()) and it would be
assigned when all the other hostdev aliases were assigned.
New: When assigning an alias to an interface, we haven't yet called
networkAllocateActualDevice() to construct the hostdev part of the
interface, so we can't just wait for the loop that creates aliases for
all the hostdevs (there's nothing on that list for this device
yet!). Instead we handle it immediately in the loop creating interface
aliases, by calling the new function networkGetActualType() to
determine if it is going to be hostdev, and if so calling
qemuAssignDeviceHostdevAlias() instead.
Some adjustments have to be made to both
qemuAssignDeviceHostdevAlias() and to qemuAssignDeviceNetAlias() to
accommodate this. In both of them, an error return from
qemuDomainDeviceAliasIndex() is no longer considered an error; instead
it's just ignored (because it almost certainly means that the alias
string for the device was "net" when we expected "hostdev" or vice
versa). in qemuAssignDeviceHostdevAlias() we have to look at all
interface aliases for hostdevN in addition to looking at all hostdev
aliases (this wasn't necessary in the past, because both the interface
entry and the hostdev entry for the device already pointed at the
device info; no longer the case since the hostdev entry hasn't yet
been setup).
Fortunately the buggy behavior hasn't yet been in any official release
of libvirt.
In certain cases, we need to assign a hostdevN-style alias in a case
when we don't have a virDomainHostdevDefPtr (instead we have a
virDomainNetDefPtr). Since qemuAssignDeviceHostdevAlias() doesn't use
anything in the virDomainHostdevDef except the alias string itself
anyway, this patch just changes the arguments to pass a pointer to the
alias pointer instead.
There are times when it's necessary to learn the actual type of a
network connection before any resources have been allocated
(e.g. during qemuProcessPrepareDomain()), but in the past it was
necessary to call networkAllocateActualDevice() in order to have the
actual type filled in.
This new function returns the type of network that *will be* setup
once it actually happens, but without making any changes on the host.
The paths have the domain ID in them. Without cleaning them, they would
contain the same ID even after multiple restarts. That could cause
various problems, e.g. with access.
Add function qemuDomainClearPrivatePaths() for this as a counterpart of
qemuDomainSetPrivatePaths().
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The directory name changed in a89f05ba8d.
This unbreaks launching QEMU/KVM VMs with apparmor enabled. It also adds
the directory for the qemu guest-agent socket which is not known when
parsing the domain XML.
This reverts commit ee4cfb5643.
Since we're still not persisting our bookkeeping lists across
daemon restarts, we might have lost some information
virPCIDeviceReattach() relies on, for example whether the
device needs to be unbound from the stub driver.
As a result, if the daemon has been restarted in the meantime,
the device might end up remaining bound to the stub driver even
after 'virsh nodedev-reattach' or similar has been called, with
no way of giving it back to the host short of messing with
sysfs behind libvirt's back.
Revert back to the previous behavior of always trying to bind
the device to the host driver, regardless of its status when it
was detached, until persistent bookkeeping lists have been
implemented.
Commit 08cc14f moved the conversion of MiB/s to B/s out of the
qemuMonitor APIs, but forgot to adjust the qemuMigrationDriveMirror
caller.
This patch will convert the migrate_speed value from MiB/s to its
mirror_speed equivalent in bytes/s.
Signed-off-by: Rudy Zhang <rudyflyzhang@gmail.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
@flags have a valid modification impact only after calling
virDomainObjUpdateModificationImpact. virDomainObjGetOneDef calls it but
doesn't update them in the caller.
Chunyan sent a nice cleanup patch for libxlDomainDetachNetDevice
https://www.redhat.com/archives/libvir-list/2016-March/msg00926.html
which I incorrectly modified before pushing as commit b5534e53. My
modification caused network devices of type hostdev to no longer
be removed. This patch changes b5534e53 to resemble Chunyan's
original, correct patch.
Chunyan sent a correct patch to fix a resource leak on error in
libxlDomainAttachNetDevice
https://www.redhat.com/archives/libvir-list/2016-March/msg00924.html
I made what was thought to be an improvement and pushed the patch as
commit e6336442. As it turns out, my change broke adding net devices
that are actually hostdevs to the list of nets in virDomainDef. This
patch changes e6336442 to resemble Chunyan's original, correct
patch.
Compilation for xdg-app failed due to a buggy SASL headers present on
the used runtime (org.gnome.Sdk 3.18).
In file included from rpc/virnetsaslcontext.h:24:0,
from rpc/virnetsaslcontext.c:25:
/usr/include/sasl/sasl.h:230:38: error: unknown type name 'size_t'
typedef void *sasl_realloc_t(void *, size_t);
^
/usr/include/sasl/sasl.h:235:5: error: unknown type name 'sasl_realloc_t'
sasl_realloc_t *,
Use the same workaround as commit 1be3dfd did.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
We use _LAST items in enums to mark the last position in given
enum. Now, if and enum is passed to switch(), compiler checks
that all the values from enum occur in 'case' enumeration.
Including _LAST. But coverity spots it's a dead code. And it
really is. So to resolve this, we tend to put a comment just
above 'case ..._LAST' notifying coverity that we know this is a
dead code but we want to have it that way.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Do I really need to explain why?
Well, if read() is interrupted int the middle of reading, we will
never read the rest (even though it's highly unlikely as we are
reading just 8 bytes).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Even though we have the machine locked throughout whole APIs we
are querying/modifying domain internal state. We should grab a
job whilst doing that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have @flags we can support changing perf events just
in active or inactive configuration regardless of the other.
Previously, calling virDomainSetPerfEvents set events in both
active and inactive configuration at once. Even though we allow
users to set perf events that are to be enabled once domain is
started up. The virDomainGetPerfEvents API was flawed too. It
returned just runtime info.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed that these APIs are missing @flags argument. Even
though we don't have a use for them, it's our policy that every
new API must have @flags.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
They recently were extracted to a separate function. They don't belong
together though. Since -numa formatting is pretty compact, move it to
the main function and rename qemuBuildNumaCommandLine to
qemuBuildMemoryDeviceCommandLine.
When starting up a VM libvirtd asks numad to place the VM in case of
automatic nodeset. The nodeset would not be passed to the memory device
formatter and the user would get an error.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1269715
Commit 7068b56c introduced several hyperv features. Not all hyperv
features are supported by old enough kernels and we shouldn't allow to
start a guest if kernel doesn't support any of the hyperv feature.
There is one exception, for backward compatibility we cannot error out
if one of the RELAXED, VAPIC or SPINLOCKS isn't supported, for the same
reason we ignore invtsc, to not break restoring saved domains with older
libvirt.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This check is there to allow restore saved domain with older libvirt
where we included invtsc by default for host-passthrough model. Don't
skip the whole function, but only the part that checks for invtsc.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Remove disabling domain death events from libxlDomainStart error
path. The domain death event is already disabled in libxlDomainCleanup.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
libxlDomainStart allocates and reserves resources that were not
being released in error paths. libxlDomainCleanup already handles
the job of releasing resources, and libxlDomainStart should call
it when encountering a failure.
Change the error handling logic to call libxlDomainCleanup on
failure. This includes acquiring the lease sooner and allowing
it to be released in libxlDomainCleanup on failure, similar to
the way other resources are reclaimed. With the lease now
released in libxlDomainCleanup, the release_dom label can be
renamed to cleanup_dom to better reflect its changed semantics.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Create a bitmap of iothreads that have scheduler info set so that the
transformation algorithm does not have to iterate the empty bitmap many
times. By reusing self-expanding bitmaps the bitmap size does not need
to be pre-calculated.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1264008
In some cases it's impractical to use the regular APIs as the bitmap
size needs to be pre-declared. These new APIs allow to use bitmaps that
self expand.
The new code adds a property to the bitmap to track the allocation of
memory so that VIR_RESIZE_N can be used.
Since commit v1.3.2-119-g1e34a8f which enabled debug-threads in QEMU
qemuGetProcessInfo would fail to parse stats for any thread with a space
in its name.
https://bugzilla.redhat.com/show_bug.cgi?id=1316803
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemu won't ever add those functions directly to QMP. They will be
replaced with 'blockdev-add' and 'blockdev-del' eventually. At this time
there's no need to keep the stubs around.
Additionally the drive_del stub in JSON contained dead code in the
attempt to report errors. (VIR_ERR_OPERATION_UNSUPPORTED was never
reported). Since the text impl does have the same message it is reported
anyways.
This patch adds new xml element, and so we can have the option of
also having perf events enabled immediately at startup.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-6-git-send-email-qiaowei.ren@intel.com
This patch implement a set of interfaces for perf event. Based on
these interfaces, we can implement internal driver API for perf,
and get the results of perf conuter you care about.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-4-git-send-email-qiaowei.ren@intel.com
API agreed on in
https://www.redhat.com/archives/libvir-list/2015-October/msg00872.html
* include/libvirt/libvirt-domain.h (virDomainGetPerfEvents,
virDomainSetPerfEvents): New declarations.
* src/libvirt_public.syms: Export new symbols.
* src/driver-hypervisor.h (virDrvDomainGetPerfEvents,
virDrvDomainSetPerfEvents): New typedefs.
* src/libvirt-domain.c: Implement virDomainGetPerfEvents and
virDomainSetPerfEvents.
Signed-off-by: Qiaowei Ren <qiaowei.ren@intel.com>
Message-id: 1459171833-26416-2-git-send-email-qiaowei.ren@intel.com
If the pool creation thread happens to detect the luns in
the scsi target, the size parameters will be calculated as
part of the refreshPool called from storagePoolCreate().
This means the virStoragePoolFCRefreshThread (commit id
'512b874') waiting to run and "refresh" the pool will
essentially double the allocation and capacity values.
A separate refresh would correct the values.
To avoid this, the FCRefreshThread needs to reinitialize
the pool size values prior to calling virStorageBackendSCSIFindLUs
which eventually calls virStorageBackendSCSINewLun and
updates the size values for each volume found.
After the recent commits the build didn't work for me. Fix it by
using size_t as the callback argument is using and the correct
formatter. The attempted fixup to use %llu as a formatter was wrong.
Commit e6336442 changed the 'out:' label to 'cleanup' in
libxlDomainAttachNetDevice(), but missed a comment referencing
the 'out:' label. Remove it from the comment since it is no
longer accurate anyhow.
This patch adds support for "vpindex", "runtime", "synic",
"stimer", and "vendor_id" features available in qemu 2.5+.
- When Hyper-V "vpindex" is on, guest can use MSR HV_X64_MSR_VP_INDEX
to get virtual processor ID.
- Hyper-V "runtime" enlightement feature allows to use MSR
HV_X64_MSR_VP_RUNTIME to get the time the virtual processor consumes
running guest code, as well as the time the hypervisor spends running
code on behalf of that guest.
- Hyper-V "synic" stands for Synthetic Interrupt Controller, which is
lapic extension controlled via MSRs.
- Hyper-V "stimer" switches on Hyper-V SynIC timers MSR's support.
Guest can setup and use fired by host events (SynIC interrupt and
appropriate timer expiration message) as guest clock events
- Hyper-V "reset" allows guest to reset VM.
- Hyper-V "vendor_id" exposes hypervisor vendor id to guest.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
1. All hyperv features are tristate ones. So make tristate generating part common.
2. Reduce nesting on spinlocks.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
1. All hyperv features are tristate ones. So make tristate parsing code common.
2. Reindent switch statement.
3. Reduce nesting in spinlocks parsing.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
After the patches that added tracking of in-use macvtap names (commit
370608, first appearing in libvirt-1.3.2), if the function to allocate
a new macvtap device came to a device name created outside libvirt, it
would retry the same device name MACVLAN_MAX_ID (8191) times before
finally giving up in failure.
The problem was that virBitmapNextClearBit was always being called
with "0" rather than the value most recently checked (which would
increment each time through the loop), so it would always return the
same id (since we dutifully release that id after failing to create a
new device using it).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1321546
Signed-off-by: Laine Stump <laine@laine.org>
For those VF allocated from a network pool, we need to set its backend
to be VIR_DOMAIN_HOSTDEV_PCI_BACKEND_XEN so that later work can be
correct.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
This reverts commit bb5f2dc91f.
The "if (vol->target.format != VIR_STORAGE_FILE_RAW)" check in the
createVol backend. This check is bogus because virStorageVolDefParseXML()
in conf/storage_conf.c sets target.format only if volOptions in
virStoragePoolTypeInfo has formatFromString set, and that's not the
case the zfs backend.
So the check always fails and breaks volume creation.
This reverts commit 6682d6219d.
The "if (vol->target.format != VIR_STORAGE_FILE_RAW)" check in the
createVol backend. This check is bogus because virStorageVolDefParseXML()
in conf/storage_conf.c sets target.format only if volOptions in
virStoragePoolTypeInfo has formatFromString set, and that's not the
case the logical backend.
So the check always fails and breaks volume creation.
When hostdev parent is network device, should call
libxlDomainDetachNetDevice to detach the device from a higher level.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
When AttachNetDevice failed, should call networkReleaseActualDevice
to release actual device, and if actual device is hostdev, should
remove the hostdev from vm->def->hostdevs.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
networkStartNetwork() and networkShutdownNetwork() were calling the
wrong type-specific function in the case of networks that were
configured for macvtap ("direct") bridge mode - they were instead
calling the functions for a tap+bridge network. Currently none of
these functions does anything (they just return 0) so it hasn't
created any problems, but that could change in the future.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1316465
An attempt to simplify the code for the VIR_NETWORK_FORWARD_BRIDGE
case of networkUpdateState in commit b61db335 (first in release
1.2.14) resulted in networks based on macvtap bridge mode being
erroneously marked as inactive any time libvirtd was restarted.
The problem is that the original code had differentiated between a
network using tap devices to connect to an existing host-bridge device
(forward mode of VIR_NETWORK_FORWARD_BRIDGE and a non-NULL
def->bridge), and one using macvtap bridge mode to connect to any
ethernet device (still forward mode VIR_NETWORK_FORWARD_BRIDGE, but
null def->bridge), but the changed code assumed that all networks with
VIR_NETWORK_FORWARD_BRIDGE were tap + host-bridge networks, so a null
def->bridge was interpreted as "inactive".
This patch restores the original code in networkUpdateState
The adminDispatchConnectListServers() function is generated by
our great perl script. However, it has a tiny flaw: if
adminConnectListServers() it calls fails, the control jumps onto
cleanup label where we try to free any list of servers built so
far. However, in the loop @i is unsigned (size_t) while @nresults
is signed (int). Currently, it does no harm because of the check
for @result being non-NULL. But if that ever changes in the
future, this bug will be hard to chase.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If a user specify network type ethernet, then create it via libvirt and run
script if it provided. After this commit user does not need to
run external script to create tap device or add root permissions to qemu
process.
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Instead of forcing the values for the unbind_from_stub, remove_slot
and reprobe properties, look up the actual device and use that when
calling virPCIDeviceReattach().
This ensures the device is restored to its original state after
reattach: for example, if it was not bound to any driver before
detach, it will not be bound forcefully during reattach.
We would be just fine looking up the information in pcidevs most
of the time; however, some corner cases would not be handled
properly, so look up the actual device instead.
After this patch, ownership of virPCIDevice instances is very easy
to keep track of: for each host PCI device, the only instance that
actually matters is the one inside one of the bookkeeping list.
Whenever some operation needs to be performed on a PCI device, the
actual device is looked up first; when this is not the case, a
comment explains the reason.
Unmanaged devices, as the name suggests, are not detached
automatically from the host by libvirt before being attached to a
guest: it's the user's responsability to detach them manually
beforehand. If that preliminary step has not been performed, the
attach operation can't complete successfully.
Instead of relying on the lower layers to error out with cryptic
messages such as
error: Failed to attach device from /tmp/hostdev.xml
error: Path '/dev/vfio/12' is not accessible: No such file or directory
prevent the situation altogether and provide the user with a more
useful error message.
Unmanaged devices are attached to guests in two steps: first,
the device is detached from the host and marked as inactive;
subsequently, it is marked as active and attached to the guest.
If the daemon is restarted between these two operations, we lose
track of the inactive device.
Steps 5 and 6 of virHostdevPreparePCIDevices() already subtly
take care of this situation, but some planned changes will make
it so that's no longer the case. Plus, explicit is always better
than implicit.
This will skip few steps from qemuProcessStart in order to create only
qemu CMD. Use a VIR_QEMU_PROCESS_START_PRETEND for all the qemuProcess*
functions called by this one to not modify or check host.
This new function will be used later on for XMLToNative API and also for
qemuxml2argvtest to make sure that both API and test uses the same code
as qemuProcessStart.
We need also update qemuProcessInit to wrap few lines of code with check
that VIR_QEMU_PROCESS_START_PRETEND that makes sense only for
qemuProcessStart.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Move all code that checks host and domain. Do not check host if we use
VIR_QEMU_PROCESS_START_PRETEND flag.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Move all code that modifies only live XML to this function. The new
VIR_QEMU_PROCESS_START_PRETEND flag will be used by qemuXMLToNative and
qemuxml2argvtest later in order to reuse the same code as
qemuProcessStart uses.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The postParse callback is the correct place to generate default values
that should be present in offline XML.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
QEMU changed the error message to:
"Tray of device 'drive-sata0-0-1' is not open"
and they may change the error massage in the future.
This updates the code to not depend on the text from the error message
but only on error itself.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
When reading in an XML definition for a SCSI target device, the name
property of struct scsi_target refers to the @target element.
Let's fix this obvious typo and also extend the XML schema to provide
validation.
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Until now, the libxl driver ignored any <hap> setting in domain XML
and deferred to libxl, which enables hap if not specified. While
this is a good default, it prevents disabling hap if desired.
This change allows disabling hap with <hap state='off'/>. hap is
explicitly enabled with <hap/> or <hap state='on/>. Absense of <hap>
retains current behavior of deferring default state to libxl.
hap is enabled by default in xm and xl config and usually only
specified when it is desirable to disable hap (hap = 0). Change
the xm,xl <-> xml converter to behave similarly. I.e. only
produce 'hap = 0' when <hap state='off'/> and vice versa.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Most hypervisors use Hardware Assisted Paging by default and don't
require specifying the feature in domain conf. But some hypervisors
support disabling HAP on a per-domain basis. To enable HAP by default
yet provide a knob to disable it, extend the <hap> feature with a
'state=on|off' attribute, similar to <pvspinlock> and <vmport> features.
In the absence of <hap>, the hypervisor default (on) is used. <hap>
without the state attribute would be the same as <hap state='on'/> for
backwards compatibility. And of course <hap state='off'/> disables hap.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The function already takes two bool arguments, switching to flags makes
it a lot easier to read. Especially in case we need to add another
boolean in the future.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
In post-copy mode none of the hosts has a complete guest state and
rolling back migration is impossible. Thus aborting it would be
equivalent to destroying the domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When migration fails in the post-copy mode, it's impossible to just kill
the destination domain and resume the source since the source no longer
contains current guest state. Let's mark domains on both sides as
VIR_DOMAIN_PAUSED_POSTCOPY_FAILED to let the upper layer decide what to
do with them.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When destination libvirtd is restarted during migration in Finish phase
just after the point we started guest CPUs, we should not kill the
domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Migration enters "postcopy-active" state after QEMU switches to
post-copy and pauses guest CPUs. From libvirt's point of view this state
is similar to "completed" because we need to transfer guest execution to
the destination host.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
To use post-copy one has to start the migration with
VIR_MIGRATE_POSTCOPY flag and, while migration is in progress, call
virDomainMigrateStartPostCopy() to switch from pre-copy to post-copy.
Signed-off-by: Cristian Klein <cristiklein@gmail.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY and VIR_DOMAIN_PAUSED_POSTCOPY are
used on the source host once migration enters post-copy mode (which
means the domain gets paused on the source. After the destination host
takes over the execution of the domain, its virtual CPUs are resumed and
the domain enters VIR_DOMAIN_RUNNING_POSTCOPY state and
VIR_DOMAIN_EVENT_RESUMED_POSTCOPY event is emitted.
In case migration fails during post-copy mode and none of the hosts have
complete state of the domain, both domains will remain paused with
VIR_DOMAIN_PAUSED_POSTCOPY_FAILED reason and an upper layer may decide
what to do.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This allows setting the address in host and/or network order and makes
the naming consistent. Now you don't need to call [hn]to[nh]l()
functions as that is taken care of by these functions. Also, now
the *NetOrder take the address in network order, the other functions in
host order so the naming and usage is consistent. Some places were
having the address in network order and calling ntohl() just so the
original function can call htonl() again. This makes it nicer to read.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If a <graphics type='spice'> has no port nor tlsPort set, the generated
QEMU command line will contain -spice port=0.
This is later going to be ignored by spice-server, but it's better not
to add it at all in this situation.
As an empty -spice is not allowed, we still need to append port=0 if we
did not add any other argument.
The end goal is to avoid adding -spice port=0,addr=127.0.0.1 to QEMU command
line when no SPICE port is specified in libvirt XML.
Currently, the code relies on port=xx to always be present, so subsequent
args can be unconditionally appended with a leading ','. Since port=0
will no longer be added in a subsequent commit, we append a ',' to every
arg instead of prepending, and remove the last one before adding it to
the arg list.
It's just a combination of AddImplicitControllers, and AddConsoleCompat.
Every caller that wants ImplicitControllers also wants the ConsoleCompat
AFAICT, so lump them together. We also need it for future patches.
Judging by how the whitelist has skewed quite far from the original
error message, I think it's better to just drop these.
If someone wants to revive this check I suggest implementing it on
a per-HV driver basis with PostParse callbacks.
If we expose this information, which is one byte in every PCI config
file, we let all mgmt apps know whether the device itself is an endpoint
or not so it's easier for them to decide whether such device can be
passed through into a VM (endpoint) or not (*-bridge).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1317531
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Best viewed with '-w' as this is just an adjustment for future patch to
be readable without that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The implementation is pretty straightforward. Moreover, because
of the nature of things, gethostbyname_r and gethostbyname2_r can
be implemented at the same time too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is a missing counterpart for virSocketAddrSetIPv4Addr()
and is going to be needed later in the tests.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function is going to be used later in such context where the
argument makes no sense. Teach this function to cope with that
instead of the caller having to deal with passing some dummy
argument.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
These functions are going to be reused very shortly. So instead
of duplicating the code, lets move them into utils module.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id '4f846170' added printing of a new field 'part_separator';
however, neglected to do so when there was an "freeExtent" defined
for the device (as there would be when the disk pool was started).
This patch adjusts the logic to appropriately format the device path and
if there the part_separator attribute.
Commit 'ef2ab8fd' moved just the virDomainConfNWFilterTeardown and left
the logic to save/restore the current error essentially doing nothing
in the error path for qemuBuildCommandLine. So move it to where it
was meant to be.
Although the original code would reset the filter on command creation
errors after building the network command portion and commit 'ef2ab8fd'
altered that logic, the teardown is called during qemuProcessStop from
virDomainConfVMNWFilterTeardown and that code has the save/restore
last error logic, so just allow that code to handle the teardown rather
than running it twice. The qemuProcessStop would be called in the failure
path of qemuBuildCommandLine.
We include the file in plenty of places. This is mostly due to
historical reasons. The only place that needs something from the
header file is storage_backend_fs which opens _PATH_MOUNTED. But
it gets the file included indirectly via mntent.h. At no other
place in our code we need _PATH_.*. Drop the include and
configure check then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Missing modules is a common expected scenario for most libvirt usage on
RPM distributions like Fedora, so it doesn't really warrant logging at
WARN level. Use INFO instead
https://bugzilla.redhat.com/show_bug.cgi?id=1274849
It does not have a suffix ByName because there are no other means of
looking up the server and since the name is known, this should be the
preferred one.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Mostly it is just passing new parameter here and there. In case
of zero value we fallback to auto selecting port and thus keep
backward compatibility.
Also we need to fix places of auto selected port managment.
We should bother only when auto selected was done that is
when externally specified port is not 0.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1271183
We only wait 0.5 seconds for the session daemon to start up and present
its socket, which isn't sufficient for many users. Bump up the sleep
interval and retry amount so we wait for a total of 5.0 seconds.
Refactor series 0b231195 worked with virLogDestination type which, depending
on the compiler, might be (and probably will be) an unsigned data type.
However, virEnumFromString may return -1 in case of an error. So, when enum
happens to be unsigned, some compilers will naturally complain about foo:
'if (foo < 0)'
Each version of virtuozzo supports only one type of SCSI controller
So if we add disk on SCSI bus, we should set SCSI controller model.
We can take it from vzCapabilities structure.
Because Vz6 supports SCSI(BUSLOGIC), IDE and SATA controllers only and
Vz7 supports SCSI(VIRTIO_SCSI) and IDE only we add list of supported
controllers and scsi models to vzCapabilities structure.
When a new connection opens, we select proper capabilities values according
to Virtuozzo version and check them in XMLPostParse.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
We should report correct disk format depending on vz version and domain type.
Since we support only one disk format for each domain type, we can take it
from vzCapabilities structure.
As long as we have another function checking disk parameters correctness,
let's have them in one place. Here we change prefix of the moved function and
start to call it from vzCheckUnsupportedDisks rather than add disk.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
As far as Virtuozzo6 and Virtuozzo7 support different disk types for virtual
machines (ploop and qcow2 respectively) and different buses (vz6: IDE, SCSI,
SATA; vz7: IDE SCSI) we add vzCapabilities structure to help undestand which
disk formats and buses are supported in the context of a current connection.
When a new connection opens, we select proper capabilities in accordance to
current Virtuozzo version.
The problem with the original virLogParseOutputs method was that the way it
parsed the input, walking the string char by char and using absolute jumps
depending on the virLogDestination type, was rather complicated to read.
This patch utilizes virStringSplit method twice, first time to filter out any
spaces and split the input to individual log outputs and then for each
individual output to tokenize it by to the parts according to our
PRIORITY:DESTINATION?(:DATA) format. Also, to STREQLEN for matching destination
was replaced with virDestinationTypeFromString call.
In order to refactor the ugly virLogParseOutputs method, this is a neat way of
finding out whether the destination type (in the form of a string) user
provided is a valid one. As a bonus, if it turns out it is valid, we get the
actual enum which will later be passed to any of virLogAddOutput methods right
away.
Just a cleanup I stumbled upon in one of my older branches I did when
browsing through some code and forgot to send it.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In qemuConnectDomainXMLToNative() we set up the monitor, but we never
memset() it to zeros. Thanks to the introduction of the logfile
parameter of chardevs (and the logfile member of the struct), we started
checking whether that's non-NULL and that exposed this old error.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reverting to a snapshot may change domain configuration. New
configuration should be saved if domain has persistent flag.
VIR_DOMAIN_EVENT_DEFINED_FROM_SNAPSHOT is emitted in case of
configuration update.
Add new function to manage adding the panic device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the NVRAM device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the RNG device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also modify the qemuBuildRNGDevStr to use const virDomainDef instead
of virDomainDefPtr.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the memballoon device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also modify the qemuBuildMemballoonDevStr to use const virDomainDef
instead of virDomainDefPtr.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the host device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also modify qemuBuildPCIHostdevDevStr, qemuBuildUSBHostdevDevStr,
and qemuBuildSCSIHostdevDevStr to use const virDomainDef instead
of virDomainDefPtr.
Make qemuBuildPCIHostdevPCIDevStr and qemuBuildUSBHostdevUSBDevStr
static to the qemu_command.c.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the redirdev device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also move the qemuBuildRedirdevDevStr closer to the new function and
modify to use the const virDomainDef instead of virDomainDefPtr
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the watchdog device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also since qemuBuildWatchdogDevStr was only local here, make it static as
well as modifying the const virDomainDef.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the sound device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also since qemuBuildSoundDevStr was only local here, make it static as
well as modifying the const virDomainDef.
Signed-off-by: John Ferlan <jferlan@redhat.com>
These comments explain the difference between a virPCIDevice
instance used for lookups and an actual device instance; some
information is also provided for specific uses.
virHostdevGetPCIHostDeviceList() is similar but does not filter out
devices that are not in the active list; that said, we are looking
up the device in the active list just a few lines after anyway, so
we might as well just keep a single function around.
This also helps stress the fact the objects contained in pcidevs are
only for looking up the actual devices, which is something later
commits will make even more explicit.
We're in the hostdev module, so mgr is not an ambiguous name, and
in fact it's already used in some cases. Switch all the code over.
Take the chance to shorten declaration of
virHostdevIsPCINodeDeviceUsedData structures.
When we want to look up a device in a device list and we already
have the IDs from another source, we can simply use
virPCIDeviceListFindByIDs() instead of creating a temporary device
object.
If 'last_processed_hostdev_vf != -1' is false then, since the
loop counter 'i' starts at 0, 'i <= last_processed_hostdev_vf'
can't possibly be true and the loop body will never be executed.
However, since 'i' is unsigned and 'last_processed_hostdev_vf'
is signed, we can't just get rid of the check completely; what
we can do is move it outside of the loop to avoid checking its
value on every iteration and cluttering the actual loop
condition.
Since commit 9c14a9ab we have broken active domain listing
because reworked prlsdkLoadDomain doesn't set dom->def->id
propely. It just looses it when a new def structure is set.
Now we make prlsdkConvertDomainState function return void
and move calling it after an old dom->def is replaces with
a new one within prlsdkLoadDomain function.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
This function can be called over a domain definition that has no
video configured. The
tests/qemuxml2argvdata/qemuxml2argv-minimal.xml file could serve
as an example. Problem is, before the check that domain has some
or none video configured, def->videos is dereferenced causing a
segmentation fault in case there's none video configured.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add new function to manage adding the video device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the input device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Make qemuBuildUSBInputDevStr static since only this module calls it.
Also the change to use const virDomainDef forces other changes.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Modify the argument order and types to match other similar helpers.
Also modify called functions to use the def->emulator instead of passing
def->emulator and def.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the console device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the channel device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the parallels device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Alter logic slight to reduce indention level.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the serial device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Using const virDomainDef causes collateral damage in other called APIs
which need to make the similar adjustment
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the smartcard device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Alter the logic slightly to make !nsmartcards check first so that remainder
of the code is less indented.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Let's call it modern_ret_as_list as opposed to single_ret_as_list. The
latter was able to return list of things. However the new, more modern,
version came and it is used since listAllDomains till nowadays in
ListServers.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We were using parentheses for grouping admin|remote even though we didn't
need to capture what's in it. That caused some changes to be greater
than needed and, to be honest, some confusion as well. Let's use it as
it should be used. It'll also make future changes more consistent.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
For now it does not matter which ones we return as the code is similarly
complex, however it will fit in with other constructs in the future,
mainly when we will be able to generate dispatch helpers.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
virHashForEach() returns 0 if everything went nice, so our session
daemon was timing out even when there was a client connected.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1315606
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This serves the same purpose as VIR_ERR_NO_xxx where xxx is any object
that API can be called upon. Only this particular one is used for
daemon's servers.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since servers know their name, there is no need to supply such
information twice. Also defeats inconsistencies.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
At first I did not want to do this, but after trying to implement some
newer feaures in the admin API I realized we need that to make our lives
easier. On the other hand they are not saved redundantly and the
virNetServer objects are still kept in a hash table.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Function virAdmConnectListServers() forgot to check for flags at all,
virAdmConnectOpen() on the other hand checked them but did no dispatch
the error. virCheckFlags() should be used only when there should be no
other thing done after erroring out and since they are used on different
places then just public API, they cannot dispatch errors. So let's use
virCheckFlagsGoto instead.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Add new function to manage adding the network device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the -fsdev options to the
command line removing that task from the mainline qemuBuildCommandLine.
Alter the code slightly to perform the !caps and fsdev failure check
up front.
Since both qemuBuildFSStr and qemuBuildFSDevStr are local, make them
static and fix their prototypes to use the const virDomainDef as well.
Make some minor formatting changes for long lines.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the disk -drive options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also since using const virDomainDef in new function, that means other
functions called needed to change their usage.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the hub -device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also make qemuBuildHubDevStr static to the module since it's only
used here.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the controller -device options to the
command line removing that task from the mainline qemuBuildCommandLine.
Also adjust to using const virDomainDef instead of virDomainDefPtr.
This causes collateral damage in order to modify called APIs to use
the const virDomainDef instead as well.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the -global controller options to
the command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the -boot options to the command
line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the power management options to the
command line removing that task from the mainline qemuBuildCommandLine.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-clock' options to the command
line removing that task from the mainline qemuBuildCommandLine.
Also includes some minor formatting cleanups.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When debug-threads is enabled, individual threads are given a separate
name (on Linux)
Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1140121
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
QEMU (somewhere around 2.0) added a new sub-option to the -name flag
-name debug-threads=on.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently the file based character devices let QEMU write
directly to a file on disk. This allows a malicious QEMU
to inflict a denial of service by consuming all free space.
Switch QEMU to use a pipe to virtlogd, which will enforce
file rollover.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If use of virtlogd is enabled, then use it for backing the
character device log files too. This avoids the possibility
of a guest denial of service by writing too much data to
the log file.
The virtlogd daemon currently opens all files for append, but
in some cases the user may wish to discard existing data. Define
a new flag to indicate that logfiles should be truncated when
opening.
The functions for handling FD passing when building command line
arguments need to be used by many different bits of code, so need
to be at the start of the source file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The act of formatting a chardev backend value may need to
append command line arguments for passing FDs. If we append
the -chardev arg before formatting the value, then the
resulting arguments will end up interspersed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Honour the <log file='...'/> element in chardevs to output
data to a file. This requires QEMU >= 2.6
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Extend the chardev source XML so that there is a new optional
<log/> element, which is applicable to all character device
backend types. For example, to log output of a TCP backed
serial port
<serial type='tcp'>
<source mode='connect' host='127.0.0.1' service='9999'/>
<protocol type='raw'/>
<log file='/var/log/libvirt/qemu/demo-serial0.log' append='on'/>
<target port='0'/>
</serial>
Not all hypervisors will support use of logfiles.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Not all callers of virLogManagerDomainOpenLogFile will
care about getting the current inode/offset, so we should
allow those parameters to be NULL
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Now that the function was extracted we can get rid of some temp
variables. Additionally formatting of the bitmap string for the event
code should be checked.
Allow pinning for inactive vcpus. The pinning mask will be automatically
applied as we would apply the default mask in case of a cpu hotplug.
Setting the scheduler settings for a vcpu has the same semantics.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1306556
Introduce VIR_DOMAIN_DEF_FEATURE_OFFLINE_VCPUPIN domain feature flag
whcih will allow to skip ignoring of the pinning information for
hypervisor drivers which will want to implement forward-pinning of
vcpus.
Introduce a helper to check supported device and domain config and move
the memory hotplug checks to it.
The advantage of this approach is that by default all new features are
considered unsupported by all hypervisors unless specifically changed
rather than the previous approach where every hypervisor would need to
declare that a given feature is unsupported.
To avoid having to forbid new features added to domain XML in post parse
callbacks for individual hypervisor drivers the feature flag mechanism
will allow to add a central check that will be disabled for the drivers
that will add support.
As a first example flag, the 'hasWideSCSIBus' is converted to the new
bitmap.
The VIR_DOMAIN_EVENT_ID_JOB_COMPLETED event will be triggered once a job
(such as migration) finishes and it will contain statistics for the job
as one would get by calling virDomainGetJobStats. Thanks to this event
it is now possible to get statistics of a completed migration of a
transient domain on the source host.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We would happily report and free statistics of a completed migration
even before it actually completed (on the source host while migration is
in the Finish phase).
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Computing a total downtime during a migration requires us to store a
time stamp when guest CPUs get stopped. The value (and all other
statistics) is then transferred to the destination to compute the
downtime. Because the stopped time stamp is stored by a STOP event
handler while the statistics which will be sent over to the destination
are copied synchronously within qemuMigrationWaitForCompletion.
Depending on the timing of STOP and MIGRATION events, we may end up
copying (and transferring) statistics without the stopped time stamp
set. Let's make sure we always use the correct time stamp.
https://bugzilla.redhat.com/show_bug.cgi?id=1282744
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
With a very old QEMU which doesn't support events we need to explicitly
call qemuMigrationSetOffline at the end of migration to update our
internal state. On the other hand, if we talk to QEMU using QMP, we
should just wait for the STOP event and let the event handler update the
state and trigger a libvirt event.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We should not overwrite all migration statistics on the source with the
numbers sent by the destination since the source may have an updated
view in some cases (such as post-copy migration). It's safer to update
just the timing info we need to get from the destination and be prepared
for the future. And we should only do all this after a successful
migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Statistics for a completed migration only make sense if the migration
was successful. Let's not store them in priv->job.completed until we
are sure it was a success.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The comment claimed that virPCIDeviceReattach() does not reattach
a device to the host driver; except it actually does, so the
comment is just confusing and we're better off removing it.
Replace the term "loop" with the more generic "step". This allows us
to be more flexible and eg. have a step that consists in a single
function call.
Don't include the number of steps in the first comment of the
function, so that we can add or remove steps without having to worry
about keeping that comment in sync.
For the same reason, remove the summary contained in that comment.
Clean up some weird vertical spacing while we're at it.
If the stars are in the right position and you're building with
VBox >= 4.2.0 it will happen that compiler thinks an array
allocated on the stack may be unbounded:
In file included from vbox/vbox_V4_2.c:13:0:
vbox/vbox_tmpl.c: In function '_virtualboxCreateMachine':
vbox/vbox_tmpl.c:2811:1: error: stack usage might be unbounded [-Werror=stack-usage=]
_virtualboxCreateMachine(vboxGlobalData *data, virDomainDefPtr def, IMachine **machine, char *uuidstr ATTRIBUTE_UNUSED)
^
Well, given how the variable is declared, I had some hard time
seeing it is actually bounded. Surprisingly compiler does not
complain because of -Wframe-larger-than. This is because
variable length arrays do not count into that warning.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The code does not handle renaming of the save state file. In addition to
that the resuming code would need to be tweaked to handle the name
change since the XML is extracted from the save image. The easies option
is to make the rename API even less useful by forbiding this.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1314594
This is an error message I've just seen. Fix it by initializing
@inode.
CC lxc/libvirt_driver_lxc_impl_la-lxc_process.lo
lxc/lxc_process.c: In function 'virLXCProcessMonitorInitNotify':
lxc/lxc_process.c:767:23: error: 'inode' may be used uninitialized in this function [-Werror=maybe-uninitialized]
virDomainAuditInit(vm, initpid, inode);
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Original current flag expansion does not filter out non
_CONFIG and _LIVE flags explicitly but they are prohibited
earlier by virCheckFlags.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Flag expansion is the same as in virDomainObjUpdateModificationImpact
which virDomainLiveConfigHelperMethod calls internally. The difference
is merely in implementation. Note that VIR_DOMAIN_MEM_CONFIG is the
same as VIR_DOMAIN_AFFECT_CONFIG. Additionally, the called functions
will properly use flag OR and thus handle the VIR_DOMAIN_MEM_MAXIMUM case.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
While trying to build with -Os couple of compile errors showed
up.
conf/domain_conf.c: In function 'virDomainChrRemove':
conf/domain_conf.c:13666:24: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
virDomainChrDefPtr ret, **arrPtr = NULL;
^
Compiler fails to see that @ret is used only if set in the loop,
but whatever, there's no harm in initializing the variable.
In vboxAttachDrivesNew and _vboxAttachDrivesOld compiler thinks
that @rc may be used uninitialized. Well, not directly, but maybe
after some optimization. Yet again, no harm in initializing a
variable.
In file included from ./util/virthread.h:26:0,
from ./datatypes.h:28,
from vbox/vbox_tmpl.c:43,
from vbox/vbox_V3_1.c:37:
vbox/vbox_tmpl.c: In function '_vboxAttachDrivesOld':
./util/virerror.h:181:5: error: 'rc' may be used uninitialized in this function [-Werror=maybe-uninitialized]
virReportErrorHelper(VIR_FROM_THIS, code, __FILE__, \
^
In file included from vbox/vbox_V3_1.c:37:0:
vbox/vbox_tmpl.c:1041:14: note: 'rc' was declared here
nsresult rc;
^
Yet again, one uninitialized variable:
qemu/qemu_driver.c: In function 'qemuDomainBlockCommit':
qemu/qemu_driver.c:17194:9: error: 'baseSource' may be used uninitialized in this function [-Werror=maybe-uninitialized]
qemuDomainPrepareDiskChainElement(driver, vm, baseSource,
^
And another one:
storage/storage_backend_logical.c: In function 'virStorageBackendLogicalMatchPoolSource.isra.2':
storage/storage_backend_logical.c:618:33: error: 'thisSource' may be used uninitialized in this function [-Werror=maybe-uninitialized]
thisSource->devices[j].path))
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While trying to build with -Os I've encountered some build
failures.
util/vircommand.c: In function 'virCommandAddEnvFormat':
util/vircommand.c:1257:1: error: inlining failed in call to 'virCommandAddEnv': call is unlikely and code size would grow [-Werror=inline]
virCommandAddEnv(virCommandPtr cmd, char *env)
^
util/vircommand.c:1308:5: error: called from here [-Werror=inline]
virCommandAddEnv(cmd, env);
^
This function is big enough for the compiler to be not inlined.
This is the error message I'm seeing:
Then virDomainNumatuneNodeSpecified is exported and called from
other places. It shouldn't be inlined then.
In file included from network/bridge_driver_platform.h:30:0,
from network/bridge_driver_platform.c:26:
network/bridge_driver_linux.c: In function 'networkRemoveRoutingFirewallRules':
./conf/network_conf.h:350:1: error: inlining failed in call to 'virNetworkDefForwardIf.constprop': call is unlikely and code size would grow [-Werror=inline]
virNetworkDefForwardIf(const virNetworkDef *def, size_t n)
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
More fallout from changing to using virPolkitAgent and handling error
paths. Needed to clear the 'cmd' once stored and of course add the
virCommandFree(cmd) in the error: label.
Older compilers fail to see that 'close' is not used a function
rather than a variable and produce the following error:
cc1: warnings being treated as errors
../../src/datatypes.c: In function 'virConnectCloseCallbackDataReset':
../../src/datatypes.c:149: error: declaration of 'close' shadows a global declaration [-Wshadow]
Replace all the 'close' occurrences with 'closeData' to resolve
this.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In virPolkitAgentCreate neglected to initialize agent to NULL. If
there was an error in the pipe, then we jump to error and would have
an issue. Found by coverity.
libxlDomainPinVcpuFlags calls virDomainLiveConfigHelperMethod which will
call virDomainObjUpdateModificationImpact make the same AFFECT_LIVE flags
and !active check, so remove this duplicated check.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Prior to commit id '3d021381' virDomainObjUpdateModificationImpact was
part of virDomainLiveConfigHelperMethod and the *flags if condition
VIR_DOMAIN_AFFECT_CONFIG checked the ->persistent boolean and made the
virDomainObjGetPersistentDef call.
Since the functions were split the ->persistent check is all that remained
and thus could be combined into one if statement.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
When SPICE graphics is configured for a domain but we did not ask the
client to switch to the destination, we should not wait for
SPICE_MIGRATE_COMPLETED event (which will never come).
https://bugzilla.redhat.com/show_bug.cgi?id=1151723
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Migration statistics are not available on the destination host and
starting a query job during incoming migration is not allowed. Trying to
do that would result in
Timed out during operation: cannot acquire state change lock (held
by remoteDispatchDomainMigratePrepare3Params)
error. We should not even try to start the job.
https://bugzilla.redhat.com/show_bug.cgi?id=1278727
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This parameter represents top level period cgroup
that limits whole domain enforcement period for a quota
Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com>
If connect close is fired then following unregister will fail
as we set callback to NULL and thus callback equality checking
will fail.
Callback is set to NULL to make it fired only one time probabaly.
Instead lets use connection equality to NULL to check if callback
is already fired.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
We have reference to connection object in virConnectCloseCallbackData
object thus we have to refcount it. Obviously we have problems
in dispose and call functions. Let's fix it.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Make register and unregister functions return void because
we can check the state of callback object beforehand via
virConnectCloseCallbackDataGetCallback. This can be done
without race conditions if we use higher level locks for registering
and unregistering. The fact they return void simplifies
task of consistent registering/unregistering.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
qemuProcessSetupEmulator runs at a point in time where there is only
the qemu main thread. Use virCgroupAddTask to put just that one task
into the emulator cgroup. That patch makes virCgroupMoveTask and
virCgroupAddTaskStrController obsolete.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Move qemuProcessSetupEmulator up under qemuSetupCgroup. That way
we move the one main thread right into the emulator cgroup, instead
of moving multiple threads later on. And we do not actually want any
threads running in the parent cgroups (cpu cpuacct cpuset).
Signed-off-by: Henning Schild <henning.schild@siemens.com>
This attribute is used to extend secondary PCI bar and expose it to the
guest as 64bit memory. It works like this: attribute vram is there to
set size of secondary PCI bar and guest sees it as 32bit memory,
attribute vram64 can extend this secondary PCI bar. If both attributes
are used, guest sees two memory bars, both address the same memory, with
the difference that the 32bit bar can address only the first part of the
whole memory.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1260749
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
We always place primary video device at first place, to make it easier
to create a qemu command or format an xml, but we should also set the
primary boolean for primary video device to 'true'.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Introduce virPolkitAgentCreate and virPolkitAgentDestroy
virPolkitAgentCreate will run the polkit pkttyagent image as an asynchronous
command in order to handle the local agent authentication via stdin/stdout.
The code makes use of the pkttyagent --notify-fd mechanism to let it know
when the agent is successfully registered.
virPolkitAgentDestroy will close the command effectively reaping our
child process
When there isn't a ssh -X type session running and a user has not
been added to the libvirt group, attempts to run 'virsh -c qemu:///system'
commands from an otherwise unprivileged user will fail with rather
generic or opaque error message:
"error: authentication failed: no agent is available to authenticate"
This patch will adjust the error code and message to help reflect the
situation that the problem is the requested mechanism is UNAVAILABLE and
a slightly more descriptive error. The result on a failure then becomes:
"error: authentication unavailable: no polkit agent available to
authenticate action 'org.libvirt.unix.manage'"
A bit more history on this - at one time a failure generated the
following type message when running the 'pkcheck' as a subprocess:
"error: authentication failed: polkit\56retains_authorization_after_challenge=1
Authorization requires authentication but no agent is available."
but, a patch was generated to adjust the error message to help provide
more details about what failed. This was pushed as commit id '96a108c99'.
That patch prepended a "polkit: " to the output. It really didn't solve
the problem, but gave a hint.
After some time it was deemed using DBus API calls directly was a
better way to go (since pkcheck calls them anyway). So, commit id
'1b854c76' (more or less) copied the code from remoteDispatchAuthPolkit
and adjusted it. Then commit id 'c7542573' adjusted the remote.c
code to call the new API (virPolkitCheckAuth). Finally, commit id
'308c0c5a' altered the code to call DBus APIs directly. In doing
so, it reverted the failing error message to the generic message
that would have been received from DBus anyway.
This new API will allocate the secret, assign the def pointer, and
insert the secret onto the passed list. Whether that's the temporary
list in loadSecrets which gets loaded into the driver list or driver
list during secretDefineXML.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a temporary helper to search for a specific secret by address
on the list and remove it if it's found. The following patch will
introduce a common allocation and listInsert helper. That means
error paths of the routines calling would need a way to remove the
secret off the list.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This patch removes need for secretBase64Path and secretComputePath. Similar
to the configFile, create an entry for base64File, which will be generated
as the driver->configDir, the UUID value, plus the ".base" suffix. Rather
than generating on the fly, store this in the virSecretObj.
The buildup of the pathname done in loadSecrets where the failure to build
is ignored which is no different than the failure to generate the name
in secretLoadValue which would have been ignored in the failure path
after secretLoad.
This also removes the need for secretComputPath and secretBase64Path.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This patch removes the need for secretXMLPath. Instead save 'path' during
loadSecret as 'configFile'. The secretXMLPath is nothing more than an
open coded virFileBuildPath. All that code did was concantenate the
driver->configDir, the UUID of the secret, and the ".xml" suffix to form
the configFile name which we now will generate and save instead.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The 'secretLoad' was essentially open coding virFileBuildPath.
Adjust the logic to have the caller build the path and pass it. The net
sum of ignoring the virFileBuildPath failure is the same as before where
the failure to virAsprintf the path would have been ignored anyway in
the secretLoad error path.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Remove the need for the local 'secret' in secretConnectListAllSecrets.
A subsequent patch will rename the ObjPtr entry to secret.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than having it interspersed with other changes, do it once.
Remove a couple ^L, 1 argument per line for functions, less than 80 chars
per line, use of spacing between logical groups of code, use of one line
if statements when doing fetch followed by comparison, use direct return
when no cleanup to be done.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Use virCgroupAddTaskController in virCgroupAddTask so we have one
single point where we add tasks to cgroups.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
We honour the placement bitmaps when starting up, so there's no point in
having this check. Additionally the check was buggy since it checked
vm->def all the time even if the user requested to modify the persistent
definition which had different configuration.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1308317
Add Spice graphics gl attribute. qemu 2.6 should have -spice gl=on argument to
enable opengl rendering context (patches on the ML). This is necessary to
actually enable virgl rendering.
Add a qemuxml2argv test for virtio-gpu + spice with virgl.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Per-domain directories were introduced in order to be able to
completely separate security labels for each domain (commit
f1f68ca334). However when the domain
name is long (let's say a ridiculous 110 characters), we cannot
connect to the monitor socket because on length of UNIX socket address
is limited. In order to get around this, let's shorten it in similar
fashion and in order to avoid conflicts, throw in an ID there as well.
Also save that into the status XML and load the old status XMLs
properly (to clean up after older domains). That way we can change it
in the future.
The shortening can be seen in qemuxml2argv tests, for example in the
hugepages-pages2 case.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Found by inspection - after calling virStoragePoolObjAssignDef the
pool is part of the driver->pools.objs list and the failure path
for the virStoragePoolObjSaveDef will use virStoragePoolObjRemove
to remove the pool from the objs list which will unlock and free
the pool pointer (as pools->objs[i] during the loop). Since the call
doesn't clear the pool address from the callee, we need to set it
to NULL; otherwise, the virStoragePoolObjUnlock in the cleanup: code
will fail miserably.
While reviewing how storage driver used ObjListPtr's for reference
in some recent secret driver patches to use the same mechanism, I came
across an instance where the wrong API was called for error paths after
successfully allocating the storage pool pointer and inserting into
the driver pool list.
The path is after virStoragePoolObjAssignDef succeeds - the 'def' passed
in is assigned to pool->def (or newDef) so it shouldn't be the only thing
deleted. The pool is now part of driver->pools.objs, so it would need to
be removed (as happens in the storagePoolCreateXML error paths).
Rather than calling virStoragePoolDefFree to free the def which is now
assigned to the pool, call virStoragePoolObjRemove to ensure the pool
element is removed from the driver list and that anything stored in pool
is properly handled by virStoragePoolObjFree including the call to
virStoragePoolDefFree for the pool->{def|newDef} element.
Commit f1a89a8 allowed parsing configs from /etc/libvirt
without validating the emulator capabilities.
Check for the presence of a machine type in the qemu driver's
post parse function instead of crashing.
https://bugzilla.redhat.com/show_bug.cgi?id=1267256
libxlMakeNic opens a virConnect object and takes a reference on a
virNetwork object, but doesn't drop the references on all error
paths. Rework the function to follow the standard libvirt pattern
of using a local 'ret' variable to hold the function return value,
performing all cleanup and returning 'ret' at a 'cleanup' label.
Generates a false positive for Coverity, but it turns out there's no need
to check ret == -1 since if VIR_APPEND_ELEMENT is successful, the local
vol pointer is cleared anyway.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Found by my Coverity checker - virCheckFlags call could return -1, but
not virCommandFree(destroy_cmd).
Signed-off-by: John Ferlan <jferlan@redhat.com>
When parsing the barrier:limit values, use virStringSplitCount in order
to split the pair and make the approriate checks to get the data.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The virHostdevIsVirtualFunction() was called exactly twice, and in
both cases the return value was saved to a temporary variable before
being checked. This would be okay if it improved readability, but in
this case is pretty pointless.
Get rid of the temporary variable and check the return value
directly; while at it, change the check from '<= 0' to '!= 1' to
align it with the way other similar *IsVirtualFunction() functions
are used thorough the code.
virNetDevIsVirtualFunction() returns 1 if the interface is a
virtual function, 0 if it isn't and -1 on error. This means that,
despite the name suggesting otherwise, using it as a predicate is
not correct.
Fix two callers that were doing so adding an explicit check on
the return value.
Introduce support for domainInterfaceStats API call for querying
network interface statistics. Consequently it also enables the use of
`virsh domifstat <dom> <interface name>` command plus seeing the
interfaces names instead of "-" when doing `virsh domiflist <dom>`.
After successful guest creation we fill the network interfaces names
based on domain, device id and append suffix if it's emulated in the
following form: vif<domid>.<devid>[-emu]. We extract the network
interfaces info from the libxl_domain_config object in
libxlDomainCreateIfaceNames() to generate ifname. On domain cleanup we
also clear ifname, in case it was set by libvirt (i.e. being prefixed
with "vif"). We also skip these two steps in case the name of the
interface was manually inserted by the administrator. Since the
introduction of netprefix (commit a040ba9), ifnames with a registered
prefix will be freed on virDomain{Obj,Def}Format*, thus eliminating
the migration issues observed with the reverted commit d2e5538 whereas
source and destination would have the same ifname.
For getting the interface statistics we resort to virNetInterfaceStats
and let libvirt handle the platform specific nits. Note that the
latter is not yet supported in FreeBSD.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
%zu is not always synonymous with uint64_t; on 32-bit machines,
size_t is only 32 bits. Prefer "%lld"/'unsigned long long' when
the variable is under our control, and "%"PRIu64 when we are
stuck with 'uint64_t' from RBD.
Fixes errors such as:
../../src/storage/storage_backend_rbd.c: In function 'virStorageBackendRBDVolWipe':
../../src/storage/storage_backend_rbd.c:1281:15: error: format '%zu' expects argument of type 'size_t', but argument 8 has type 'uint64_t {aka long long unsigned int}' [-Werror=format=]
VIR_DEBUG("Need to wipe %zu bytes from RBD image %s/%s",
^
../../src/util/virlog.h:90:73: note: in definition of macro 'VIR_DEBUG_INT'
virLogMessage(src, VIR_LOG_DEBUG, filename, linenr, funcname, NULL, __VA_ARGS__)
^
../../src/storage/storage_backend_rbd.c:1281:5: note: in expansion of macro 'VIR_DEBUG'
VIR_DEBUG("Need to wipe %zu bytes from RBD image %s/%s",
^
Signed-off-by: Eric Blake <eblake@redhat.com>
There's this check when building command line that whenever
domain has no graphics card configured we put -nographics onto
qemu command line. The check is 'if (!def->graphics)'. This
makes coverity think that def->graphics can be NULL, which is
true. But later in the code every access to def->graphics is
guarded by check for def->ngraphics, so no crash occurs. But this
is something that coverity fails to deduct.
In order to shut coverity up lets change the condition to
'if (!def->ngraphics)'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
After 6604a3dd9f in which new helper function has been
introduced, the code calls virStringReplace and dereference the
result immediately. The string function can, however, return NULL
so this would SIGSEGV right away. Check for the return value of
the string function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
After 457ff97fa there are two defects in our code. In both of
them we use a signed variable to hold up a number of snapshots
that domain has. We use a helper function to count the number.
However, the helper function may fail in which case it returns
a negative one and control jumps to cleanup label where an
unsigned variable is used to iterate over array of snapshots. The
loop condition thus compare signed and unsigned variables which
in this specific case ends up badly for us.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
xl/libxl already supports qemu's network-based block backends
such as nbd and rbd. libvirt has supported configuring such
<disk>s for long time too. This patch adds support for rbd
disks in the libxl driver by generating a rbd device URL from
the virDomainDiskDef object. The URL is passed to libxl via the
pdev_path field of libxl_device_disk struct. libxl then passes
the URL to qemu for cosumption by the rbd backend.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The target= setting in xl disk configuration can be used to encode
meta info that is meaningful to a backend. Leverage this fact to
support qdisk network disk types such as rbd. E.g. <disk> config
such as
<disk type='network' device='disk'>
<driver name='qemu' type='raw'/>
<source protocol='rbd' name='pool/image'>
<host name='mon1.example.org' port='6321'/>
<host name='mon2.example.org' port='6322'/>
<host name='mon3.example.org' port='6322'/>
</source>
<target dev='hdb' bus='ide'/>
<address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>
can be converted to the following xl config (and vice versa)
disk = [ "format=raw,vdev=hdb,access=rw,backendtype=qdisk,
target=rbd:pool/image:auth_supported=none:mon_host=mon1.example.org\\:6321\\;mon2.example.org\\:6322\\;mon3.example.org\\:6322"
]
Note that in xl disk config, a literal backslash in target= must
be escaped with a backslash. Conversion of <auth> config is not
handled in this patch, but can be done in a follow-up patch.
Also add a test for the conversions.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The most formal form of xl disk configuration uses key=value
syntax to define each configuration item, e.g.
format=raw, vdev=xvda, access=rw, backendtype=phy, target=disksrc
Change the xl disk formatter to produce this syntax, which allows
target= to contain meta info needed to setup a network-based
disksrc (e.g. rbd, nbd, iscsi). For details on xl disk config
format, see $xen-src/docs/misc/xl-disk-configuration.txt
Update the disk config in the tests to use the formal syntax.
But add tests to ensure disks specified with the positional
parameter syntax are correctly converted to <disk> XML.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
It may be useful in some cases to call TristateSwitch helper with TristateBool.
Document that enum values equivalency in the code.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
In case you will specify graphics like this:
<graphics type='spice' port='-1'/>
or
<graphics type='spice' port='-1' tlsPort='6000'/>
libvirt will automatically add autoport='no'. This leads to an issue
that in qemuProcessStop() we don't release that port because we are
releasing both port if autoport=yes or only port marked as reserved.
If autoport=no but we request to generate port via '-1' we need to mark
that port as reserved in order to release it.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1299696
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Checking whether x > 0 before looping over [0..x] items doesn't make
sense and multi-line body must have curly brackets around it.
Best viewed with '-w'.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This does nothing more than adding the new device and capability.
The device is present since QEMU 2.6.0.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There's a check if a domain definition has any graphics card and
if so, we iterate over each one of them. This makes no sense,
because even if it has none we can still iterate over.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
GIC v2 is the default, but checking against that specific version when
we want to know whether the default has been selected is potentially
error prone; using an alias instead makes it safer.
Similarly to VM startup always set the legacy affinity. Additionally we
don't need to report an explicit error since virProcessSetAffinity
reports them themselves.
Similarly to VM startup always set the legacy affinity. Additionally we
don't need to report an explicit error since virProcessSetAffinity
reports them themselves.
PostParse handles it for us now.
This causes some test suite churn; qemu's custom PostParse could is
now invoked before the generic AddImplicitControllers, so PCI
controllers end up sequentially in the XML before the generically
added IDE controllers. So it's just some XML reordering
Seems like the natural fit, since we are already adding other XML bits
in the PostParse routine.
Previously AddImplicitControllers was only called at the end of XML
parsing, meaning code that builds a DomainDef by hand had to manually
call it. Now those PostParse callers get it for free.
There's some test churn here; xen xm and sexpr test suite bits weren't
calling this before, but now they are, so you'll see new IDE controllers.
I don't think this will cause problems in practice, since the code already
needs to handle these implicit controllers like in the case when a user
defines their own XML.
virDomainObjWait is designed to be called in a loop. Make sure we break
the loop in case the domain dies to avoid waiting for an event which
will never happen.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Calling qemuProcessStop without a job opens a way to race conditions
with qemuDomainObjExitMonitor called in another thread. A real world
example of such a race condition:
- migration thread (A) calls qemuMigrationWaitForSpice
- another thread (B) starts processing qemuDomainAbortJob API
- thread B signals thread A via qemuDomainObjAbortAsyncJob
- thread B enters monitor (qemuDomainObjEnterMonitor)
- thread B calls qemuMonitorSend
- thread A awakens and calls qemuProcessStop
- thread A calls qemuMonitorClose and sets priv->mon to NULL
- thread B calls qemuDomainObjExitMonitor with priv->mon == NULL
=> monitor stays ref'ed and locked
Depending on how lucky we are, the race may result in a memory leak or
it can even deadlock libvirtd's event loop if it tries to lock the
monitor to process an event received before qemuMonitorClose was called.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Stopping a domain without a job risks a race condition with another
thread which started a job a which does not expect anyone else to be
messing around with the same domain object.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Only a small portion of processGuestPanicEvent was enclosed within a
job, let's make sure we use the job for all operations to avoid race
conditions.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When destroying a domain we need to make sure we will be able to start a
job no matter what other operations are running or even stuck in a job.
This is done by killing the domain before starting the destroy job.
Let's introduce qemuProcessBeginStopJob which combines killing a domain
and starting a job in a single API which can be called everywhere we
need a job to stop a domain.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Ending a nested job is no different from ending any other (non-async)
job, after all the code in qemuDomainBeginJobInternal does not handle
them differently either. Thus we should call qemuDomainObjEndJob to stop
nested jobs.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemuDomainHelperGetVcpus would correctly return an array of
virVcpuInfoPtr structs for online vcpus even for sparse topologies, but
the loop that fills the returned typed parameters would number the vcpus
incorrectly. Fortunately sparse topologies aren't supported yet.
When virt-admin is run with valgrind, this kind of output can be obtained:
HEAP SUMMARY:
in use at exit: 134,589 bytes in 1,031 blocks
total heap usage: 2,667 allocs, 1,636 frees, 496,755 bytes allocated
88 bytes in 1 blocks are definitely lost in loss record 82 of 128
at 0x4C2A9C7: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x52F6D1F: virAllocVar (viralloc.c:560)
by 0x5350268: virObjectNew (virobject.c:193)
by 0x53503E0: virObjectLockableNew (virobject.c:219)
by 0x4E3BBCB: virAdmConnectNew (datatypes.c:832)
by 0x4E38495: virAdmConnectOpen (libvirt-admin.c:209)
by 0x10C541: vshAdmConnect (virt-admin.c:107)
by 0x10C7B2: vshAdmReconnect (virt-admin.c:163)
by 0x10CC7C: cmdConnect (virt-admin.c:298)
by 0x110838: vshCommandRun (vsh.c:1224)
by 0x10DFD8: main (virt-admin.c:862)
LEAK SUMMARY:
definitely lost: 88 bytes in 1 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 134,501 bytes in 1,030 blocks
suppressed: 0 bytes in 0 blocks
This is because virNetClientSetCloseCallback was being reinitialized
incorrectly. By resetting the callbacks in a proper way, the leak is fixed.
A login session with the vSphere API might expire after some idle time.
The esxVI_EnsureSession function uses the SessionIsActive function to
check if the current session has expired and a relogin needs to be done.
But the SessionIsActive function needs the Sessions.ValidateSession
privilege that is considered as an admin level privilege.
Only vCenter actually provides the SessionIsActive function. This results
in requiring an admin level privilege even for read-only operations on
a vCenter server.
ESX and VMware Server don't provide the SessionIsActive function and
the code already works around that. Use the same workaround for vCenter
again.
This basically reverts commit 5699034b65.
Commit f1a89a8 allowed parsing configs from /etc/libvirt
without validating the emulator capabilities.
Check for the presence of os->type.machine even if the
VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS flag is set,
otherwise the daemon can crash on carelessly crafted input
in the config directory.
https://bugzilla.redhat.com/show_bug.cgi?id=1267256
Add new function to manage adding the '-mon' or '-monitor' options to
the command line removing that task from the mainline qemuBuildCommandLine.
Also adjusted qemuBuildChrChardevStr and qemuBuildChrArgStr to use
const virDomainChrSourceDef *def rather than virDomainChrSourceDefPtr def.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-device sga' to the command
line removing that task from the mainline qemuBuildCommandLine
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-smbios' options to the command
line removing that task from the mainline qemuBuildCommandLine
Also while I was looking at it, move the uuid processing closer to usage.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-numa' options to the command
line removing that task from the mainline qemuBuildCommandLine
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the IOThread '-object' to the command
line removing that task from the mainline qemuBuildCommandLine
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rename function and move code in from qemuBuildCommandLine to
keep smp related code together. Also make a few style changes
for long lines, return value change, and 2 spaces between functions.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add new function to manage adding the '-m' memory options to the command
line removing that task from the mainline qemuBuildCommandLine
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create qemuBuildCommandLineValidate to make some checks before trying
to build the command. This will move some logic from much later to much
earlier - we shouldn't be adjusting any data so that shouldn't matter.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Now that the file migration doesn't require us to use 'dd' and other
legacy stuff for too old qemus we don't even have to calcuate the
offsets and other stuff.
With the currently supported qemus we always migrate to file
descriptors so the old function is not required any more.
Additionally QEMU_MONITOR_MIGRATE_TO_FILE_TRANSFER_SIZE macro is now
unused.
In cf113e8d we changed the declaration of
virCgroupAllowDevicePath() and virCgroupDenyDevicePath().
However, while updating the stub for non-cgroup platforms for the
former we forgot to update the latter too causing a build
failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This reverts commit 611a278fa4.
According to the original commit message, this is dead code:
It is highly unlikely that a backend will know how to create a
volume from a different volume (buildVolFrom) and not know how to
create an empty volume (createVol).
This API is merely a convenience API, i.e. when managing clients connected to
daemon's servers, we should know (convenience) which server the specific client
is connected to. This implies a client-side representation of a server along
with a basic API to let the administrating client know what servers are actually
available on the daemon.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This is the key structure of all management operations performed on the
daemon/clients. An admin client needs to be able to identify
another client (either admin or non-privileged client) to perform an
action on it. This identification includes a server the client is
connected to, thus a client-side representation of a server is needed.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since the daemon can manage and add (at fresh start) multiple servers,
we also should be able to add them from a JSON state file in case of a
daemon restart, so post exec restart support for multiple servers is also
provided. Patch also updates virnetdaemontest accordingly.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The method will now return 0 on success and -1 on error, rather than number of
items which it iterated over before it returned back to the caller. Since the
only place where we actually check the number of elements iterated is in
virhashtest, return value of 0 and -1 can be a pretty accurate hint that it
iterated over all the items. However, if we really want to know the number of
items iterated over (like virhashtest does), a counter has to be provided
through opaque data to each iterator call. This patch adjusts return value of
virHashForEach, refactors the body, so it returns as soon as one of the
iterators fail and adjusts virhashtest to reflect these changes.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Our existing virHashForEach method iterates through all items disregarding the
fact, that some of the iterators might have actually failed. Errors are usually
dispatched through an error element in opaque data which then causes the
original caller of virHashForEach to return -1. In that case, virHashForEach
could return as soon as one of the iterators fail. This patch changes the
iterator return type and adjusts all of its instances accordingly, so the
actual refactor of virHashForEach method can be dealt with later.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
When adding disk images to ACL we may call those functions on NFS
shares. In that case we might get an EACCES, which isn't really relevant
since NFS would not hold a block device. This patch adds a flag that
allows to stop reporting an error on EACCES to avoid spaming logs.
Currently there's no functional change.
Since commit 47e5b5ae virCgroupAllowDevice allows to pass -1 as either
the minor or major device number and it automatically uses '*' in place
of that. Reuse the new approach through the code and drop the duplicated
functions.
After removing capability check for fd migration the code that was left
behind didn't make quite sense. The old exec migration would be used in
case when pipe() failed. Remove the old code and make failure of pipe()
a hard error.
This additionally removes usage of virCgroupAllowDevicePath outside of
qemu_cgroup.c.
Since no value in the virGICVersion enumeration is negative, a clever
enough compiler can report an error such as
src/conf/domain_conf.c:15337:75: error: comparison of unsigned enum
expression < 0 is always false [-Werror,-Wtautological-compare]
if ((def->gic_version = virGICVersionTypeFromString(tmp)) < 0 ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~
virGICVersionTypeFromString() can, however, return a negative value if
the input string is not part of the enumeration, so we definitely need
that check.
Work around the problem by storing the return value in a temporary int
variable.
Create new modules qemu_domain_address.c and qemu_domain_address.h to
contain all the new functions and header data. Additionally move any
supporting static functions.
Make qemuDomainSupportsPCI non static.
Also, move and rename the following:
qemuSetSCSIControllerModel to qemuDomainSetSCSIControllerModel
qemuCollectPCIAddress to qemuDomainCollectPCIAddress
qemuValidateDevicePCISlotsPIIX3 to qemuDomainValidateDevicePCISlotsPIIX3
qemuAssignDevicePCISlots to qemuDomainAssignDevicePCISlots
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the misplaced function from qemu_command.c to qemu_interface.c
since it's closer in functionality there and had less to do with building
the command line.
Rename function to qemuInterfaceBridgeConnect and modify callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move the misplaced function from qemu_command.c to qemu_interface.c
since it's closer in functionality there and had less to do with building
the command line.
Rename function to qemuInterfaceDirectConnect and modify callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move function closer to where it's used in qemuBuildTPMCommandLine
Also fix function header to match current coding practices
Signed-off-by: John Ferlan <jferlan@redhat.com>
Move function closer to where it's called in qemuBuildTPMCommandLine
Also adjust function header to fit current coding guidelines
Signed-off-by: John Ferlan <jferlan@redhat.com>
We currently blindly accept any numeric value as a GIC version, even
though only GIC v2 and GIC v3 actually exist; on the other hand, we
reject "host", which is a perfectly legitimate value for QEMU guests.
This new enumeration contains all GIC versions libvirt is aware of.
Currently, on hot unplug of PCI devices with VFIO driver for QEMU, libvirt is
trying to restore the host devices to it's previous value (basically a chown
on the previous user/group).
However for devices with VFIO driver, when the device is unbinded it is
removed from the /dev/vfio file system causing the restore label to fail.
The fix is to not restore the label for those PCI devices since they are going
to be teared down anyway.
Signed-off-by: Ludovic Beliveau <ludovic.beliveau@windriver.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The existing log messages for this have several problems; there are
two lines of log when one will suffice, they duplicate the function
name in log message (when it's already included by VIR_DEBUG), they're
missing some useful bits, they get logged even when the call is a NOP.
This patch cleans up the problems with those existing logs, and also
adds a new VIR_INFO-level log down at the function that is actually
creating and sending the netlink message that logs *everything* going
into the netlink message (which turns out to be much more useful in
practice for me; I didn't want to eliminate the logs at the existing
location though, in case they are useful in some scenario I'm
unfamiliar with; anyway those logs are remaining at debug level, so it
shouldn't be a bother to anyone).
There are three functions that deal with allocating and freeing
devices from a networks netdev/pci device pool:
network(Allocate|Notify|Release)ActualDevice(). These functions also
maintain a counter of the number of domains currently using a network
(regardless of whether or not that network uses a device pool). Each
of these functions had multiple log messages (output using VIR_DEBUG)
that were in slightly different formats and gave varying amounts of
information.
This patch creates a single function to log the pertinent information
in a consistent manner for all three of these functions. Along with
assuring that all the functions produce a consistent form of output
(and making it simpler to change), it adds the MAC address of the
domain interface involved in the operation, making it possible to
verify which interface of which domain the operation is being done for
(assuming that all MAC addresses are unique, of course).
All of these messages are raised from DEBUG to INFO, since they don't
happen that often (once per interface per domain/libvirtd start or
domain stop), and can be very informative and helpful - eliminating
the need to log debug level messages makes it much easier to sort
these out.
networkReleaseActualDevice() and networkNotifyActualDevice() both were
updating the individual devices' connections count in two separate
places (unlike networkAllocateActualDevice() which does it in a single
unified place after success:). The code is correct, but prone to
confusion / later breakage. All of these updates are anyway located at
the end of if/else clauses that are (with the exception of a single
VIR_DEBUG() in each case) immediately followed by the success: label
anyway, so this patch replaces the duplicated ++/-- instructions with
a single ++/-- inside a qualifying "if (dev)" down below success:.
(NB: if dev != NULL, by definition we are using a device (either pci
or netdev, doesn't matter for these purposes) from the network's pool)
The VIR_DEBUG args (which will be replaced in a followup patch anyway)
were all adjusted to account for the connection count being out of
date at the time.
Since Ceph version Infernalis (9.2.0) the new fast-diff mechanism
of RBD allows for querying actual volume usage.
Prior to this version there was no easy and fast way to query how
much allocation a RBD volume had inside a Ceph cluster.
To use the fast-diff feature it needs to be enabled per RBD image
and is only supported by Ceph cluster running version Infernalis
(9.2.0) or newer.
Without the fast-diff feature enabled libvirt will report an allocation
identical to the image capacity. This is how libvirt behaves currently.
'virsh vol-info rbd/image2' might output for example:
Name: image2
Type: network
Capacity: 1,00 GiB
Allocation: 124,00 MiB
Newly created volumes will have the fast-diff feature enabled if the
backing Ceph cluster supports it.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
In commit 0b15f920 there is a #ifdef which requires LIBRBD_VERSION_CODE
266 or newer for rbd_diff_iterate2()
rbd_diff_iterate2() is available since 266, so this if-statement should
require anything newer than 265.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
As more and more features are added to RBD volumes we will need to
call this method more often.
By moving it into a internal function we can re-use code inside the
storage backend.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
It is highly unlikely that a backend will know how to create a
volume from a different volume (buildVolFrom) and not know how to
create an empty volume (createVol). But:
1) we call the function without any prior check so if that's the
case we would SIGSEGV immediatelly
2) it's better to be safe than sorry.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Firstly, we realloc internal list to hold new item (=volume that
will be potentially created) and then we check whether we
actually know how to create it. If we don't we consume more
memory than we really need for no good reason.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Race condition:
User calls defineXML to create new instance.
The main thread from vzDomainDefineXMLFlags() creates new instance by prlsdkCreateVm.
Then this thread calls prlsdkAddDomain to add new domain to domains list.
The second thread receives notification from hypervisor that new VM was created.
It calls prlsdkHandleVmAddedEvent() and also tries to add new domain to domains list.
These two threads call virDomainObjListFindByUUID() from prlsdkAddDomain() and don't find new domain.
So they add two domains with the same uuid to domains list.
This fix splits logic of prlsdkAddDomain() into two functions.
1. vzNewDomain() creates new empty domain in domains list with the specific uuid.
2. prlsdkLoadDomain() add data from VM to domain object.
New algorithm for creating an instance:
In vzDomainDefineXMLFlags() we add new domain to domain list by calling vzNewDomain()
and only after that we call CreateVm() to create VM.
It means that we "reserve" domain object with the specific uuid.
After creation of new VM we add info from this VM
to reserved domain object by calling prlsdkLoadDomain().
Before this patch prlsdkLoadDomain() worked in 2 different cases:
1. It creates and initializes new domain. Then updates it from sdk handle.
2. It updates existed domain from sdk handle.
In this patch we remove code which creates new domain from LoadDomain()
and move it to vzNewDomain().
Now prlsdkLoadDomain() only updates domain from skd handle.
In notification handler prlsdkHandleVmAddedEvent() we check
the existence of a domain and if it doesn't exist we add new domain by calling
vzNewDomain() and load info from sdk handle via prlsdkLoadDomain().
Bug cause:
Update the domain that is subscribed to hypervisor notification.
LoadDomain() rewrites notifications fields in vzDomObj structure and makes domain as "unsubscribed".
Fix:
Initialize notification fields in vzDomObj only if we create a new domain.
And do not reinitialize these fields if we update domain (by calling LoadDomain with olddom argument)
prlsdkGetDomainIds() returns name and uuid for specified instance.
Now output arguments can be NULL.
It allows to get only necessary info(name or uuid).
The snapshot name generator truncates the original file name after a '.'
and replaces the suffix with the snapshot name. If two disks source
images differ only in the suffix portion, the generated name will be
duplicate.
Since this is a corner case just error out stating that a duplicate name
was generated. The user can work around this situation by providing
the file names explicitly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1283085
Apparently we are not the only ones with dumb free functions
because dbus_message_unref() does not accept NULL either. But if
I were to vote, this one is even more evil. Instead of returning
an error just like we do it immediately dereference any pointer
passed and thus crash you app. Well done DBus!
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f878ebda700 (LWP 31264)]
0x00007f87be4016e5 in ?? () from /usr/lib64/libdbus-1.so.3
(gdb) bt
#0 0x00007f87be4016e5 in ?? () from /usr/lib64/libdbus-1.so.3
#1 0x00007f87be3f004e in dbus_message_unref () from /usr/lib64/libdbus-1.so.3
#2 0x00007f87bf6ecf95 in virSystemdGetMachineNameByPID (pid=9849) at util/virsystemd.c:228
#3 0x00007f879761bd4d in qemuConnectCgroup (driver=0x7f87600a32a0, vm=0x7f87600c7550) at qemu/qemu_cgroup.c:909
#4 0x00007f87976386b7 in qemuProcessReconnect (opaque=0x7f87600db840) at qemu/qemu_process.c:3386
#5 0x00007f87bf6edfff in virThreadHelper (data=0x7f87600d5580) at util/virthread.c:206
#6 0x00007f87bb602334 in start_thread (arg=0x7f878ebda700) at pthread_create.c:333
#7 0x00007f87bb3481bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
(gdb) frame 2
#2 0x00007f87bf6ecf95 in virSystemdGetMachineNameByPID (pid=9849) at util/virsystemd.c:228
228 dbus_message_unref(reply);
(gdb) p reply
$1 = (DBusMessage *) 0x0
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Do not store the return value of called functions in the same variable
as the (future) return value of the current function.
This makes tracking the origin of the value easier and reduces
the chance of introducing a new point of exit without resetting
the return value back to -1.
https://bugzilla.redhat.com/show_bug.cgi?id=1293351
Since we already have virtio channel events, we know when guest
agent within guest has (dis-)connected. Instead of us blindly
connecting to a socket that no one is listening to, we can just
follow what qemu-ga does. This has a nice benefit that we don't
need to 'guest-ping' the agent just to timeout and find out
nobody is listening.
The way that this commit is implemented:
- don't connect in qemuProcessLaunch directly, defer that to event
callback (which already follows the agent) -
processSerialChangedEvent
- after migration is settled, before we resume vCPUs, ask qemu
whether somebody is listening on the socket and if so, connect
to it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Extract out the qemuParseCommandLine{String|Pid} into their own
separate module - taking with it all the various static functions.
Causes a ripple effect with a few other modules to include the
new qemu_parse_command.h.
Narrowed down the list of #include's in the split out module to
those that are necessary for build.
Recent refactors in the vbox code to check the return status for the
function tipped Coverity's scales of justice for any functions that
do not check status - such as this one.
While I'm at it, since the call is essentially the same other than
whether starting from val or val+1 when val[0] = '[', just adjust
the val pointer by one and have one call instead of two.
Additionally, the call to virDomainGraphicsListenGetAddress is redundant
since it checking that the address field got filled. It's a leftover
from the strndup -> ListenSetAddress conversion (commit id 'ef79fb5b5')
Signed-off-by: John Ferlan <jferlan@redhat.com>
Refactor qemuParseCommandLine to pull out the "-vnc" argument parsing
into its own helper function. Modify the code to use "cleanup" instead
of "error" and use the standard return processing to indicate success
or failure by using ret
Signed-off-by: John Ferlan <jferlan@redhat.com>
The function, like others in our code, returns zero on success
and a negative value on error. However, there are two places in
xenconfig source code where we check for non-zero value. While
the function can't currently return a positive value, those
checks look okay, but does not really follow our style.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This patch introduces keep alive messages support for P2P migration
and it adds two new configuration entries namely 'keepalive_interval'
'keepalive_count' to control it. Behavior of these entries is the
same as qemu driver thus the description is copied from there
with just a few simplifications.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Introduce support for VIR_MIGRATE_PEER2PEER in libvirt migration.
Most of the changes occur at the source and no modifications at
the receiver.
In P2P mode there is only the Perform phase so we must handle the
connection with the destination and actually perform the
migration. libxlDomainPerformP2P implements the connection to the
destination and libxlDoMigrateP2P implements the actual migration
logic with virConnectPtr. In this function we take care of doing
all phases of migration in the destination similar to
virDomainMigrateVersion3Full. We appropriately save the last
error reported in each of the phases to provide proper reporting.
We don't yet support VIR_MIGRATE_TUNNELED and we always use V3
with extensible params, thus it also makes the implementation
simpler.
It is worth noting that the receiver didn't have any changes, and
since it's still the v3 sequence thus it is possible to migrate
from a P2P to non-P2P host.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
The virStringListLength function does not ever modify the passed
string list. It merely counts the items in it. Make sure that we
reflect this bit in the function header.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(crobinso: fix up spacing and squash in sheepdog bit suggested
by Andrea)
Older gcc fails to see that the variable is set iff @hasPriority
== true in which case the former is set a value. Initialize the
value while declaring it to make the compiler shut up.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Allocate it as soon as we know we will need it.
Add it to def->ngraphics if it's allocated, removing the need
to use the addDesktop and totalPresent variables to track this.
Separate allocation of the def->graphics array from the allocation
and initialization of its first element.
Note that the only possible values of totalPresent at this point
are 0 or 1, because it equals to guiPresent + sdlPresent.
When FRONTEND/Type is not any of "sdl", "gui", "vrdp", we add a DESKTOP.
Use a bool to track this, instead of checking that both
totalPresent ("sdl" or "gui" present) and vrdpPresent are zero.
For the actions ADD and OLD, split out creating the new lease object,
as well as getting the environment variables that do not affect
the parsing of command line arguments.
Since majority of the steps is shared, the function can be reused to
simplify code.
Similarly to previous path doing this same for vCPUs this also fixes the
a similar bug (which is not tracked).
Rather than iterating 3 times for various settings this function
aggregates all the code into single place. One of the other advantages
is that it can then be reused for properly setting IOThread info on
hotplug.
Since majority of the steps is shared, the function can be reused to
simplify code.
Additionally this resolves
https://bugzilla.redhat.com/show_bug.cgi?id=1244128 since the cpu
bandwidth limiting with cgroups would not be set on the hotplug path.
Additionally the code now sets the thread affinity and honors autoCpuset
as in the regular startup code path.
Rather than iterating 3 times for various settings this function
aggregates all the code into single place. One of the other advantages
is that it can then be reused for properly setting vCPU info on hotplug.
With this approach autoCpuset is also used when setting the process
affinity rather than just via cgroups.
Commit 8cd1d54 consolidates both daemon and remote driver typed param
serialization functions. The consolidation now enforces client to use
VIR_TYPED_PARAM_STRING_OKAY flag to properly serialize string parameters, which
server has used for quite some time now. And this caused an issue, since the
commit had not adjusted client remote calls appropriately, thus causing a
failure in blkiotune, numatune and migration APIs (as per Xen CI tests). This
patch adjusts both remote_driver.c and gendispatch.pl to properly address this
issue.
http://lists.xenproject.org/archives/html/xen-devel/2016-02/msg01012.html
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
In the commit 7938b533 we've changed the function signature,
however forgot to update stump that's used on systems without
CGroups causing a build failure.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Due to bad design the vcpu sched element is orthogonal to the way how
the data belongs to the corresponding objects. Now that vcpus are a
struct that allow to store other info too, let's convert the data to the
sane structure.
The helpers for the conversion are made universal so that they can be
reused for iothreads too.
This patch also resolves https://bugzilla.redhat.com/show_bug.cgi?id=1235180
since with the correct storage approach you can't have dangling data.
Now that qemuDomainDetectVcpuPids is able to refresh the vCPU pid
information it can be reused in the hotplug and hotunplug code paths
rather than open-coding a very similar algorithm.
A slight algorithm change is necessary for unplug since the vCPU needs
to be marked offline prior to calling the thread detector function and
eventually rolled back if something fails.
Move the logic from virDomainDiskDefDstDuplicates into
virDomainDiskDefCheckDuplicateInfo so that we don't have to loop
multiple times through the array of disks. Since the original function
was called in qemuBuildDriveDevStr, it was actually called for every
single disk which was quite wasteful.
Additionally the target uniqueness check needed to be duplicated in
the disk hotplug case, since the disk was inserted into the domain
definition after the device string was formatted and thus
virDomainDiskDefDstDuplicates didn't do anything in that case.
When starting a qemu process there are certain checks done to ensure
that the configuration makes sense. Extract them into a separate
function so that they can be reused in the test code.
In b3d2a42e I've refactored the code and moved the 'cleanup' label.
Unfortunately the code that was originally in the 'endjob' label and
wanted to jump to cleanup is now in the cleanup label. Remove the jump
and let the function finish.
If we are attempting to thaw the filesystems on error, the code would
overwrite the error code that caused the snapshot to fail with the error
of thawing the filesystem. Since the thawing function allows control of
error reporting behavior we can use this feature.
Add a stub for nodeDeviceSysfsGetPCIRelatedDevCaps() for non-Linux
platforms. It allows nodedev driver to work on non-Linux platoforms
that, however, have HAL.
Syntax-check fails with:
cppi: src/bhyve/bhyve_driver.h: line 26: not properly indented
cppi: src/bhyve/bhyve_driver.h: line 27: not properly indented
maint.mk: incorrect preprocessor indentation
Fix by properly indenting '#include's.
Pushed as trivial.
After 1036ddadb2 we use bhyveDriverGetCapabilities from other
sources too, not only from bhyve_driver.c. However, the function
was static so not properly expose to other files. In order to
expose it, we need to move couple of #include-s too.
Then, there has been a copy paste error in
virBhyveProcessReconnect: s/privconn/data->driver/.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This was only used in debugging messages and not in any real code.
Ceph/RBD uses uint64_t for sizes internally and they can be printed
with %zu without any need for casting.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Through the years the RBD storage pool code hasn't maintained the
same or correct coding standard which applies to libvirt.
This patch doesn't change any logic in the code, it only applies
the proper coding standards to the code where possible without
making large changes.
This way the code style used in this storage pool is consistent
throughout the whole file.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
I've noticed that variable @reply is not initialized and if
something at the beginning of the function fails, e.g.
virDBusGetSystemBus(), the control jump straight to cleanup label
where dbus_message_unref() is then called over this uninitialized
variable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
I've noticed couple of warning in dmesg while debugging
something:
[ 9683.973754] HTB: quantum of class 10001 is big. Consider r2q change.
[ 9683.976460] HTB: quantum of class 10002 is big. Consider r2q change.
I've read the HTB documentation and linux kernel code to find out
what's wrong. Basically we need to pass another argument
"quantum" to our tc cmd line because the default computed by HTB
does not always work in which case the warning message is printed
out.
You can read more details here:
http://luxik.cdi.cz/~devik/qos/htb/manual/userg.htm#sharing
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
As the scheduler info elements are represented orthogonally to how it
makes sense to actually store the information, the extracted code will
be later used when converting between XML and internal definitions.
So, systemd-machined has this philosophy that machine names are like
hostnames and hence should follow the same rules. But we always allowed
international characters in domain names. Thus we need to modify the
machine name we are passing to systemd.
In order to change some machine names that we will be passing to systemd,
we also need to call TerminateMachine at the end of a lifetime of a
domain. Even for domains that were started with older libvirt. That
can be achieved thanks to virSystemdGetMachineNameByPID(). And because
we can change machine names, we can get rid of the inconsistent and
pointless escaping of domain names when creating machine names.
So this patch modifies the naming in the following way. It creates the
name as <drivername>-<id>-<name> where invalid hostname characters are
stripped out of the name and if the resulting name is longer, it
truncates it to 64 characters. That way we can start domains we
couldn't start before. Well, at least on systemd.
To make it work all together, the machineName (which is needed only with
systemd) is saved in domain's private data. That way the generation is
moved to the driver and we don't need to pass various unnecessary
arguments to cgroup functions.
The only thing this complicates a bit is the scope generation when
validating a cgroup where we must check both old and new naming, so a
slight modification was needed there.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The virDomainSnapshotDefFormat calls into virDomainDefFormat,
so should be providing a non-NULL virCapsPtr instance. On the
qemu driver we change qemuDomainSnapshotWriteMetadata to also
include caps since it calls virDomainSnapshotDefFormat.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
The virDomainObjFormat and virDomainSaveStatus methods
both call into virDomainDefFormat, so should be providing
a non-NULL virCapsPtr instance.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Rather than have a unwieldy regex string - split it up into its components
each having it's own #define and then combine in a different #define
Signed-off-by: John Ferlan <jferlan@redhat.com>
Use the newly added virCapabilitiesSetNetPrefix to set
the network prefix for the driver. This in return will
be use by NetDefFormat() and NetDefParseXML() routines
to free any interface name that start with the registered
prefix.
Acked-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
virDomainSaveConfig calls virDomainDefFormat which was setting the caps
to NULL, thus keeping the old behaviour (i.e. not looking at
netprefix). This patch adds the virCapsPtr to the function and allows
the configuration to be saved and skipping interface names that were
registered with virCapabilitiesSetNetPrefix().
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
And use the newly added caps->host.netprefix (if it exists) for
interface names that match the autogenerated target names.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
And use the newly added caps->host.netprefix for free interface
names that match the autogenerated target names.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
In the reverted commit d2e5538b1, the libxl driver was changed to copy
interface names autogenerated by libxl to the corresponding network def
in the domain's virDomainDef object. The copied name is freed when the
domain transitions to the shutoff state. But when migrating a domain,
the autogenerated name is included in the XML sent to the destination
host. It is possible an interface with the same name already exists on
the destination host, causing migration to fail.
This patch defines a new capability for setting the network device
prefix that will be used in the driver. Valid prefixes are
VIR_NET_GENERATED_PREFIX or the one announced by the driver.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
There are slight differences in various ZFS implementations.
Specifically, ZFS on FreeBSD requires to set value of 'volmode'
option to 'dev' to expose volumes as raw disk device (that's what
we need) rather than geom provides, for example.
With ZFS on Linux, however, such option is not available and
volumes exposed like we need by default.
To make our implementation more flexible, only pass 'volmode'
when it's supported. Support is checked by parsing usage
information of the 'zfs get' command.
Same as for deserializer, this method might get handy for admin one day.
The major reason for this patch is to stay consistent with idea, i.e.
when deserializer can be shared, why not serializer as well. The only
problem to be solved was that the daemon side serializer uses a code
snippet which handles sparse arrays returned by some APIs as well as
removes any string parameters that can't be returned to older clients.
This patch makes of the new virTypedParameterRemote datatype introduced
by one of the pvious patches.
Since the method is static to remote_driver, it can't even be used by our
daemon. Other than that, it would be useful to be able to use it with admin as
well. This patch uses the new virTypedParameterRemote datatype introduced in
one of previous patches.
Currently, the deserializer is hardcoded into remote_driver which makes
it impossible for admin to use it. One way to achieve a shared implementation
(besides moving the code to another module) would be pass @ret_params_val as a
void pointer as opposed to the remote_typed_param pointer and add a new extra
argument specifying which of those two protocols is being used and typecast
the pointer at the function entry. An example from remote_protocol:
struct remote_typed_param_value {
int type;
union {
int i;
u_int ui;
int64_t l;
uint64_t ul;
double d;
int b;
remote_nonnull_string s;
} remote_typed_param_value_u;
};
typedef struct remote_typed_param_value remote_typed_param_value;
struct remote_typed_param {
remote_nonnull_string field;
remote_typed_param_value value;
};
That would leave us with a bunch of if-then-elses that needed to be used across
the method. This patch takes the other approach using the new datatype
introduced in one of earlier commits.
Both admin and remote protocols define their own types
(remote_typed_param vs admin_typed_param). Because of the naming convention,
admin typed params wouldn't be able to reuse the serialization/deserialization
methods, which are tailored for use by remote protocol, even if those method
were exported properly. In that case, introduce a new internal data type
structurally copying both admin and remote protocols which, eventually, would
allow serializer and deserializer to be used in a more generic way.
A pretty nasty deadlock occurs while trying to rename a VM in parallel
with virDomainObjListNumOfDomains.
The short description of the problem is as follows:
Thread #1:
qemuDomainRename:
------> aquires domain lock by qemuDomObjFromDomain
---------> waits for domain list lock in any of the listed functions:
- virDomainObjListFindByName
- virDomainObjListRenameAddNew
- virDomainObjListRenameRemove
Thread #2:
virDomainObjListNumOfDomains:
------> aquires domain list lock
---------> waits for domain lock in virDomainObjListCount
Introduce generic virDomainObjListRename function for renaming domains.
It aquires list lock in right order to avoid deadlock. Callback is used
to make driver specific domain updates.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Free the old vcpupids array in case when this function is called again
during the run of the VM. It will be later reused in the vCPU hotplug
code. The function now returns the number of detected VCPUs.
In some cases it may be better to have a bitmap representing state of
individual vcpus rather than iterating the definition. The new helper
creates a bitmap representing the state from the domain definition.
Use 'ret' for return variable name, clarify use of 'param_idx' and avoid
unnecessary 'success' label. No functional changes. Also document the
function.
Since commit 7140807917 we are generating
socket path later than before -- when starting a domain. That makes one
particular inconsistent state of a chardev, which was not possible
before, currently valid. However, SELinux security driver forgot to
guard the main restoring function by a check for NULL-paths. So make it
no-op for NULL paths, as in the DAC driver.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1300532
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In case of guest panicked, preserved crashed domain has stopped CPUs.
It's not possible to use tools like WinDbg for the problem investigation
until we start CPUs back.
Error paths after sending the event that domain is started written as if ret = -1
which is set at the beginning of the function. It's common idioma to keep 'ret'
equal to -1 until the end of function where it is set to 0. But here we use ret
to keep result of restore operation too and thus breaks the idioma and its users :)
Let's use different variable to hold restore result.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Rather than a loop reallocating space to build the regex, just allocate
it once up front, then if there's more than 1 nextent, append a comma and
another regex_unit string.
Signed-off-by: John Ferlan <jferlan@redhat.com>
The 'stripes' value is described as the "Number of stripes or mirrors in
a logical volume". So add "mirror" and anything that starts with "raid"
to the list of segtypes that can have an 'nextents' value greater than one.
Use of raid segtypes (raid1, raid4, raid5*, raid6*, and raid10) is favored
over mirror in more recent lvm code.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than preallocating a set number of elements, then walking through
the extents and adjusting the specific element in place, use the APPEND
macros to handle that chore.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Having on_crash set to either coredump-destroy or coredump-restart
creates core dumps with option memory-only in the directory specified
by auto_dump_path. When a watchdog is triggered with the action dump
the core dump is also placed into the directory specified by auto_dump_path
but is created without the option memory-only.
This patch sets the option memory-only also for core dumps created by the
watchdog event.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Create a helper routine in order to parse any extents information
including the extent size, length, and the device string contained
within the generated 'lvs' output string.
A future patch would then be able to avoid the code more cleanly
Signed-off-by: John Ferlan <jferlan@redhat.com>
By opening a RBD volume in Read-Only we do not register a
watcher on the header object inside the Ceph cluster.
Refreshing a volume only calls rbd_stat() which is a operation
which does not write to a RBD image.
This allows us to use a cephx user which has no write
permissions if we would want to use the libvirt storage pool
for informational purposes only.
It also saves us a write into the Ceph cluster which should
speed up refreshing a RBD pool.
rbd_open_read_only() is available in all librbd versions which
also support rbd_open().
Signed-off-by: Wido den Hollander <wido@widodh.nl>
RBD supports cloning by creating a snapshot, protecting it and create
a child image based on that snapshot afterwards.
The RBD storage driver will try to find a snapshot with zero deltas between
the current state of the original volume and the snapshot.
If such a snapshot is found a clone/child image will be created using
the rbd_clone2() function from librbd.
rbd_clone2() is available in librbd since Ceph version Dumpling (0.67) which
dates back to August 2013.
It will use the same features, strip size and stripe count as the parent image.
This implementation will only create a single snapshot on the parent image if
never changes. This reduces the amount of snapshots created for that RBD image
which benefits the performance of the Ceph cluster.
During build the decision will be made to use either rbd_diff_iterate() or
rbd_diff_iterate2().
The latter is faster, but only available on Ceph versions after 0.94 (Hammer).
Cloning is only supported if RBD format 2 is used. All images created by libvirt
are already format 2.
If a RBD format 1 image is used as the original volume the backend will report
a VIR_ERR_OPERATION_UNSUPPORTED error.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Using VIR_STORAGE_VOL_WIPE_ALG_TRIM a RBD volume can be trimmed down
to 0 bytes using rbd_discard()
Effectively all the data on the volume will be lost/gone, but the volume
remains available for use afterwards.
Starting at offset 0 the storage pool will call rbd_discard() in stripe
size * count increments which is usually 4MB. Stripe size being 4MB and
count 1.
rbd_discard() is available since Ceph version Dumpling (0.67) which dates
back to August 2013.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
This new algorithm adds support for wiping volumes using TRIM.
It does not overwrite all the data in a volume, but it tells the
backing storage pool/driver that all bytes in a volume can be
discarded.
It depends on the backing storage pool how this is handled.
A SCSI backend might send UNMAP commands to remove all data present
on a LUN.
A Ceph backend might use rbd_discard() to instruct the Ceph cluster
that all data on that RBD volume can be discarded.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When wiping the RBD image will be filled with zeros started
at offset 0 and until the end of the volume.
This will result in the RBD volume growing to it's full allocation
on the Ceph cluster. All data on the volume will be overwritten
however, making it unavailable.
It does NOT take any RBD snapshots into account. The original data
might still be in a snapshot of that RBD volume.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Use the cast of (virStorageVolWipeAlgorithm) adding the missing case:'s
(VIR_STORAGE_VOL_WIPE_ALG_ZERO and VIR_STORAGE_VOL_WIPE_ALG_LAST).
Additionally, the old code would also still run the SCRUB command on
default since it didn't go to cleanup when a invalid flag was supplied.
We now go to cleanup and exit if a invalid flag would be provided.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
When commit id '82c1740a' made changes to the output format (changing from
using a ',' separator to '#'), the examples in the lvs output from the
comments weren't changed.
Additionally, the two new fields added ('segtype' and 'stripes') were
not included in the output, leaving it well confusing.
This patch fixes the sample output, adds a 'striped' example, and makes
other comment related adjustments for long line and spacing between followup
'NB' remarks (while I'm there).
Signed-off-by: John Ferlan <jferlan@redhat.com>
The affected functions are:
virPCIDeviceGetManaged()
virPCIDeviceGetUnbindFromStub()
virPCIDeviceGetRemoveSlot()
virPCIDeviceGetReprobe()
Change their return type from unsigned int to bool: the corresponding
members in struct _virPCIDevice are defined as bool, and even the
corresponding virPCIDeviceSet*() functions take a bool value as input
so there's no point in these functions having unsigned int as return
type.
Suggested-by: John Ferlan <jferlan@redhat.com>
In our generator for some code we put empty lines in the output
to separate blocks of code. However, in some cases we put couple
of spaces on the empty line too. It's not bug, it just isn't
nice.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Unbinding a PCI device from the stub driver can require several steps,
and it can be useful for debugging to be able to trace which of these
steps are performed and which are skipped for each device.
The name is confusing, and there are just two uses: one is a test case,
and the other will be removed as part of an upcoming refactoring of
the hostdev code.
Commit 871e10f fixed a memory corruption error, but called strlen()
twice on the same string to do so. Even though the compiler is
probably smart enough to optimize the second call away, having a
single invocation makes the code slightly cleaner.
Suggested-by: Michal Privoznik <mprivozn@redhat.com>
In 370608b4c7 we have introduced two new internal APIs.
However, there are no stubs for build without macvtap. Therefore
build on systems lacking macvtap support (e.g. mingw or freebds)
fails when trying to link.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
libvirtd crashes on free()ing portData for an open vswitch port if that port
was deleted. To reproduce:
ovs-vsctl del-port vnet0
virsh migrate --live kvm1 qemu+ssh://dstHost/system
Error message:
libvirtd: *** Error in `/usr/sbin/libvirtd': free(): invalid pointer: 0x000003ff90001e20 ***
The problem is that virCommandRun can return an empty string in the event that
the port being queried does not exist. When this happens then we are
unconditionally overwriting a newline character at position strlen()-1. When
strlen is 0, we overwrite memory that does not belong to the string.
The fix: Only overwrite the newline if the string is not empty.
Reviewed-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
This patch creates two bitmaps, one for macvlan device names and one
for macvtap. The bitmap position is used to indicate that libvirt is
currently using a device with the name macvtap%d/macvlan%d, where %d
is the position in the bitmap. When requested to create a new
macvtap/macvlan device, libvirt will now look for the first clear bit
in the appropriate bitmap and derive the device name from that rather
than just starting at 0 and counting up until one works.
When libvirtd is restarted, the qemu driver code that reattaches to
active domains calls the appropriate function to "re-reserve" the
device names as it is scanning the status of running domains.
Note that it may seem strange that the retry counter now starts at
8191 instead of 5. This is because we now don't do a "pre-check" for
the existence of a device once we've reserved it in the bitmap - we
move straight to creating it; although very unlikely, it's possible
that someone has a running system where they have a large number of
network devices *created outside libvirt* named "macvtap%d" or
"macvlan%d" - such a setup would still allow creating more devices
with the old code, while a low retry max in the new code would cause a
failure. Since the objective of the retry max is just to prevent an
infinite loop, and it's highly unlikely to do more than 1 iteration
anyway, having a high max is a reasonable concession in order to
prevent lots of new failures.
In the following cases nl_recv() was returning the error "No buffer
space available":
* When switching CPUs to offline/online in a system more than 128 cpus
* When using virsh to destroy domain in a system with many interfaces
This patch sets the buffer size for all netlink sockets created by
libnl to 128K and turns on message peeking for nl_recv(). This
eliminates the "No buffer space available" errors seen in the cases
above, and also preempts other future errors the smaller buffers could
have caused.
Signed-off-by: Leno Hou <houqy@linux.vnet.ibm.com>
Signed-off-by: Laine Stump <laine@laine.org>
The current code was a little bit odd. At first we've removed all
possible implicit input devices from domain definition to add them later
back if there was any graphics device defined while parsing XML
description. That's not all, while formating domain definition to XML
description we at first ignore any input devices with bus different to
USB and VIRTIO and few lines later we add implicit input devices to XML.
This seems to me as a lot of code for nothing. This patch may look
to be more complicated than original approach, but this is a preferred
way to modify/add driver specific stuff only in those drivers and not
deal with them in common parsing/formating functions.
The update is to add those implicit input devices into config XML to
follow the real HW configuration visible by guest OS.
There was also inconsistence between our behavior and QEMU's in the way,
that in QEMU there is no way how to disable those implicit input devices
for x86 architecture and they are available always, even without graphics
device. This applies also to XEN hypervisor. VZ driver already does its
part by putting correct implicit devices into live XML.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
In dc576025c3 we renamed virCgroupIsolateMount function to
virCgroupBindMount. However, we forgot about one occurrence in
section of the code which provides stubs for platforms without
support for CGroups like *BSD for instance.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On the host when we start a container, it will be
placed in a cgroup path of
/machine.slice/machine-lxc\x2ddemo.scope
under /sys/fs/cgroup/*
Inside the containers' namespace we need to setup
/sys/fs/cgroup mounts, and currently will bind
mount /machine.slice/machine-lxc\x2ddemo.scope on
the host to appear as / in the container.
While this may sound nice, it confuses applications
dealing with cgroups, because /proc/$PID/cgroup
now does not match the directory in /sys/fs/cgroup
This particularly causes problems for systems and
will make it create repeated path components in
the cgroup for apps run in the container eg
/machine.slice/machine-lxc\x2ddemo.scope/machine.slice/machine-lxc\x2ddemo.scope/user.slice/user-0.slice/session-61.scope
This also causes any systemd service that uses
sd-notify to fail to start, because when systemd
receives the notification it won't be able to
identify the corresponding unit it came from.
In particular this break rabbitmq-server startup
Future kernels will provide proper cgroup namespacing
which will handle this problem, but until that time
we should not try to play games with hiding parent
cgroups.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The VIR_DOMAIN_STATS_VCPU flag to virDomainListGetStats
enables reporting of stats about vCPUs. Currently we
only report the cumulative CPU running time and the
execution state.
This adds reporting of the wait time - time the vCPU
wants to run, but the host scheduler has something else
running ahead of it.
The data is reported per-vCPU eg
$ virsh domstats --vcpu demo
Domain: 'demo'
vcpu.current=4
vcpu.maximum=4
vcpu.0.state=1
vcpu.0.time=1420000000
vcpu.0.wait=18403928
vcpu.1.state=1
vcpu.1.time=130000000
vcpu.1.wait=10612111
vcpu.2.state=1
vcpu.2.time=110000000
vcpu.2.wait=12759501
vcpu.3.state=1
vcpu.3.time=90000000
vcpu.3.wait=21825087
In implementing this I notice our reporting of CPU execute
time has very poor granularity, since we are getting it
from /proc/$PID/stat. As a future enhancement we should
prefer to get CPU execute time from /proc/$PID/schedstat
or /proc/$PID/sched (if either exist on the running kernel)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Report
error: invalid argument: requested vcpu '100' is not present in the domain
instead of
error: invalid argument: requested vcpu is higher than allocated vcpus
Since 'savevm' was not converted to QMP libvirt has to parse for error
strings in the text monitor output. One of the unhandled errors is
produced when qemu treats a device as unmigratable.
As current qemu actually does support AHCI migration this bug is
applicable only to older versions of qemu.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1293899
Make bhyveload respect boot order as specified by os.boot section of the
domain XML or by "boot order" for specific devices. As bhyve does not
support a real boot order specification right now, it's just about
choosing a single device to boot from.
libvirt always resets the MAC address of the physdev used for macvtap
passthrough when the guest is finished with it. This was happening
prior to the 802.1Qb[gh] DISASSOCIATE command, and was quite often
failing, presumably because the driver wouldn't allow the MAC address
to be reset while the association was still active, with a log message
like this:
virNetDevSetMAC:168 : Cannot set interface MAC to 00:00:00:00:00:00 on 'eth13': Cannot assign requested address
This patch changes the order - we now do the 802.1Qb[gh] disassociate
and delete the macvtap interface first, then and reset the MAC
address.
'free' on fedora23 wants to use the Slab field for calculated used
memory. The equation is:
used = MemTotal - MemFree - (Cached + Slab) - Buffers
We already set Cached and Buffers to 0, do the same for Slab and its
related values
https://bugzilla.redhat.com/show_bug.cgi?id=1300781
'free' on Fedora 23 will use MemAvailable to calculate its 'available'
field, but we are passing through the host's value. Set it to match
MemFree, which is what 'free' will do for older linux that don't have
MemAvailable
https://bugzilla.redhat.com/show_bug.cgi?id=1300781
We virtualize bits of /proc/meminfo by replacing host values with
values specific to the container.
However for calculating the final size of the returned data, we are
using the size of the original file and not the altered copy, which
could give garbelled output.
... and consolidate the cmdline/extra/root parsing to facilitate doing
so.
The logic is the same as xl's parse_cmdline from the current xen.git master
branch (e6f0e099d2c17de47fd86e817b1998db903cab61).
On the formatting side switch to producing cmdline= instead of extra=.
Update a few tests and add serveral more.
- test-cmdline is added to test the exclusive use of cmdline.
- test-fullvirt-direct-kernel-boot.cfg is updated due to the switch
on the formatting side and now tests the exclusive use of cmdline=.
- Tests are added for both paravirt and fullvirt where the .cfg uses
extra= and (paravirt only) root=. These are format (xl->xml) only
since the inverse will generate cmdline= hence is not a round trip
(which was already true if using root=, which used to generate
extra= on the way back).
- Tests are added for both paravirt and fullvirt where the .cfg
declares cmdline= as well as bogus extra= and (paravirt only) root=
entries which should be ignored. Again these are format only tests
since the inverse won't include the bogus lines.
The last two bullets here required splitting the DO_TEST macro into
two halves, as is done in the xmconfigtest.c case.
In order to introduce a use of VIR_WARN for logging I had to add
virerror.h and VIR_LOG_INIT.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
As suggested in a previous thread [0] this patch adds some missing calls
to libxl_dominfo_{init,dispose} when doing some of the libxl_domain_info
operations which would otherwise lead to memory leaks.
[0]
https://www.redhat.com/archives/libvir-list/2015-September/msg00519.html
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
The virErrorDomain enum has VIR_FROM_XEN, VIR_FROM_XEND,
VIR_FROM_XENSTORE, VIR_FROM_SEXPR, and VIR_FROM_XENXM. Use
these elements in the corresponding .c files. While at it,
remove the VIR_FROM_THIS define in src/xenconfig/xenxs_private.h.
The VIR_DOMAIN_EVENT_ID_MIGRATION_ITERATION event will be triggered
whenever VIR_DOMAIN_JOB_MEMORY_ITERATION changes its value, i.e.,
whenever a new iteration over guest memory pages is started during
migration.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
When acpi is used to reboot/shutdown qemu domain, qemu emits
SHUTDOWN event. Libvirt uses fakeReboot variable in order to
differentiate reboot or shutdown. fakeReboot value is reseted
to false after domain restart/reset.
When mode=agent is used to reboot qemu domain, qemu doesn't emit
SHUTDOWN event and libvirt doesn't reset fakeReboot value to false.
In this case next 'shutdown -h now' performs reboot. That's why
we don't need to set fakeReboot=true for mode=agent.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The generated output is dependent on perl hashtable ordering, which
gives different results for i686 and x86_64. Fix this by sorting
the hash keys before iterating over them
https://bugzilla.redhat.com/show_bug.cgi?id=1173641
Introduce virLeaseReadCustomLeaseFile which will populate
the new leases array with all the leases, except for expired
ones and the ones matching 'ip_to_delete'.
This removes five variables from main().
We either use the value from the environment variable, or learn it from
the existing lease file.
In the second case, the pointer would be pointing into the JSON object
of the first lease with a DUID, owned by leases_array, then
leases_array_new.
Always allocate the string instead, making obvious who should free the
string.
If dnsmasq specified DNSMASQ_IAID (so we're dealing with an IPv6
lease) but no DNSMASQ_MAC, we skip creation of the new lease object.
Also skip adding it to the leases array.
https://bugzilla.redhat.com/show_bug.cgi?id=1202350
https://bugzilla.redhat.com/show_bug.cgi?id=1265694
In order to be able to process disk storage pool's using a multipath
device to handle the partitions, libvirt_parthelper will need a way to
not automatically add a partition separator "p" to the generated device
name for each partition found. This is designed to mimic the multipath
features known as 'user_friendly_names' and custom 'alias' name.
If the part_separator attribute is set to "no", then generation of the
multipath partition name will not include the "p" partition separator
unless the source device path name ends with a number. The generated
partition names that get passed back to libvirt are processed in order
to find the device mapper multipath (dm-#) path device.
For example, device path "/dev/mapper/mpatha" would create partitions
"/dev/mapper/mpatha1", "/dev/mapper/mpatha2", etc. instead of
"/dev/mapper/mpathap1", "/dev/mapper/mpathap2", etc. If the device
path ends with a number "/dev/mapper/mpatha1", then the algorithm
to generate names "/dev/mapper/mpatha1p1", "/dev/mapper/mpatha1p2", etc.
would be utilized.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Add a new storage pool source device attribute 'part_separator=[yes|no]'
in order to allow a 'disk' storage pool using a device mapper multipath
device to not add the "p" partition separator to the generated device
name when libvirt_parthelper is run.
This will allow libvirt to find device mapper multipath devices which were
configured in /etc/multipath.conf to use 'user_friendly_names' or custom
'alias' names for the LUN.
Since we pass dummy variables @fdout and @fdoutlen into
virNetClientProgramCall() we make it alloc @fdout array (even
though it's an array of 0 elements since vitlogd can hardly pass
us some FDs at this stage). Nevertheless, it's an allocation not
followed by free():
==29385== 0 bytes in 60 blocks are definitely lost in loss record 2 of 1,009
==29385== at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==29385== by 0x54B99EF: virAllocN (viralloc.c:191)
==29385== by 0x56821B1: virNetClientProgramCall (virnetclientprogram.c:359)
==29385== by 0x563B304: virLogManagerDomainReadLogFile (log_manager.c:272)
==29385== by 0x217CD613: qemuDomainLogContextRead (qemu_domain.c:2485)
==29385== by 0x217EDC76: qemuProcessReadLog (qemu_process.c:1660)
==29385== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==29385== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==29385== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
==29385== by 0x217F71A4: qemuProcessStart (qemu_process.c:5152)
==29385== by 0x21846582: qemuDomainObjStart (qemu_driver.c:7396)
==29385== by 0x218467DE: qemuDomainCreateWithFlags (qemu_driver.c:7450)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So I can observe this crasher that with freshly started daemon
(and virtlogd enabled) I am trying to startup a domain that
immediately dies (because it's said to use huge pages but I
haven't allocated a single one in the pool). Hardly reproducible
with -O0 or under valgrind. But I just got lucky:
==20469== Invalid write of size 8
==20469== at 0x4C2E99B: memcpy@GLIBC_2.2.5 (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469== by 0x217EDD07: qemuProcessReadLog (qemu_process.c:1670)
==20469== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
==20469== by 0x217F71A4: qemuProcessStart (qemu_process.c:5152)
==20469== by 0x21846582: qemuDomainObjStart (qemu_driver.c:7396)
==20469== by 0x218467DE: qemuDomainCreateWithFlags (qemu_driver.c:7450)
==20469== by 0x21846845: qemuDomainCreate (qemu_driver.c:7468)
==20469== by 0x5611CD0: virDomainCreate (libvirt-domain.c:6753)
==20469== by 0x125D9A: remoteDispatchDomainCreate (remote_dispatch.h:3613)
==20469== by 0x125CB7: remoteDispatchDomainCreateHelper (remote_dispatch.h:3589)
==20469== Address 0x27a52ad0 is 0 bytes after a block of size 5,584 alloc'd
==20469== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20469== by 0x9B8D1DB: xdr_string (in /lib64/libc-2.21.so)
==20469== by 0x563B39C: xdr_virLogManagerProtocolNonNullString (log_protocol.c:24)
==20469== by 0x563B6B7: xdr_virLogManagerProtocolDomainReadLogFileRet (log_protocol.c:123)
==20469== by 0x164B34: virNetMessageDecodePayload (virnetmessage.c:407)
==20469== by 0x5682360: virNetClientProgramCall (virnetclientprogram.c:379)
==20469== by 0x563B30E: virLogManagerDomainReadLogFile (log_manager.c:272)
==20469== by 0x217CD613: qemuDomainLogContextRead (qemu_domain.c:2485)
==20469== by 0x217EDC76: qemuProcessReadLog (qemu_process.c:1660)
==20469== by 0x217EDE1D: qemuProcessReportLogError (qemu_process.c:1696)
==20469== by 0x217EE8C1: qemuProcessWaitForMonitor (qemu_process.c:1957)
==20469== by 0x217F6636: qemuProcessLaunch (qemu_process.c:4955)
This points to memmove() in qemuProcessReadLog(). Imagine we just
read the following string from qemu:
"abc\n2016-01-18T09:40:44.022744Z qemu-system-x86_64: Error\n"
After the first pass of the while() loop in the
qemuProcessReadLog() (in which we have taken the false branch in
the if) @buf still points to the beginning of the string,
@filter_next points to the beginning of the second line. So we
start second iteration because there is yet another newline
character at the end. In this iteration @eol points to it
actually. Now, the control gets inside true branch of if(). Just
to remind you:
got = 58
filter_next = buf + 5,
eol = buf + 58.
Therefore skip = 54 which is correct. The message we want to skip
is 54 bytes long. However:
memmove(filter_next, eol + 1, (got - skip) +1);
which is
memmove(filter_next, eol + 1, 5)
is obviously wrong as there is only one byte we can access, not 5!
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When building with gcc-5 (particularly gcc-5.3.0 now) and having pdwtags
installed (package dwarves) make check fails with the following error:
$ make lock_protocol-struct
GEN lock_protocol-struct
--- lock_protocol-structs 2016-01-13 15:04:59.318809607 +0100
+++ lock_protocol-struct-t3 2016-01-13 15:05:17.703501234 +0100
@@ -26,10 +26,6 @@
virLockSpaceProtocolNonNullString name;
u_int flags;
};
-enum virLockSpaceProtocolAcquireResourceFlags {
- VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_SHARED = 1,
- VIR_LOCK_SPACE_PROTOCOL_ACQUIRE_RESOURCE_AUTOCREATE = 2,
-};
struct virLockSpaceProtocolAcquireResourceArgs {
virLockSpaceProtocolNonNullString path;
virLockSpaceProtocolNonNullString name;
Makefile:10415: recipe for target 'lock_protocol-struct' failed
make: *** [lock_protocol-struct] Error 1
That happens because without any specific options gcc doesn't keep enum
information in the resulting binary object. I managed to isolate the
parameters of gcc that caused this issue to disappear, however I
remember that they influenced the resulting binaries quite a bit and
were definitely not something we would want to add as mandatory to the
build process.
So to deal with this cleanly, let's take that enum and separate it out
to its own header file. Since it is only used in the lockd driver and
the protocol, lock_driver_lockd.h feels like a suitable name.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This was reported in bug #1298024 where r would be filled with the
return code of rbd_open().
Should rbd_snap_unprotect() fail for any reason the virReportSystemError
call would return 'Success' since rbd_open() succeeded.
https://bugzilla.redhat.com/show_bug.cgi?id=1298024
Signed-off-by: Wido den Hollander <wido@widodh.nl>
A device tree binary file specified by /domain/os/dtb element is a
read-only resource similar to kernel and initrd files. We shouldn't
restore its label when destroying a domain to avoid breaking other
domains configure with the same device tree.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Kernel/initrd files are essentially read-only shareable images and thus
should be handled in the same way. We already use the appropriate label
for kernel/initrd files when starting a domain, but when a domain gets
destroyed we would remove the labels which would make other running
domains using the same files very unhappy.
https://bugzilla.redhat.com/show_bug.cgi?id=921135
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We have this function qemuAgentNotifyEvent() which is supposed to
be called from thread pool responsible for processing qemu
monitor events. The function then should wake up other thread
that is waiting for a guest to shutdown or reboot. However, if we
have received a different error a warning is printed out. This
warning lacks info on which event is expected.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id '90b721e43' moved where the virCgroupAddTask was made until
after the check for the vcpupin checks. However, in doing so it missed
an option where if the cpumap didn't exist, then the code would continue
back to the top of the current vcpu loop. The results was that the
virCgroupAddTask wouldn't be called.
Signed-off-by: John Ferlan <jferlan@redhat.com>
This reverts commit a41c00b472.
After much testing and upstream discussion this has been deemed to be
the incorrect operation since it means we no longer have any guarantee
about which resource controllers the QEMU processes in general are in.
There is no need to deny writes on a readonly mount: write still
won't be accepted, even if the user remounts the folder as RW in
the guest as qemu sets the 9p mount as ro.
This deny rule was leading to problems for example with readonly /:
The qemu process had to write to a bunch of files in / like logs,
sockets, etc. This deny rule was also preventing auditing of these
denials, making it harder to debug.
Commit id '7bf3198df' neglected to initialize deflate leading to a
possibility if model allocation/checks fail, then the VIR_FREE(deflate)
would be erroneous. Noted by Jan Tomko.
So, you try to start a domain, but before we even get to the part
where chardev part of qemu command line is generated (and
possibly missing path to unix sockets is made up) an error occurs
which results in calling qemuProcessStop. This will then try to
clean up the mess and possibly ends up calling unlink(NULL).
==8085== Thread 3:
==8085== Syscall param unlink(pathname) points to unaddressable byte(s)
==8085== at 0xA85EA57: unlink (in /lib64/libc-2.21.so)
==8085== by 0x213D3C24: qemuProcessCleanupChardevDevice (qemu_process.c:2866)
==8085== by 0x558D6B1: virDomainChrDefForeach (domain_conf.c:22924)
==8085== by 0x213DA9AE: qemuProcessStop (qemu_process.c:5326)
==8085== by 0x213DA2F2: qemuProcessStart (qemu_process.c:5190)
==8085== by 0x2142957F: qemuDomainObjStart (qemu_driver.c:7396)
==8085== by 0x214297DB: qemuDomainCreateWithFlags (qemu_driver.c:7450)
==8085== by 0x21429842: qemuDomainCreate (qemu_driver.c:7468)
==8085== by 0x5611B95: virDomainCreate (libvirt-domain.c:6753)
==8085== by 0x125D9A: remoteDispatchDomainCreate (remote_dispatch.h:3613)
==8085== by 0x125CB7: remoteDispatchDomainCreateHelper (remote_dispatch.h:3589)
==8085== by 0x568BF41: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==8085== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==8085==
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commmit fd2e3c4c used the domctl version 8 structure for version 9
in the xen_getdomaininfolist union, resulting in insufficient buffer
size (and subsequent memory corruption) for the GETDOMAININFOLIST
ioctl.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Autodeflate can be enabled/disabled for memballon device
of model 'virtio'.
xml:
<devices>
<memballoon model='virtio' autodeflate='on'/>
</devices>
qemu:
qemu -device virtio-balloon-pci,...,deflate-on-oom=on
Autodeflate cannot be enabled/disabled for running domain.
Excessive memory balloon inflation can cause invocation of OOM-killer,
when Linux is under severe memory pressure. QEMU memballoon device
has a feature to release some memory at the last moment before some
process will be get killed by OOM-killer.
Introduce a new optional balloon device attribute 'autodeflate' to
enable or disable this feature.
On every socket connect(2) attempt we were re-launching session
libvirtd, up to 100 times in 5 seconds.
This understandably caused some weird load races and intermittent
qemu:///session startup failures
https://bugzilla.redhat.com/show_bug.cgi?id=1271183
When we autolaunch libvirtd for session URIs, we spin in a retry
loop waiting for the daemon to start and the connect(2) to succeed.
However if we exceed the retry count, we don't explicitly raise an
error, which can yield a slew of different error messages elsewhere
in the code.
Explicitly raise the last connect(2) failure if we run out of retries.
- Add some debugging
- Make the loop dependent only on retries
- Make it explicit that connect(2) success exits the loop
- Invert the error checking logic
Commit 2b6f6ad introduced the virxdrdefs.h header with
common definitions to be included in the protocol files,
but logging/log_protocol.x was missed, so add it there as well.
Hopefully this fixes build on OS X.
When we are receiving data in smaller chunks it might happen that
virNetServerClientDispatchRead() will be called multiple times. And as
that happens, if it is a message that also transfer headers, we decode
the number of them every single time and, unfortunately, also allocate
the memory for them. That causes a leak, in the best scenario.
Best viewed with '-w'.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
if instanceId is NULL
When virNetDevVPortProfileGetStatus() was called with instanceId =
NULL (which is the case for all DISASSOCIATE requests in 802.1Qbh) it
would log the following error:
Could not find netlink response with expected parameters
even though the disassociate had been successfully completely. Then,
due to the fortunate coincidence of status having been initialized to
0 and then not changed when the "failure" was encountered, it would
still return a status of 0 (PORT_VDP_RESPONSE_SUCCESS), so the caller
would assume a successful operation.
This would result in a spurious log message though, and would fill in
LastErrorMessage, so that the API would return that error if it
happened during cleanup from some other error. That, in turn, would
lead to an incorrect supposition that the response to the port profile
disassociate was the cause of the failure.
During debugging, I noticed that the VF in question usually had *no
uuid* associated with it (big surprise)by the time the disassociate
completed, so the solution is *not* to send the previous instanceId
down.
This patch fixes virNetDevVPortProfileGetStatus() to only check the
VF's uuid in the status if it was given an instanceId to check against
when originally called. Otherwise it only checks that the particular
VF is present (it will be).
This does cause a slight difference in behavior - rather than
returning with status unchanged (and thus always 0) it will actually
get the IFLA_PORT_RESPONSE. This could lead to revelation of error
conditions we were previously ignoring. Or not. So far "not".
Use virDomainDefAddUSBController() to add an EHCI1+UHCI1+UHCI2+UHCI3
controller set to newly defined Q35 domains that don't have any USB
controllers defined.
This new function will add a single controller of the given model,
except the case of ich9-usb-ehci1 (the master controller for a USB2
controller set) in which case a set of related controllers will be
added (EHCI1, UHCI1, UHCI2, UHCI3). These controllers will not be
given PCI addresses, but should be otherwise ready to use.
"-1" is allowed for controller model, and means "default for this
machinetype". This matches the existing practice in
qemuDomainDefPostParse(), which always adds the default controller
with model = -1, and relies on the commandline builder to set a model
(that is wrong, but will be fixed later).
We need a virDomainDefAddController() that doesn't check for an
existing controller at the same index (since USB2 controllers must be
added in sets of 4 that are all at the same index), so rather than
duplicating the code in virDomainDefMaybeAddController(), split it
into two functions, in the process eliminating existing duplicated
code that loops through the controller list by calling
virDomainControllerFind(), which does the same thing).
The real Q35 machine puts the first USB controller set (EHCI+(UHCIx4))
on bus 0 slot 0x1D, and the 2nd USB controller set on bus 0 slot 0x1A,
so let's attempt to make the virtual machine match that for
controllers with auto-assigned addresses when possible.
Three test cases were added to assure that the proper addresses are
assigned - one with a single set of unaddressed USB controllers, one
with 3 (to grab both preferred slots plus one more), and one with the
order of the controller definitions reordered, to assure that the
auto-assignment isn't mixed up by order.
When qemuAssignDevicePCISlots() is looking for companion controllers
for a USB controller that has no PCI address specified, it initializes
a virDevicePCIAddress to 0000:00:00.0, fills it in with the
companion's address if one is found, then checks whether or not there
was a find based on slot == 0. On a system with a single PCI bus, that
is a valid way to check, because slot 0 is reserved, but on most other
PCI buses, slot 0 is not reserved, and is open for use by any
device. This patch adds a separate bool that is set when a companion
is found rather than relying on the faulty information provided with
"slot == 0".
Some of the protocol files already include handing of the missing int
types such as xdr_uint64_t, some don't. To fix it everywhere, move out
of the appropriate defines to the utils/virxdrdefs.h file and include
it where needed.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
OpenBSD uses 'struct sockpeercred' instead of 'struct ucred'. Add a
configure check that detects its presence and use if in the code that
could be compiled on OpenBSD.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
As cgroup implementation only works on Linux, it does not
make much sense to include sys/mount.h if other requirements are
not met, such as HAVE_MNTENT_H and HAVE_GETMNTENT_R.
Also, it fixes build on OpenBSD that requires to include sys/param.h
along with sys/mount.h.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com>
While this is no functional change, whole channel definition is
going to be needed very soon. Moreover, while touching this obey
const correctness rule in qemuAgentOpen() - so far it was passed
regular pointer to channel config even though the function is
expected to not change pointee at all. Pass const pointer
instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In qemu driver we listen to virtio channel events like an agent
connected to or disconnected from the guest part of socket.
However, with a little exception - when we find out that the
socket in question is the guest agent one, we connect or
disconnect guest agent which is done prior setting new state in
internal structure. Due to a bug in our code it may happen that
we got the event but failed to set it in internal structure
representing the channel.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit b22344f328 mistakenly reordered
Default-* lines. Thanks to that I noticed that we are very inconsistent
with our init scripts, so I took the liberty of synchronizing them,
updating them and making them all look shiny and new. So apart from
fixing the LSB requirements, I also fixed the ordering, specified
runlevels and fix the link to the reference specification.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We have a policy that if API may end up talking to a guest agent
it should require RW connection. We don't obey the rule in
virDomainGetTime().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This API does not change domain state. However, we have a policy
that an API talking to a guest agent requires RW access. But that
happens only if source == VIR_DOMAIN_INTERFACE_ADDRESSES_SRC_AGENT.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Earlier commit 7140807917 forgot to deal
properly with status XMLs where we want the libvirt-internal paths to be
kept in place and not cleared, otherwise we could end up copying a NULL
string and segfaulting th daemon.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If the q35 specific disable s3/s4 setting isn't supported, fallback to
specifying the PIIX setting, which is the previous behavior. It doesn't
have any effect, but qemu will just warn about it rather than error:
qemu-system-x86_64: Warning: global PIIX4_PM.disable_s3=1 not used
qemu-system-x86_64: Warning: global PIIX4_PM.disable_s4=1 not used
Since it doesn't error, I don't think we should either, since there
may be configs in the wild that already have q35 + disable_s3/4 (via
virt-manager)
This function may be called with @dconnuri == NULL, e.g. from
virDomainMigrateToURI3() if the flags are missing
VIR_MIGRATE_PEER2PEER flag. Moreover, all later functions called
from here do wrap it into NULLSTR() so why not do the same here?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The condition was checking for UHCI (and OHCI for ppc64) availability so
that it can specify the proper device instead of legacy usb. However,
for ppc64, we don't need to check both OHCI and UHCI, but only OHCI as
that is the legacy default. The condition is so big that it was just a
matter of time when someone will make a mistake there, so let's use more
lines so that it is visible what the condition checks for.
This fixes usage of -device instead of -usb for ppc64 that supports
pci-usb-ohci and does not support piix3-usb-uhci.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1297020
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The libxl_device_nic structure supports specifying an outgoing rate
limit based on a time interval and bytes allowed per interval. In xl
config a rate limit is specified as "<RATE>/s@<INTERVAL>". INTERVAL
is optional and defaults to 50ms.
libvirt expresses outgoing limits by average (required), peak, burst,
and floor attributes in units of KB/s. This patch supports the outgoing
bandwidth limit by converting the average KB/s to bytes per interval
based on the same default interval (50ms) used by xl.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Both xm and xl config have long supported specifying vif rate
limiting, e.g.
vif = [ 'mac=00:16:3E:74:3d:76,bridge=br0,rate=10MB/s' ]
Add support for mapping rate to and from <bandwidth> in the xenconfig
parser and formatter. rate is mapped to the required 'average' attribute
of the <outbound> element, e.g.
<interface type='bridge'>
...
<bandwidth>
<outbound average='10240'/>
</bandwidth>
</interface>
Also add a unit test to check the conversion logic.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The xen sexpr config format has long supported specifying vif rate
limiting, e.g.
(device
(vif
(mac '00:16:3e:1b:b1:47')
(rate '10240KB/s')
...
)
)
Add support for mapping rate to and from <bandwidth> in the xenconfig
sexpr parser and formatter. rate is mapped to the required 'average'
attribute of the <outbound> element, e.g.
<interface type='bridge'>
...
<bandwidth>
<outbound average='10240'/>
</bandwidth>
</interface>
Also add unit tests to check the conversion logic.
This patch benefits both the old xen driver and the libxl driver.
Both drivers gain support for vif bandwidth when converting to/from
domXML and xen-sxpr. In addition, the old xen driver will now be
able to handle vif 'rate' setting when communicating with xend.
memory_dirty_rate corresponds to dirty-pages-rate in QEMU and
memory_iteration is what QEMU reports in dirty-sync-count.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The structure actually contains migration statistics rather than just
the status as the name suggests. Renaming it as
qemuMonitorMigrationStats removes the confusion.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
A migration is in "setup" state after it was "inactive" and before it
becomes "active". Let's reflect this in our migration status enum.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
This patch partially reverts previous commit 91a00424 and moves the post
parse function to xenParseSxpr. This update is required because xen
driver calls xenParseSxpr directly.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
My commit 674afcb09e moved computing the
default listen address from qemuMigrationPrepareAny to
qemuMigrationPrepareIncoming. However, I didn't notice listenAddress was
later passed to qemuMigrationStartNBDServer. Thus, it would be called
with the original value of listenAddress (NULL).
Let's add the updated listen address to qemuProcessIncomingDef and use
it when starting NBD servers.
Reported-by: Michael Chapman <mike@very.puzzling.org>
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Most of the changes to the list of active and inactive PCI devices
happen in virHostdev, where they are properly logged.
virPCIDeviceDetach() and virPCIDeviceReattach(), however, change the
inactive list as well, so they should be logging similar messages.
Instead of misusing a const string to hold up runtime allocated
data, introduce new variable @hoststr and obey const correctness.
==6879== 15 bytes in 1 blocks are definitely lost in loss record 68 of 1,064
==6879== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6879== by 0xA7DDF97: vasprintf (in /lib64/libc-2.21.so)
==6879== by 0x552BBC6: virVasprintfInternal (virstring.c:493)
==6879== by 0x552BCDB: virAsprintfInternal (virstring.c:514)
==6879== by 0x54FA44C: virLogHostnameString (virlog.c:468)
==6879== by 0x54FAB0F: virLogVMessage (virlog.c:645)
==6879== by 0x54FA680: virLogMessage (virlog.c:531)
==6879== by 0x54FBBF4: virLogParseOutputs (virlog.c:1130)
==6879== by 0x11CB4F: daemonSetupLogging (libvirtd.c:685)
==6879== by 0x11E137: main (libvirtd.c:1297)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Once @hostname is printed into @hoststr we don't need it anymore.
==6879== 5 bytes in 1 blocks are definitely lost in loss record 10 of 1,064
==6879== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==6879== by 0xA7ED599: strdup (in /lib64/libc-2.21.so)
==6879== by 0x552C126: virStrdup (virstring.c:726)
==6879== by 0x553B13E: virGetHostnameImpl (virutil.c:720)
==6879== by 0x553B1BF: virGetHostnameQuiet (virutil.c:741)
==6879== by 0x54FA3FD: virLogHostnameString (virlog.c:462)
==6879== by 0x54FAB0F: virLogVMessage (virlog.c:645)
==6879== by 0x54FA680: virLogMessage (virlog.c:531)
==6879== by 0x54FBBF4: virLogParseOutputs (virlog.c:1130)
==6879== by 0x11CB4F: daemonSetupLogging (libvirtd.c:685)
==6879== by 0x11E137: main (libvirtd.c:1297)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If user defines a virtio channel with UNIX socket backend and doesn't
care about the path for the socket (e.g. qemu-agent channel), we still
generate it into the persistent XML. Moreover when then user renames
the domain, due to its persistent socket path saved into the per-domain
directory, it will not start. So let's forget about old generated paths
and also stop putting them into the persistent definition.
https://bugzilla.redhat.com/show_bug.cgi?id=1278068
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
If no port number was provided for a storage pool libvirt defaults to
port 6789; however, librbd/librados already default to 6789 when no port
number is provided.
In the future Ceph will switch to a new port for the Ceph monitors since
port 6789 is already assigned to a different application by IANA.
Port 6789 is assigned to SMC-HTTPS and Ceph now has port 3300 assigned as
the 'Ceph monitor' port.
In this case it is the best solution to not hardcode any port number into
libvirt and let librados handle the connection.
Only if a user specifies a different port number we pass it down to librados,
otherwise we leave it blank.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
merge
It could happen that rbd_list() returns X names, but that while
refreshing the pool one of those RBD images is removed from Ceph
through a different route then libvirt.
We do not need to error out in such case, we can simply ignore the
volume and continue.
error : volStorageBackendRBDRefreshVolInfo:289 :
failed to open the RBD image 'vol-998': No such file or directory
It could also be that one or more Placement Groups (PGs) inside Ceph
are inactive due to a system failure.
If that happens it could be that some RBD images can not be refreshed
and a timeout will be raised by librados.
error : volStorageBackendRBDRefreshVolInfo:289 :
failed to open the RBD image 'vol-893': Connection timed out
Ignore the error and continue to refresh the rest of the pool's
contents.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
It could be that we error out while the RBD image has not been
opened yet. This would cause us to call rbd_close() on pointer
which has not been initialized.
Set it to NULL by default and only close if it is not NULL.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Currently pkg build of master branch fails:
[ 300s] + /usr/lib/rpm/brp-boot-scripts
[ 300s] E: File `virtlogd' is missing `Required-Start', please add even if empty!
[ 300s] W: File `virtlogd' is missing `Required-Stop', please add even if empty!
[ 300s] E: File `virtlogd' has empty `Default-Start', please specify default runlevel(s)!
[ 300s] ERROR: found one or more broken init or boot scripts, please fix them.
[ 300s] For more information about LSB headers please read the manual
[ 300s] page of of insserv by executing the command `man 8 insserv'.
[ 300s] If you don't understand this, mailto=werner@suse.de
[ 300s] error: Bad exit status from /var/tmp/rpm-tmp.44965 (%install)
Add the required tags, fix the existing tags.
Use soft dependency "Should-Start" because virtlogd may work without network.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
The virtlogd initscript's lock file should go in /var/lock/subsys/, not
(the nonexistent) /var/log/subsys/.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Just recently, qemu forbade specifying format for sourceless
disks (qemu commit 39c4ae941ed992a3bb5). It kind of makes sense.
If there's no file to open, why specify its format. Anyway, I
have a domain like this:
<disk type='file' device='cdrom'>
<driver name='qemu' type='raw'/>
<target dev='hda' bus='ide'/>
<readonly/>
<address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
and obviously I am unable to start it. Therefore, a fix on our
side is needed too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id 'aeb1078ab' added a buildPool option and failure path which
calls virStoragePoolObjRemove, which unlocks the pool, clears the 'pool'
variable, and goto cleanup. However, at cleanup virStoragePoolObjUnlock
is called without check if pool is non NULL.
Commit id '70ffa02fc' added the data.file.append option to some
VIR_DOMAIN_CHR_TYPE_FILE cases in switch statements allowing the
code to "fall through" for the remainder of the cases. This causes
angst in code profiling tools, like Coverity since there is no break;
followed by more case conditions. Adjust the logic to be more specific
within each case.
Signed-off-by: Dmitry Mishin <dim@virtuozzo.com>
For completeness, use the VIR_TRISTATE_SWITCH_ABSENT for data.file.append
comparisons. Commit ids '70ffa02f' and '53a15aed' just went with the non
zero comparison.
We have few code samples there that are almost unreadable when formatted
because they are not indented properly. By indenting them they are
formatted as code and hence quite readable. Also adjust descriptions to
be comments and add semicolons so that the code sample looks like sample
of a working code.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
While reviewing 1b43885d17 I've noticed a virReportError()
followed by a goto endjob; without setting the correct return
value. Problem is, if block job is so fast that it's bandwidth
does not fit into ulong, an error is reported. However, by that
time @ret is already set to 1 which means success. Since the
scenario can be hardly considered successful, we should return a
value meaning error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Due to debug logs like this:
virPCIGetDeviceAddressFromSysfsLink:2432 : Attempting to resolve device path from device link '/sys/class/net/eth1/device/virtfn6'
logStrToLong_ui:2369 : Converted '0000:07:00.7' to unsigned int 0
logStrToLong_ui:2369 : Converted '07:00.7' to unsigned int 7
logStrToLong_ui:2369 : Converted '00.7' to unsigned int 0
logStrToLong_ui:2369 : Converted '7' to unsigned int 7
virPCIGetDeviceAddressFromSysfs:1947 : virPCIDeviceAddress 0000:07:00.7
virPCIGetVirtualFunctions:2554 : Found virtual function 7
printed *once for each SR-IOV Virtual Function* of a Physical Function
each time libvirt retrieved the list of VFs (so if the system has 128
VFs, there would be 900 lines of log for each call), the debug logs on
any system with a large number of VFs was dominated by "information"
that was possibly useful for debugging when the code was being
written, but is now useless for debugging of any problem on a running
system, and only serves to obscure the real useful information. This
overkill has no place in production code, so this patch removes it.
The previous error message just indicated that the desired response
couldn't be found, this patch tells what was desired, as well as
listing out the entire table that had been in the netlink response, to
give some kind of idea why it failed.
I noticed in a log file that we had failed to set a MAC address. The
log said which interface we were trying to set, but didn't give the
offending MAC address, which could have been useful in determining the
source of the problem. This patch modifies all three places in the
code that set MAC addresses to report the failed MAC as well as
interface.
This used to return 'unkown' and that was not correct.
A vol-dumpxml now returns:
<volume type='network'>
<name>image3</name>
<key>libvirt/image3</key>
<source>
</source>
<capacity unit='bytes'>10737418240</capacity>
<allocation unit='bytes'>10737418240</allocation>
<target>
<path>libvirt/image3</path>
<format type='raw'/>
</target>
</volume>
The RBD driver will now error out if a different format than RAW
is provided when creating a volume.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Valgrind complained:
==28277== 38 bytes in 1 blocks are definitely lost in loss record 298 of 957
==28277== at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==28277== by 0x82D7F57: __vasprintf_chk (in /lib64/libc-2.12.so)
==28277== by 0x52EF16A: virVasprintfInternal (stdio2.h:199)
==28277== by 0x52EF25C: virAsprintfInternal (virstring.c:514)
==28277== by 0x52B1FA9: virFileBuildPath (virfile.c:2831)
==28277== by 0x19B1947C: storageDriverAutostart (storage_driver.c:191)
==28277== by 0x19B196A7: storageStateAutoStart (storage_driver.c:307)
==28277== by 0x538527E: virStateInitialize (libvirt.c:793)
==28277== by 0x11D7CF: daemonRunStateInit (libvirtd.c:947)
==28277== by 0x52F4694: virThreadHelper (virthread.c:206)
==28277== by 0x6E08A50: start_thread (in /lib64/libpthread-2.12.so)
==28277== by 0x82BE93C: clone (in /lib64/libc-2.12.so)
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Valgrind complained:
==18990== 20 (16 direct, 4 indirect) bytes in 1 blocks are definitely lost in loss record 188 of 996
==18990== at 0x4A057BB: calloc (vg_replace_malloc.c:593)
==18990== by 0x5292E9B: virAllocN (viralloc.c:191)
==18990== by 0x2221E731: qemuMigrationCookieXMLParseStr (qemu_migration.c:1012)
==18990== by 0x2221F390: qemuMigrationEatCookie (qemu_migration.c:1413)
==18990== by 0x222228CE: qemuMigrationPrepareAny (qemu_migration.c:3463)
==18990== by 0x22224121: qemuMigrationPrepareDirect (qemu_migration.c:3865)
==18990== by 0x22251C25: qemuDomainMigratePrepare3Params (qemu_driver.c:12414)
==18990== by 0x5389EE0: virDomainMigratePrepare3Params (libvirt-domain.c:5107)
==18990== by 0x1278DB: remoteDispatchDomainMigratePrepare3ParamsHelper (remote.c:5425)
==18990== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==18990== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
==18990== by 0x54052C7: virNetServerHandleJob (virnetserver.c:156)
==18990==
==18990== 20 (16 direct, 4 indirect) bytes in 1 blocks are definitely lost in loss record 189 of 996
==18990== at 0x4A057BB: calloc (vg_replace_malloc.c:593)
==18990== by 0x5292E9B: virAllocN (viralloc.c:191)
==18990== by 0x2221E731: qemuMigrationCookieXMLParseStr (qemu_migration.c:1012)
==18990== by 0x2221F390: qemuMigrationEatCookie (qemu_migration.c:1413)
==18990== by 0x222249D2: qemuMigrationRun (qemu_migration.c:4395)
==18990== by 0x22226365: doNativeMigrate (qemu_migration.c:4693)
==18990== by 0x22228E45: qemuMigrationPerform (qemu_migration.c:5553)
==18990== by 0x2225144B: qemuDomainMigratePerform3Params (qemu_driver.c:12621)
==18990== by 0x539F5D8: virDomainMigratePerform3Params (libvirt-domain.c:5206)
==18990== by 0x127305: remoteDispatchDomainMigratePerform3ParamsHelper (remote.c:5557)
==18990== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==18990== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
If we're replacing the NBD data, it's simplest to free the old object
(including the disk list) and allocate a new one.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Valgrind complained:
==23975== Conditional jump or move depends on uninitialised value(s)
==23975== at 0x22255FA6: qemuDomainGetBlockJobInfo (qemu_driver.c:16538)
==23975== by 0x538E97C: virDomainGetBlockJobInfo (libvirt-domain.c:9685)
==23975== by 0x12F740: remoteDispatchDomainGetBlockJobInfoHelper (remote.c:2834)
==23975== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==23975== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
==23975== by 0x54052C7: virNetServerHandleJob (virnetserver.c:156)
==23975== by 0x52F515B: virThreadPoolWorker (virthreadpool.c:145)
==23975== by 0x52F4668: virThreadHelper (virthread.c:206)
==23975== by 0x6E08A50: start_thread (in /lib64/libpthread-2.12.so)
==23975== by 0x82BE93C: clone (in /lib64/libc-2.12.so)
==23975==
==23975== Conditional jump or move depends on uninitialised value(s)
==23975== at 0x22255FB4: qemuDomainGetBlockJobInfo (qemu_driver.c:16542)
==23975== by 0x538E97C: virDomainGetBlockJobInfo (libvirt-domain.c:9685)
==23975== by 0x12F740: remoteDispatchDomainGetBlockJobInfoHelper (remote.c:2834)
==23975== by 0x53FF287: virNetServerProgramDispatch (virnetserverprogram.c:437)
==23975== by 0x540523D: virNetServerProcessMsg (virnetserver.c:135)
==23975== by 0x54052C7: virNetServerHandleJob (virnetserver.c:156)
==23975== by 0x52F515B: virThreadPoolWorker (virthreadpool.c:145)
==23975== by 0x52F4668: virThreadHelper (virthread.c:206)
==23975== by 0x6E08A50: start_thread (in /lib64/libpthread-2.12.so)
==23975== by 0x82BE93C: clone (in /lib64/libc-2.12.so)
If no matching block job is found, qemuMonitorGetBlockJobInfo returns 0
and we should not write anything to the caller-supplied
virDomainBlockJobInfo pointer.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The manpage for sysconf() suggest including unistd.h as the
function is declared there. Even though we are not hitting any
compile issues currently, let's include the correct header file
instead of relying on some hidden include chain.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So you have a libvirt volume that you want to wipe out. But lets
say that the volume is actually a file stored on a journaled
filesystem. Overwriting it with zeroes or a pattern does not mean
that corresponding physical location on the disk is overwritten
too, due to journaling. It's the same story with network based
volumes, copy-on-write filesystems, and so on. Since there is no
way that an userland application can write onto specific areas on
disk, all that we can do is document the fact.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In case of prlsdkLoadDomains fails, vzOpenDefault should
clear connection privateData pointer every time its
memory is actually freed.
Also it is not necessary to call vzConnectClose if a call
to vzOpenDefault fails, because they both make cleanup of
connection privateData.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
By default, QEMU truncates serial file on open. Sometimes, it could be weird -
for example, when we are trying to investigate some event, which occured several
restarts ago. This patch adds an ability to preserve previous content.
Signed-off-by: Dmitry Mishin <dim@virtuozzo.com>
Currently, there is no possibility for user to specify desired behaviour of
output to file - truncate or append. This patch adds an ability to explicitly
specify that user wants to preserve file's content on reopen.
Signed-off-by: Dmitry Mishin <dim@virtuozzo.com>
prlsdkCleanupBridgedNet call should be made strongly after
any actual domain deletion accurs. By doing this we avoid
any potential problems connected with second undefine call
when it is made after first one fails by some reason, and
we detect that network is already deleted.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
Currently vz driver unregisters domains when undefine is called,
which is wrong because it contradicts with expected behavior.
All vz domains are persistent, which means that when one is
defined a new bundle directory containing meta data is created.
Undefining domains in a way we do now leaves those directories
undeleted, which prevents subsequent define call for the same
domain xml. I.e. the following sequence define->undefine->define
doesn't work now.
The patch fixes the problem by calling PrlVm_Delete instead of
PrlVm_Unreg detaching all disks prior actually doing this to
prevent images deletion.
Signed-off-by: Maxim Nestratov <mnestratov@virtuozzo.com>
We want to eventually factor out the code dealing with device detaching
and reattaching, so that we can share it and make sure it's called eg.
when 'virsh nodedev-detach' is used.
For that to happen, it's important that the lists of active and inactive
PCI devices are updated every time a device changes its state.
Instead of passing NULL as the last argument of virPCIDeviceDetach() and
virPCIDeviceReattach(), pass the proper list so that it can be updated.
This replaces the virPCIKnownStubs string array that was used
internally for stub driver validation.
Advantages:
* possible values are well-defined
* typos in driver names will be detected at compile time
* avoids having several copies of the same string around
* no error checking required when setting / getting value
The names used mirror those in the
virDomainHostdevSubsysPCIBackendType enumeration.
This internal function supports, in theory, binding to a different
stub driver than the one the PCI device has been configured to use.
In practice, it is only ever called like
virPCIDeviceBindToStub(dev, dev->stubDriver);
which makes its second parameter redundant. Get rid of it, along
with the extra string copy required to support it.
Commmit df8192aa introduced admin related rename and some minor
(caused by automated approach, aka sed) and some more severe isues along with
it. First reason to revert is the inconsistency with libvirt library.
Although we deal with the daemon directly rather than with a specific
hypervisor, we still do have a connection. That being said, contributors might
get under the impression that AdmDaemonNew would spawn/start a new daemon
(since it's admin API, why not...), or AdmDaemonClose would do the exact
opposite or they might expect DaemonIsAlive report overall status of the daemon
which definitely isn't the case.
The second reason to revert this patch is renaming virt-admin client. The
client tool does not necessarily have to reflect the names of the API's it's
using in his internals. An example would be 's/vshAdmConnect/vshAdmDaemon'
where noone can be certain of what the latter function really does. The former
is quite expressive about some connection magic it performs, but the latter does
not say anything, especially when vshAdmReconnect and vshAdmDisconnect were
left untouched.
From: Ian Campbell <ian.campbell@citrix.com>
xend prior to 4.0 understands vcpus as maxvcpus and vcpu_avail
as a bit map of which cpus are online (default is all).
xend from 4.0 onwards understands maxvcpus as maxvcpus and
vcpus as the number which are online (from 0..N-1). The
upstream commit (68a94cf528e6 "xm: Add maxvcpus support")
claims that if maxvcpus is omitted then the old behaviour
(i.e. obeying vcpu_avail) is retained, but AFAICT it was not,
in this case vcpu==maxcpus==online cpus. This is good for us
because handling anything else would be fiddly.
This patch changes parsing of the virDomainDef maxvcpus and vcpus
entries to use the corresponding 'maxvcpus' and 'vcpus' settings
from xm and xl config. It also drops use of the old Xen 3.x
'vcpu_avail' setting.
The change also removes the maxvcpus limit of MAX_VIRT_VCPUS (since
maxvcpus is simply a count, not a bit mask), which is particularly
crucial on ARM where MAX_VIRT_CPUS == 1 (since all guests are
expected to support vcpu placement, and therefore only the boot
vcpu's info lives in the shared info page).
Existing tests adjusted accordingly, and new tests added for the
'maxvcpus' setting.
Although they've been present for quite a while, they weren't added
to the API definition, so add them there to make it clearer.
Currently only the RBD backend even checks for any flags.
The initial commit '74951eade' did not include the proper check for whether
any flags are supported by the driver.
Even though the driver doesn't support VIR_STORAGE_VOL_DELETE_ZEROED,
it still checks and allows the processing to continue
Also add the new VIR_STORAGE_VOL_DELETE_WITH_SNAPSHOTS since it is handled
as of commit id '3c7590e0a'.
Commit id '71ce4759' altered the cgroup processing with respect to the
call to virCgroupAddTask being moved out from lower layers into the calling
layers especially for qemu processing of emulator and vcpu threads. The
movement affected lxc insomuch as it is possible for a code path to
return a NULL cgroup *and* a 0 return status via virCgroupNewPartition
failure when virCgroupNewIgnoreError succeeded when virCgroupNewMachineManual
returns. Coverity pointed out that would cause virCgroupAddTask to core.
This patch will check for a NULL cgroup as well as the negative return
and just return the NULL cgroup to the caller (as it would have previously)
Remove use of xendConfigVersion in the s-expresion config formatter/parser
in src/xenconfig/. Adjust callers in the xen and libxl drivers accordingly.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
It has been quite some time since xend required specifying cdroms
and fds in '(image (hvm ...))'. Remove the code from the parsing
and formatting functions and fixup the associated tests.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Remove use of xendConfigVersion in the xm and xl config formatter/parsers
in src/xenconfig/. Adjust callers in the xen and libxl drivers accordingly.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
https://bugzilla.redhat.com/show_bug.cgi?id=830056
Add flags handling to the virStoragePoolCreate and virStoragePoolCreateXML
API's which will allow the caller to provide the capability for the storage
pool create API's to also perform a pool build during creation rather than
requiring the additional buildPool step. This will allow transient pools
to be defined, built, and started.
The new flags are:
* VIR_STORAGE_POOL_CREATE_WITH_BUILD
Perform buildPool without any flags passed.
* VIR_STORAGE_POOL_CREATE_WITH_BUILD_OVERWRITE
Perform buildPool using VIR_STORAGE_POOL_BUILD_OVERWRITE flag.
* VIR_STORAGE_POOL_CREATE_WITH_BUILD_NO_OVERWRITE
Perform buildPool using VIR_STORAGE_POOL_BUILD_NO_OVERWRITE flag.
It is up to the backend to handle the processing of build flags. The
overwrite and no-overwrite flags are mutually exclusive.
NB:
This patch is loosely based upon code originally authored by Osier
Yang that were not reviewed and pushed, see:
https://www.redhat.com/archives/libvir-list/2012-July/msg01328.html
We only support hotplugging SCSI controllers.
The USB and virtio-serial related code was never reachable because
this function was only called for VIR_DOMAIN_CONTROLLER_TYPE_SCSI
controllers.
This reverts commit ee0d97a and parts of commits 16db8d2
and d6d54cd1.
This function calls qemuDomainAttachControllerDevice for SCSI
controllers and reports an error for all other controllers.
Move the error inside qemuDomainAttachControllerDevice and delete this
wrapper.
Commit id '71b803ac' assumed that the storage pool source device path
was required for a 'logical' pool. This resulted in a failure to start
a pool without any device path defined.
So, adjust the virStorageBackendLogicalMatchPoolSource logic to
return success if at least the pool name matches the vgs output
when no pool source device path is/are provided.
A closer review of the code shows that for the transition from paused to
running which was supposed to emit the VIR_DOMAIN_EVENT_RESUMED - no event
would be generated. Rather the event is generated when going from running
to running.
Following the 'was_running' boolean shows it is set when the domain obj
is active and the domain obj state is VIR_DOMAIN_RUNNING. So rather than
using was_running to generate the RESUMED event, use !was_running
https://bugzilla.redhat.com/show_bug.cgi?id=1270709
When a volume wipe is successful, perform a volume refresh afterwards to
update any volume data that may be used in future volume commands, such as
volume resize. For a raw file volume, a wipe could truncate the file and
a followup volume resize the capacity may fail because the volume target
allocation isn't updated to reflect the wipe activity.
The only caller always passes 0 for the extent start.
Drop the 'extent_start' parameter, as well as the mention of extents
from the function name.
Change off_t extent_length to unsigned long long wipe_len, as well as the
'remain' variable.
Return -1:
* on all failures of fdatasync. Instead of propagating -errno
all the way up to the virStorageVolWipe API, which is documented
to return 0 or -1.
* after a partial wipe. If safewrite failed, we would re-use the
non-negative return value of lseek (which should be 0 in this case,
because that's the only offset we seek to).
When the function changes the memory lock limit for the first time,
it will retrieve the current value and store it inside the
virDomainObj for the domain.
When the function is called again, if memory locking is no longer
needed, it will be able to restore the memory locking limit to its
original value.
We increase the limit before plugging in a PCI hostdev or a memory
module because some memory might need to be locked due to eg. VFIO.
Of course we should do the opposite after unplugging a device: this
was already the case for memory modules, but not for PCI hostdevs.
This function can be used to retrieve the current locked memory
limit for a process, so that the setting can be later restored.
Add a configure check for getrlimit(), which we now use.
The prlimit() function allows both getting and setting limits for
a process; expose the same functionality in our wrapper.
Add the const modifier for new_limit, in accordance with the
prototype for prlimit().
In commit 686eb7a24f, the break was not considered part of the
condition, hence breaking after first node when searching.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Instead of replicating the information (domain, bus, slot, function)
inside the virPCIDevice structure, use the already-existing
virPCIDeviceAddress structure.
For users of the module, this means that the object returned by
virPCIDeviceGetAddress() can no longer be NULL and must no longer
be freed by the caller.
Introduces support for domainGetJobStats which has the same
info as domainGetJobInfo but in a slightly different format.
Another difference is that virDomainGetJobStats can also
retrieve info on the most recently completed job. Though so
far this is only used in the source node to know if the
migration has been completed. But because we don't support
completed jobs we will deliver an error.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Introduce support for domainGetJobInfo to get info about the
ongoing job. If the job is active it will update the
timeElapsed which is computed with the "started" field added to
struct libxlDomainJobObj. For now we support just the very basic
info and all jobs have VIR_DOMAIN_JOB_UNBOUNDED (i.e. no completion
time estimation) plus timeElapsed computed.
Openstack Kilo uses the Job API to monitor live-migration
progress which is currently nonexistent in libxl driver and
therefore leads to a crash in the nova compute node. Right
now, migration doesn't use jobs in the source node and will
return VIR_DOMAIN_JOB_NONE. Though nova handles this case and
will migrate it properly instead of crashing.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1025230
Add a new helper virStorageBackendLogicalMatchPoolSource to compare the
pool's source name against the output from a 'pvs' command to list all
volume group physical volume data on the host. In addition, compare the
pool's source device list against the particular volume group's device
list to ensure the source device(s) listed for the pool match what the
was listed for the volume group.
Then for pool startup or check API's we need to call this new API in
order to ensure that the pool we're about to start or declare active
during checkPool has a valid definition vs. the running host.
Rework virStorageBackendLogicalFindPoolSources a bit to create a
helper virStorageBackendLogicalGetPoolSources that will make the
pvs call in order to generate a list of associated pv_name and vg_name's.
A future patch will make use of this for start/check processing to
ensure the storage pool source definition matches expectations.
https://bugzilla.redhat.com/show_bug.cgi?id=1025230
When determining whether a FS pool is mounted, rather than assuming that
the FS pool is mounted just because the target.path is in the mount list,
let's make sure that the FS pool source matches what is mounted
Refactor the code that builds the pool source string during the FS
storage pool mount to be a separate helper.
A future patch will use the helper in order to validate the mounted
FS matches the pool's expectation during poolCheck processing
when appropriate, of course. If the config for a domain specifies boot
order with <boot dev='blah'/> elements, e.g.:
<os>
...
<boot dev='hd'/>
<boot dev='network'/>
</os>
Then the first disk device in the config will have ",bootindex=1"
appended to its qemu commandline -device options, and the first (and
*only* the first) network interface device will get ",bootindex=2".
However, if the first network interface device is a "hostdev" device
(an SRIOV Virtual Function (VF) being assigned to the domain with
vfio), then the bootindex option will *not* be appended. This happens
because the bootindex=n option corresponding to the order of "<boot
dev='network'/>" is added to the -device for the first network device
when network device commandline args are constructed, but if it's a
hostdev network device, its commandline arg is instead constructed in
the loop for hostdevs.
This patch fixes that omission by noticing (in bootHostdevNet) if the
first network device was a hostdev, and if so passing on the proper
bootindex to the commandline generator for hostdev devices - the
result is that ",bootindex=2" will be properly appended to the first
"network" device in the config even if it is really a hostdev
(including if it is assigned from a libvirt network pool). (note that
this is only the case if there is no <bootmenu enabled='yes'/> element
in the config ("-boot menu-on" in qemu) , since the two are mutually
exclusive - when the bootmenu is enabled, the individual per-device
bootindex options can't be used by qemu, and we revert to using "-boot
order=xyz" instead).
If a greater level of control over boot order is desired (e.g., more
than one network device should be tried, or a network device other
than the first one encountered in the config), then <boot
dev='network'/> in the <os> element should not be used; instead, the
individual device elements in the config should be given a "<boot
order='n'/>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1278421
Many of the functions follow the pattern:
virSecurity.*Security.*Label
Remove the second 'Security' from the names, it should be
obvious that the virSecurity* functions deal with security
labels even without it.
Many of the functions follow the pattern:
virSecurity.*Security.*Label
Remove the second 'Security' from the names, it should be obvious
that the virSecurity* functions deal with security labels even
without it.
Many of the functions follow the pattern:
virSecurity.*Security.*Label
Remove the second 'Security' from the names, it should be obvious
that the virSecurity* functions deal with security labels even
without it.
Commit 256496e1 introduced a detection if "is locked" in error replay
from qemu monitor. Commit c4073657 fixed a memory leak, but it was
pointed out by Peter, that this could be done cleaner without
stringifing the replay.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The return value of virJSONValueToString() should be freed when
no longer needed. This is not the case after 256496e1.
==26902== 138 bytes in 2 blocks are definitely lost in loss record 1,051 of 1,239
==26902== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==26902== by 0xAA5F599: strdup (in /lib64/libc-2.21.so)
==26902== by 0x552BAD9: virStrdup (virstring.c:726)
==26902== by 0x54F60A7: virJSONValueToString (virjson.c:1790)
==26902== by 0x1DF6EBB9: qemuMonitorJSONEjectMedia (qemu_monitor_json.c:2225)
==26902== by 0x1DF57A4C: qemuMonitorEjectMedia (qemu_monitor.c:1985)
==26902== by 0x1DF1EF2D: qemuDomainChangeEjectableMedia (qemu_hotplug.c:199)
==26902== by 0x1DF90314: qemuDomainChangeDiskLive (qemu_driver.c:7985)
==26902== by 0x1DF90476: qemuDomainUpdateDeviceLive (qemu_driver.c:8030)
==26902== by 0x1DF91ED7: qemuDomainUpdateDeviceFlags (qemu_driver.c:8677)
==26902== by 0x561785F: virDomainUpdateDeviceFlags (libvirt-domain.c:8559)
==26902== by 0x134210: remoteDispatchDomainUpdateDeviceFlags (remote_dispatch.h:10966)
==26902== 106 bytes in 1 blocks are definitely lost in loss record 1,033 of 1,239
==26902== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==26902== by 0xAA5F599: strdup (in /lib64/libc-2.21.so)
==26902== by 0x552BAD9: virStrdup (virstring.c:726)
==26902== by 0x54F60A7: virJSONValueToString (virjson.c:1790)
==26902== by 0x1DF6EC0C: qemuMonitorJSONEjectMedia (qemu_monitor_json.c:2227)
==26902== by 0x1DF57A4C: qemuMonitorEjectMedia (qemu_monitor.c:1985)
==26902== by 0x1DF1EF2D: qemuDomainChangeEjectableMedia (qemu_hotplug.c:199)
==26902== by 0x1DF90314: qemuDomainChangeDiskLive (qemu_driver.c:7985)
==26902== by 0x1DF90476: qemuDomainUpdateDeviceLive (qemu_driver.c:8030)
==26902== by 0x1DF91ED7: qemuDomainUpdateDeviceFlags (qemu_driver.c:8677)
==26902== by 0x561785F: virDomainUpdateDeviceFlags (libvirt-domain.c:8559)
==26902== by 0x134210: remoteDispatchDomainUpdateDeviceFlags (remote_dispatch.h:10966)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Moving tasks to cgroups implied sched_setaffinity. Changing the cpus in
a set implies the same for all tasks in the group.
The old code put the the thread into the cpuset inherited from the
machine cgroup, which allowed it to run outside of vcpupin for a short
while.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
The machine cgroup is a superset, a parent to the emulator and vcpuX
cgroups. The parent cgroup should never have any tasks directly in it.
In fact the parent cpuset might contain way more cpus than the sum of
emulatorpin and vcpupins. So putting tasks in the superset will allow
them to run outside of <cputune>.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
virCgroupNewMachine used to add the pidleader to the newly created
machine cgroup. Do not do this implicit anymore.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
Firstly, there's a bug (or typo) in the only place where we call
this function: @multiqueue is set whenever @tapfdSize is greater
than zero, while in fact the condition should have been 'greater
than one'.
Then, secondly, since the condition depends on just one
variable, that we are even passing down to the function, we can
move the condition into the function and drop useless argument.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When user configures vhost-user interface and forgets to also configure
any shared memory, the search for the root cause of non-operational
interface might take unpleasantly long time. Let's enhance user
experience by emitting a warning in the logs.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1266982
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Some older systems, e.g. RHEL-6 do not have IFF_MULTI_QUEUE flag
which we use to enable multiqueue feature. Therefore one gets the
following compile error there:
CC util/libvirt_util_la-virnetdevmacvlan.lo
util/virnetdevmacvlan.c: In function 'virNetDevMacVLanTapSetup':
util/virnetdevmacvlan.c:338: error: 'IFF_MULTI_QUEUE' undeclared (first use in this function)
util/virnetdevmacvlan.c:338: error: (Each undeclared identifier is reported only once
util/virnetdevmacvlan.c:338: error: for each function it appears in.)
make[3]: *** [util/libvirt_util_la-virnetdevmacvlan.lo] Error 1
So, whenever user wants us to enable the feature on such systems,
we will just throw a runtime error instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The libvirt file system storage driver determines what file to
act on by concatenating the pool location with the volume name.
If a user is able to pick names like "../../../etc/passwd", then
they can escape the bounds of the pool. For that matter,
virStoragePoolListVolumes() doesn't descend into subdirectories,
so a user really shouldn't use a name with a slash.
Normally, only privileged users can coerce libvirt into creating
or opening existing files using the virStorageVol APIs; and such
users already have full privilege to create any domain XML (so it
is not an escalation of privilege). But in the case of
fine-grained ACLs, it is feasible that a user can be granted
storage_vol:create but not domain:write, and it violates
assumptions if such a user can abuse libvirt to access files
outside of the storage pool.
Therefore, prevent all use of volume names that contain "/",
whether or not such a name is actually attempting to escape the
pool.
This changes things from:
$ virsh vol-create-as default ../../../../../../etc/haha --capacity 128
Vol ../../../../../../etc/haha created
$ rm /etc/haha
to:
$ virsh vol-create-as default ../../../../../../etc/haha --capacity 128
error: Failed to create vol ../../../../../../etc/haha
error: Requested operation is not valid: volume name '../../../../../../etc/haha' cannot contain '/'
Signed-off-by: Eric Blake <eblake@redhat.com>