Instead of redoing the same filtering over and over everytime we need to
walk through all disks which are being migrated.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The APIs take the memory value in KiB and we store it in KiB
internally, but we cannot parse the whole ULONG_MAX range
on 64-bit systems, because virDomainParseScaledValue
needs to fit the value in bytes in an unsigned long long.
https://bugzilla.redhat.com/show_bug.cgi?id=1176739
We don't allow it in normal code, why would it need to be in the
generated one. IT also splits the line in perl code so it's readable.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since we don't have syntax-check for this, it has to be checked
manually. Let's hope this is the only place it happened.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This only affected the servers that re-exec themselves, which is only
virtlockd and it didn't do any mess, so this is mostly a clenaup.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Since 'autofill'd iothreadid entries are not written during XML format
processing, it is possible that if an iothreadid in the middle of an
autofilled list would then change it's id on a subsequent restart.
Thus during the iothreadid deletion, if we determine the delete is not
the "last" thread, then clear the autofill bit for all iothreadid's
following the one being deleted (either the first or one in the middle).
This way, iothreadid's will be printed/saved.
We have a lot of passing arguments code just to pass connection
object cause it holds jobTimeout. Taking into account that
right now this value is defined at compile time let's just
get rid of it and make arguments list more clear in many
places.
In case we later need some runtime configurable timeout
value we can provide this value through arguments
function already operate such as a parallels domain
object etc as this timeouts are operation( and thus
object) specific in practice.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@parallels.com>
As of eeb008dbfc the variable is not used anymore. Drop it.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In order not to bring in any link dependencies, bridge driver doesn't
use the usual stubs as other conditionally-built code does. However,
having the function as a macro imposes a problem with possibly unused
variables if just defined as "0". This was worked around by using
(dom=dom, iface=iface, 0) which should act like a 0 if used in a
condition. However, gcc still bugs about that, so I came up with
another way how to fix that.
Using static inline functions in the header won't collide with anything,
it fixes the bug and does one thing that the macro didn't do. It checks
whenther passed variables are pointers of compatible type. It has only
one downside, and that is that we need to either a) define it with
ATTRIBUTE_UNUSED, which needs an exception in cfg.mk or b) do something
like ignore_value(variable); in the function body. I went with the
first variant.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In the XML we have the vnc port number, but QEMU takes on command line
a vnc screen number, it's port-5900. We should fail with error message
that only ports in range [5900,65535] are valid.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1164966
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
While implementing support for SPICE, I noticed VNC passwd was
never copied to libxl_device_vfb's vnc.passwd field.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1171984https://bugzilla.redhat.com/show_bug.cgi?id=1188463
Remove the check for the source host name for iSCSI source XML processing
declaring duplicate sources when the source device path and if present the
initiator of a proposed storage pool matches an existing storage pool.
The backend iSCSI storage driver uses 'iscsiadm --mode session' to query
available iscsid target sessions. The output displayed is the IP address
and the IQN (target path) of known targets. The displayed IP address
is a resolved address based on the session --login. Additionally, iscsid
keeps track of the various ways to define the host name (IPv4 Address,
IPv6 Address, /etc/hosts, etc.) for that IQN (see output of an 'iscsiadm
--mode node'). If an incoming IQN matches and the host name provided by
libvirt is resolved to the existing IQN, then iscsid will "reuse" the
session. Although libvirt could do the same name resolution, if there
is a difference, iscsid could still declare two seemingly different sources
to be the same and not create a new session which means libvirt now has
two storage pools looking at the same source. Thus to avoid any strange
host name resolution issues, just rely on iscsid for that and do not
allow multiple pools on the same host to use the same device path (IQN).
Only perform the port number check if the incoming definition actually
provides it. Since the port number is optional we could erroneously pass
a duplicate source host check since some storage pool backends which fill
in the default port number (e.g., iSCSI and sheepdog) for the started pool.
https://bugzilla.redhat.com/show_bug.cgi?id=1220809
When cold-plugging an RNG device but something fails in
qemuDomainAssignAddresses, we will double free the RNG device.
Once a device is plugged into the domain, we should set the
device pointer to NULL to fix this issue.
...
5 0x00007fb7d180ac8a in virFree at util/viralloc.c:582
6 0x00007fb7d1895cdd in virDomainRNGDefFree at conf/domain_conf.c:19786
7 0x00007fb7d1895d99 in virDomainDeviceDefFree at conf/domain_conf.c:2022
8 0x00007fb7b92b8baf in qemuDomainAttachDeviceFlags at qemu/qemu_driver.c:8785
9 0x00007fb7d190c5d7 in virDomainAttachDeviceFlags at libvirt-domain.c:8488
10 0x00007fb7d23af9d2 in remoteDispatchDomainAttachDeviceFlags at remote_dispatch.h:2842
...
Signed-off-by: Luyao Huang <lhuang@redhat.com>
There is a lot of places, were it's pretty easy for user to enter some
characters that we need to escape to create a valid XML description.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1197580
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
The code to add device type to the commandline was identical for lsi
and other models of SCSI controllers, but was duplicated (with the
exception of a minor ordering difference of the if-else clauses) for
the two cases. This patch replaces those two with a single instance of
the code just before the if().
This patch makes qemuValideDevicePCISlotsChipsets() more consistent in
appearance by replacing several clauses of an if with the equivalent
call to qemuDomainMachineIsI440FX. The if was checking exactly the
same items, just in a slightly different order.
https://bugzilla.redhat.com/show_bug.cgi?id=1220265
Passing the return value to an enum directly is not safe. Fix this by
comparing the true integer result of virTristateSwitchTypeFromString().
Signed-off-by: Luyao Huang <lhuang@redhat.com>
For some reason, we allow a bridge name with %d in it, which we replace
with an unsigned integer to form a bridge name that does not yet exist
on the host.
Do not blindly pass it to virAsprintf if it's not the only conversion,
to prevent crashing on input like:
<network>
<name>test</name>
<forward mode='none'/>
<bridge name='virbr%d%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s'/>
</network>
Ignore any template strings that do not have exactly one %d conversion,
like we do in various drivers before calling virNetDevTapCreateInBridgePort.
Since libvirt doesn't call to update the new balloon size in qemu add
code that will handle tweaking of the size of the current balloon
statistic until qemu reports the new size using the event.
Specifying a balloon size more than the memory size of a guest isn't
something that should be rejected when parsing the XML. Truncate the
size to the maximum memory size.
Use the new domain list collection helpers to avoid going through
virDomainPtrs.
This additionally implements filter capability when called through the
api that accepts domain list filters.
Until now the virDomainListAllDomains API would lock the domain list and
then every single domain object to access and filter it. This would
potentially allow a unresponsive VM to block the whole daemon if a
*listAllDomains call would get stuck.
To avoid this problem this patch collects a list of referenced domain
objects first from the list and then unlocks it right away. The
expensive operation requiring locking of the domain object is executed
after the list lock is dropped. While a single blocked domain will still
lock up a listAllDomains call, the domain list won't be held locked and
thus other APIs won't be blocked.
Additionally this patch also fixes the lookup code, where we'd ignore
the vm->removing flag and thus potentially return domain objects that
would be deleted very soon so calling any API wouldn't make sense.
As other clients also could benefit from operating on a list of domain
objects rather than the public domain descriptors a new intermediate
API - virDomainObjListCollect - is introduced by this patch.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1181074
Extend it to a universal helper used for clearing lists of any objects.
Note that the argument type is specifically void * to allow implicit
typecasting.
Additionally add a helper that works on non-NULL terminated arrays once
we know the length.
My commit 747761a79 (v1.2.15 only) dropped this bit of logic when filling
in a default arch in the XML:
- /* First try to find one matching host arch */
- for (i = 0; i < caps->nguests; i++) {
- if (caps->guests[i]->ostype == ostype) {
- for (j = 0; j < caps->guests[i]->arch.ndomains; j++) {
- if (caps->guests[i]->arch.domains[j]->type == domain &&
- caps->guests[i]->arch.id == caps->host.arch)
- return caps->guests[i]->arch.id;
- }
- }
- }
That attempt to match host.arch is important, otherwise we end up
defaulting to i686 on x86_64 host for KVM, which is not intended.
Duplicate it in the centralized CapsLookup function.
Additionally add some testcases that would have caught this.
https://bugzilla.redhat.com/show_bug.cgi?id=1219191
https://bugzilla.redhat.com/show_bug.cgi?id=890648
So, imagine you've issued an API that involves guest agent. For
instance, you want to query guest's IP addresses. So the API acquires
QUERY_JOB, locks the guest agent and issues the agent command.
However, for some reason, guest agent replies to initial ping
correctly, but then crashes tragically while executing real command
(in this case guest-network-get-interfaces). Since initial ping went
well, libvirt thinks guest agent is accessible and awaits reply to the
real command. But it will never come. What will is a monitor event.
Our handler (processSerialChangedEvent) will try to acquire
MODIFY_JOB, which will fail obviously because the other thread that's
executing the API already holds a job. So the event handler exits
early, and the QUERY_JOB is never released nor ended.
The way how to solve this is to put flag somewhere in the monitor
internals. The flag is called @running and agent commands are issued
iff the flag is set. The flag itself is set when we connect to the
agent socket. And unset whenever we see DISCONNECT event from the
agent. Moreover, we must wake up all the threads waiting for the
agent. This is done by signalizing the condition they're waiting on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Running shutdown with mode agent on a shutoff domain gives cryptic
error message:
virsh # shutdown --mode agent gentoo
error: Failed to shutdown domain gentoo
error: Guest agent is not responding: QEMU guest agent is not connected
After this patch, the error is more clear:
virsh # shutdown --mode agent gentoo
error: Failed to shutdown domain gentoo
error: Requested operation is not valid: domain is not running
Reported-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Upping an interface for no reason and not configuring it is a cardinal sin.
With the default addrgenmode if eui64 it sticks a link-local address to the
interface. That is not good, as NetworkManager would see an address configured,
assume the interface is already configured and won't touch it iself and the
interface might stay unconfigured until the end of the days.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1124721
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Allow ccw devices to be used with multiqueues. ccw provides a one to
one relation of fds to queues and does not support the vectors option.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Coverity points out that qemuMonitorGetAllBlockStatsInfo could return a
-1 and thus not fill in 'stats' (leaving it NULL). Then the call to
qemuMonitorBlockStatsUpdateCapacity will dereference it.
Coverity complains over the [n]values pairing in virQEMUCapsFreeStringList
and rather than make a bunch if "if values" checks prior to calling, by
just adding the values check inside the free function we avoid the chance
that somehow nvalues is > 0, while values == NULL
Coverity points out it was possible to have a zero return from
qemuBuildRNGBackendProps thus not filling in 'props' and then
causing a NULL dereference on the next call.
Coverity found that xenXMConfigCacheAddFile has an error path in which
no error message and a -1 was not returned which could have resulted in
a NULL dereference in a VIR_DEBUG statement and of course an erroneous
0 value returned!
Coverity notes that ->ifname is used after the VIR_FREE done in the
code path after the call to virNetDevMacVLanDeleteWithVPortProfile
by a call to virNetDevOpenvswitchRemovePort.
Since the ->ifname will be VIR_FREE()'d eventually in virDomainNetDefFree
just remove the extraneous VIR_FREE here.
When originally added, the Openvswitch code wasn't present and checks
were made for non NULL prior to use.
Coverity complains that in the error paths both the < 0 condition and
the success path after the qemuDomainObjExitMonitor failure will end
up going to cleanup. So just use ignore_value in this error path to
resolve the complaint.
If the virStringSearch() returns a 0 (zero), then each of the uses
of the call will just jump to cleanup forgetting to free the returned
empty list. Expand the scope a bit of each use and free at cleanup.
https://bugzilla.redhat.com/show_bug.cgi?id=1176020
We had a check for the vcpu count total number in <numa>
before, however this check is not good enough. There are
some examples:
1. one of cpu id is out of maxvcpus, can set success(cpu count = 5 < 10):
<vcpu placement='static'>10</vcpu>
<cell id='0' cpus='0-3,100' memory='512000' unit='KiB'/>
2. use the same cpu in 2 cell, can set success(cpu count = 8 < 10):
<vcpu placement='static'>10</vcpu>
<cell id='0' cpus='0-3' memory='512000' unit='KiB'/>
<cell id='1' cpus='0-3' memory='512000' unit='KiB'/>
3. use the same cpu in 2 cell, cannot set success(cpu count = 11 > 10):
<vcpu placement='static'>10</vcpu>
<cell id='0' cpus='0-6' memory='512000' unit='KiB'/>
<cell id='1' cpus='0-3' memory='512000' unit='KiB'/>
Add a check for numa cpus, check if duplicate use one cpu in more
than one cell.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The only version that's supported in QEMU is version 2, currently.
Fortunately, it is enabled by aarch64 automatically, so there's
nothing for us that needs to be put onto command line.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some platforms, like aarch64, don't have APIC but GIC. So there's
no reason to have <apic/> feature turned on. However, we are
still missing <gic/> feature. This commit introduces the feature
to XML parser and formatter, adds documentation and updates RNG
schema.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When migrating a domain while changing its name and using
VIR_MIGRATE_PERSIST_DEST flag, libvirt would fail to properly change the
name in the persistent definition. The inconsistency results in weird
behavior when dumping domain XML, destroying the domain, restarting
libvirtd and likely in several other situations.
Since the new name is already stored in vm->def->name, we just need to
make sure the persistent definition uses this new name too.
https://bugzilla.redhat.com/show_bug.cgi?id=1076354
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Currently we try to chown any directory passed to virDirCreate,
even if the user didn't request any explicit owner/group via the
pool/vol XML.
This causes issues with qemu:///session: try to build a pool of
a root owned directory like /tmp, and it fails trying to chown the
directory to the session user. Instead it should just leave things
as they are, unless the user requests changing permissions via
the pool XML.
Similarly this is annoying if creating a storage pool via system
libvirtd of an existing directory in user $HOME, it's now owned
by root.
The virDirCreate function is pretty convoluted, since it needs to
fork off in certain specific cases. Try to document that, to make
it clear where exactly we are changing behavior.
The current code attempts to handle this, but it only catches mkdir
failing with EEXIST. However if say trying to build /tmp for an
unprivileged qemu:///session, mkdir will fail with EPERM.
Rather than catch any errors, just don't attempt mkdir if the directory
already exists.
Set the capability based on qmp query, or qemu version. The qmp query
includes vmport with 2.2, but no longer with 2.3. It lists only
non-machine specific capabilities, so check the qemu version too until a
machine-specific query is supported.
Now that we have macros for exclusive flags and flag requirements we can
use them to cleanup the code for setvcpus and error out for all wrong
flag combination.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Inspired by commit 7e437ee7 that introduced similar macros for virsh
commands so we don't have to repeat the same code all over.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Found by Laine and discussed a bit on internal IRC.
Commit id c56fe7f1d6 added support for creating a command line to support
scsi-disk.channel.
Series was here:
http://www.redhat.com/archives/libvir-list/2012-February/msg01052.html
Which pointed to a design proposal here:
http://permalink.gmane.org/gmane.comp.emulators.libvirt/50428
Which states (in part):
Libvirt should check for the QEMU "scsi-disk.channel" property. If it
is unavailable, QEMU will only support channel=lun=0 and 0<=target<=7.
However, the check added was ensuring that bus != lun *and* bus != 0. So
if bus == lun and both were non zero, we'd never make the second check.
Changing this to an *or* check fixes the check, but still is less readable
than the just checking each for 0
Since the qemu capabilities are not initialized for offline VMs the
caller might get suboptimal error message:
$ virsh blockjob VM PATH --bandwidth 1
error: unsupported configuration: block jobs not supported with this QEMU binary
Move the checks after we make sure that the VM is alive.
Just as we allow stopping filesystem pools when they were unmounted
externally, do not fail to stop an iscsi pool when someone else
closed the session externally.
Reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1171984
The phyp driver stuffed it into a DomainDefPtr during its attachdevice
routine, but the value is never advertised via capabilities so it should
be safe to drop.
Have the phyp driver use OSTYPE_LINUX, which is what it advertises via
capabilities.
In qemuMigrationDriveMirror we can start all disk mirrors in parallel.
We wait until they are all ready, or one of them aborts.
In qemuMigrationCancelDriveMirror, we wait until all mirrors are
properly stopped. This is necessary to ensure that destination VM is
fully in sync with the (paused) source VM.
If a drive mirror can not be cancelled, then the destination is not in a
consistent state. In this case it is not safe to continue with the
migration.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The !modern code path needs to call qemuBlockJobEventProcess directly.
the modern code path will call it via qemuBlockJobSyncWait.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Other threads may be blocked in qemuBlockJobSyncWait. Ensure that
they're woken up when the domain is stopped.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
qemuBlockJobSyncBegin and qemuBlockJobSyncEnd delimit a region of code
where block job events are processed "synchronously".
qemuBlockJobSyncWait and qemuBlockJobSyncWaitWithTimeout wait for an
event generated by a block job.
The Wait* functions may be called multiple times while the synchronous
block job is active. Any pending block job event will be processed by
only when Wait* or End is called. disk->blockJobStatus is reset by
these functions, so if it is needed a pointer to a
virConnectDomainEventBlockJobStatus variable should be passed as the
last argument. It is safe to pass NULL if you do not care about the
block job status.
All functions assume the VM object is locked. The Wait* functions will
unlock the object for as long as they are waiting. They will return -1
and report an error if the domain exits before an event is received.
Typical use is as follows:
virQEMUDriverPtr driver;
virDomainObjPtr vm; /* locked */
virDomainDiskDefPtr disk;
virConnectDomainEventBlockJobStatus status;
qemuBlockJobSyncBegin(disk);
... start block job ...
if (qemuBlockJobSyncWait(driver, vm, disk, &status) < 0) {
/* domain died while waiting for event */
ret = -1;
goto error;
}
... possibly start other block jobs
or wait for further events ...
qemuBlockJobSyncEnd(driver, vm, disk, NULL);
To perform other tasks periodically while waiting for an event:
virQEMUDriverPtr driver;
virDomainObjPtr vm; /* locked */
virDomainDiskDefPtr disk;
virConnectDomainEventBlockJobStatus status;
unsigned long long timeout = 500 * 1000ull; /* milliseconds */
qemuBlockJobSyncBegin(disk);
... start block job ...
do {
... do other task ...
if (qemuBlockJobSyncWaitWithTimeout(driver, vm, disk,
timeout, &status) < 0) {
/* domain died while waiting for event */
ret = -1;
goto error;
}
} while (status == -1);
qemuBlockJobSyncEnd(driver, vm, disk, NULL);
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
We will want to use synchronous block jobs from qemu_migration as well,
so split this function out into a new source file.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The documentation states that for shallow block copy the image has to
have the same guest visible content as backing file of the current
image if the file is being reused. This condition can be achieved also
with a raw file (or a qcow without a backing file) so remove the
condition that would disallow it.
(This patch additionally fixes crash described in
https://bugzilla.redhat.com/show_bug.cgi?id=1215569 )
It would be used in qemumonitorjsontest, thus we make it non-static.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Zhou Yimin <zhouyimin@huawei.com>
Trying to use qemu:///session to create a storage pool pointing at
/tmp will usually fail with something like:
$ virsh pool-start tmp
error: Failed to start pool tmp
error: cannot open volume '/tmp/systemd-private-c38cf0418d7a4734a66a8175996c384f-colord.service-kEyiTA': Permission denied
If any volume in an FS pool can't be opened by the daemon, the refresh
fails, and the pool can't be used.
This causes pain for virt-install/virt-manager though. Imaging a user
downloads a disk image to /tmp. virt-manager wants to import /tmp as
a storage pool, so we can detect what disk format it is, and set the
XML correctly. However this case will likely fail as explained above.
Change the logic here to skip volumes that fail to open. This could
conceivably cause user complaints along the lines of 'why doesn't
libvirt show $ROOT-OWNED-VOLUME-FOO', but figuring that currently
the pool won't even startup, I don't think there are any current
users that care about that case.
https://bugzilla.redhat.com/show_bug.cgi?id=1103308
If you end up with a state file for a pool that no longer starts up
or refreshes correctly, the state file is never removed and adds
noise to the logs everytime libvirtd is started.
If the initial state syncing fails, delete the statefile.
After pool startup we call refreshPool(). If that fails, we leave
a stale pool state file hanging around.
Hit this trying to create a pool with qemu:///session containing
root owned files.
If we received zero iothreads from the monitor, but were perhaps
expecting to receive something, then the code was skipping the check
to ensure what's in the monitor matches our expectations. So invert
the checks to check that what we get back matches expectations and
then check there are zero iothreads returned.
Rather than have a separate routine to parse the alias of an iothread
returned from qemu in order to get the iothread_id value, parse the alias
when returning and just return the iothread_id in qemuMonitorIOThreadInfoPtr
This set of patches removes the function, changes the "char *name" to
"unsigned int" and handles all the fallout.
Build with clang fails with:
CC conf/libvirt_conf_la-domain_conf.lo
conf/domain_conf.c:13377:9: error: variable 'cpumask' is used
uninitialized whenever 'if' condition is true
[-Werror,-Wsometimes-uninitialized]
if (!(tmp = virXMLPropString(node, "cpuset"))) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
and many other similar errors regarding the 'cpuset' variable.
Fix by explicitly initializing it with NULL.
If someone has updated a network to change its bridge name, but the
network is still active (so that bridge name hasn't taken effect yet),
we still want to disallow another network from taking that new name.
Since some people use the same naming convention as libvirt for bridge
devices they create outside the context of libvirt, it is much nicer
if we check for those devices when looking for a bridge device name to
auto-assign to a new network.
We already check that any auto-assigned bridge device name for a
virtual network (e.g. "virbr1") doesn't conflict with the bridge name
for any existing libvirt network (via virNetworkSetBridgeName() in
conf/network_conf.c).
We also want to check that the name doesn't conflict with any bridge
device created on the host system outside the control of libvirt
(history: possibly due to the ploriferation of references to libvirt's
bridge devices in HOWTO documents all around the web, it is not
uncommon for an admin to manually create a bridge in their host's
system network config and name it "virbrX"). To add such a check to
virNetworkBridgeInUse() (which is called by virNetworkSetBridgeName())
we would have to call virNetDevExists() (from util/virnetdev.c); this
function calls ioctl(SIOCGIFFLAGS), which everyone on the mailing list
agreed should not be done from an XML parsing function in the conf
directory.
To remedy that problem, this patch removes virNetworkSetBridgeName()
from conf/network_conf.c and puts an identically functioning
networkBridgeNameValidate() in network/bridge_driver.c (because it's
reasonable for the bridge driver to call virNetDevExists(), although
we don't do that yet because I wanted this patch to have as close to 0
effect on function as possible).
There are a couple of inevitable changes though:
1) We no longer check the bridge name during
virNetworkLoadConfig(). Close examination of the code shows that
this wasn't necessary anyway - the only *correct* way to get XML
into the config files is via networkDefine(), and networkDefine()
will always call networkValidate(), which previously called
virNetworkSetBridgeName() (and now calls
networkBridgeNameValidate()). This means that the only way the
bridge name can be unset during virNetworkLoadConfig() is if
someone edited the config file on disk by hand (which we explicitly
prohibit).
2) Just on the off chance that somebody *has* edited the file by hand,
rather than crashing when they try to start their malformed
network, a check for non-NULL bridge name has been added to
networkStartNetworkVirtual().
(For those wondering why I don't instead call
networkValidateBridgeName() there to set a bridge name if one
wasn't present - the problem is that during
networkStartNetworkVirtual(), the lock for the network being
started has already been acquired, but the lock for the network
list itself *has not* (because we aren't adding/removing a
network). But virNetworkBridgeInuse() iterates through *all*
networks (including this one) and locks each network as it is
checked for a duplicate entry; it is necessary to lock each network
even before checking if it is the designated "skip" network because
otherwise some other thread might acquire the list lock and delete
the very entry we're examining. In the end, permitting a setting of
the bridge name during network start would require that we lock the
entire network list during any networkStartNetwork(), which
eliminates a *lot* of parallelism that we've worked so hard to
achieve (it can make a huge difference during libvirtd startup). So
rather than try to adjust for someone playing against the rules, I
choose to instead give them the error they deserve.)
3) virNetworkAllocateBridge() (now removed) would leak any "template"
string set as the bridge name. Its replacement
networkFindUnusedBridgeName() doesn't leak the template string - it
is properly freed.
Coverity notes that the switch() used to check 'connected' values has
two DEADCODE paths (_DEFAULT & _LAST). Since 'connected' is a boolean
it can only be one or the other (CONNECTED or DISCONNECTED), so it just
seems pointless to use a switch to get "all" values. Convert to if-else
Add qemuDomainAddIOThread and qemuDomainDelIOThread in order to add or
remove an IOThread to/from the host either for live or config optoins
The implementation for the 'live' option will use the iothreadpids list
in order to make decision, while the 'config' option will use the
iothreadids list. Additionally, for deletion each may have to adjust
the iothreadpin list.
IOThreads are implemented by qmp objects, the code makes use of the existing
qemuMonitorAddObject or qemuMonitorDelObject APIs.
Signed-off-by: John Ferlan <jferlan@redhat.com>
We're about to allow IOThreads to be deleted, but an iothreadid may be
included in some domain thread sched, so add a new API to allow removing
an iothread from some entry.
Then during the writing of the threadsched data and an additional check
to determine whether the bitmap is all clear before writing it out.
With iothreadid's allowing any 'id' value for an iothread_id, the
iothreadsched code needs a slight adjustment to allow for "any"
unsigned int value in order to create the bitmap of ids that will
have scheduler adjustments. Adjusted the doc description as well.
Remove the iothreadspin array from cputune and replace with a cpumask
to be stored in the iothreadids list.
Adjust the test output because our printing goes in order of the iothreadids
list now.
Since it's only ever referenced in domain_conf.c, make the function
static, but also will need to move it to somewhere before it's referenced
rather than forward referencing it.
Add 'thread_id' to the virDomainIOThreadIDDef as a means to store the
'thread_id' as returned from the live qemu monitor data.
Remove the iothreadpids list from _qemuDomainObjPrivate and replace with
the new iothreadids 'thread_id' element.
Rather than use the default numbering scheme of 1..number of iothreads
defined for the domain, use the iothreadid's list for the iothread_id
Since iothreadids list keeps track of the iothread_id's, these are
now used in place of the many places where a for loop would "know"
that the ID was "+ 1" from the array element.
The new tests ensure usage of the <iothreadid> values for an exact number
of iothreads and the usage of a smaller number of <iothreadid> values than
iothreads that exist (and usage of the default numbering scheme).
Adding a new XML element 'iothreadids' in order to allow defining
specific IOThread ID's rather than relying on the algorithm to assign
IOThread ID's starting at 1 and incrementing to iothreads count.
This will allow future patches to be able to add new IOThreads by
a specific iothread_id and of course delete any exisiting IOThread.
Each iothreadids element will have 'n' <iothread> children elements
which will have attribute "id". The "id" will allow for definition
of any "valid" (eg > 0) iothread_id value.
On input, if any <iothreadids> <iothread>'s are provided, they will
be marked so that we only print out what we read in.
On input, if no <iothreadids> are provided, the PostParse code will
self generate a list of ID's starting at 1 and going to the number
of iothreads defined for the domain (just like the current algorithm
numbering scheme). A future patch will rework the existing algorithm
to make use of the iothreadids list.
On output, only print out the <iothreadids> if they were read in.
In a lot places we use path like this:
$(srcdir)/../src/....
when in fact it can be:
$(top_srcdir)/src/
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The lookup is just for check whether a domain we are about to add does
not already exists. Well, the virDomainObjListAdd() function does that
for us already so there's no need to duplicate the check.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
use virNetworkRouteDefFree() instead of VIR_FREE to free routes, otherwise
the element 'family' would not be freed.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
use cleanup instead of error, so that the allocated strings could also get freed
when there's no error.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
virBufferContentAndReset() doesn't free buf contents, we should use
virBufferFreeAndReset() to get buf freed.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
If a user hot-attaches the guest agent channel libvirt would ignore it
until the restart of libvirtd or shutdown/destroy and start of the VM
itself.
This patch adds code that opens or closes the guest agent connection
according to the state of the guest agent channel according to
connect/disconnect events.
To allow opening the channel from the event handler qemuConnectAgent
needed to be exported.
When the guest agent channel gets hotplugged to a VM, libvirt would
still report that "QEMU guest agent is not configured" rather than
stating that the connection was not established yet.
Currently the code won't be able to connect to the agent after hotplug
but that will change in a later patch.
As the qemuFindAgentConfig() helper is quite helpful in this case move
it to a more usable place and export it.
Rearrange code so that the local variable is always initialized and
disposed.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Jim Fehlig <jfehlig@suse.com>
Commit bf32462b missed initializing sdl.opengl. Without the
initialization, libvirtd will be terminated by an assert from libxl:
Assertion `!libxl_defbool_is_default(db)' failed.
Reported-by: Olaf Hering <olaf@aepfle.de>
If the domU configu has sdl enabled libvirtd crashes:
libvirtd[5158]: libvirtd: libxl.c:343: libxl_defbool_val:
Assertion `!libxl_defbool_is_default(db)' failed.
Initialize the relevant defbool variables in libxl_device_vfb.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Jim Fehlig <jfehlig@suse.com>
Since net->model is not defined for containers we shouldn't touch it.
In case network adapter model is defined, a warning about ignoring
it is shown.
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
Fix for such a case:
1. Domain A and B xml contain the same SRIOV net hostdev(<interface
type='hostdev' /> with same pci address).
2. virsh start A (Successfully, and configure the SRIOV net with
custom mac)
3. virsh start B (Fail because of the hostdev used by domain A or other
reason.)
In step 3, 'virHostdevNetConfigRestore' is called for the hostdev
which is still used by domain A. It makes the mac/vlan of the SRIOV net
change.
Code Change in this fix:
1. As the pci used by other domain have been removed from
'pcidevs' in previous loop, we only restore the nic config for
the hostdev still in 'pcidevs'(used by this domain)
2. update the comments to make it more clear
Signed-off-by: Huanle Han <hanxueluo@gmail.com>
Refactor some code to create a static function virHostdevIsPCINetDevice
which will detect whether the hostdev is a pci net device or not.
Signed-off-by: Huanle Han <hanxueluo@gmail.com>
virDomainGetJobStats is able to report statistics of a completed
migration, however to get usable downtime and total time statistics both
hosts have to keep synchronized time. To provide at least some
estimation of the times even when NTP daemons are not running on both
hosts we can just ignore the time needed to transfer a migration cookie
to the destination host. The result will be also inaccurate but a bit
more predictable. The total/down time will just be at least what we
report.
https://bugzilla.redhat.com/show_bug.cgi?id=1213434
Commit 1268820a removed obsolete index() function and replaced it by
strchr. Few versions of gcc has a bug and reports a warning about
strchr:
../../src/util/virstring.c:1006: error: logical '&&' with non-zero
constant will always evaluate as true [-Wlogical-op]
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
ListFindByID() still requires to step through items in the hash table
(in the worst case scenario through all of them), lock each one and
compare whether we've found what we're looking for. This is suboptimal
as locking a domain object means we need to wait for the current API
running over the object to finish.
Unfortunately, we can't drop the function completely because we have
this public API virDomainLookupByID which we can't drop.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This hash table will contain the same data as already existing one.
The only difference is that while the first table uses domain uuid as
key, the new table uses domain name. This will allow much faster (and
lockless) lookups by domain name.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Every domain that grabs a domain object to work over should
reference it to make sure it won't disappear meanwhile.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is basically turning qemuDomObjEndAPI into a more general
function. Other drivers which gets a reference to domain objects may
benefit from this function too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since we haven't implemented balloon parameters tuning
we can just return amount of memory in this function.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Commit 2a530a3e5 is not portable to mingw, which intentionally
avoids declaring the obsolete index(). See also:
https://bugzilla.redhat.com/show_bug.cgi?id=1214605
* src/util/virstring.c (virStringStripControlChars): Use strchr.
Signed-off-by: Eric Blake <eblake@redhat.com>
CT stands for containers, i.e. def->os.type should be compared with VIR_DOMAIN_OSTYPE_EXE
rather than VIR_DOMAIN_OSTYPE_HVM
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
Instead of each API copying the same lines of code, lets use the
generic function designed just for that purpose.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Instead of each API copying the same lines of code, lets use the
generic function designed just for that purpose. At the same time,
drop useless connection object locking in some functions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This function is practically copied over from qemu driver. Its
only purpose in life is to lookup a domain object and print an
error if no object is found.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The pointer does not change throughout the while life of a
parallels connection. Mark it as such.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
just as what b8e25c35d7 did, we
fall back to the ACPI method when the guest agent is unresponsive
in qemuDomainReboot().
Signed-off-by: YueWenyuan <yuewenyuan@huawei.com>
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Some hypervisors like Xen do not have PIDs associated with domains.
Relax the requirement for PID != 0 in the locking code so it can
be used by hypervisors that do not represent domains as a process
running on the host.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
When running on FreeBSD, there's a bug in virCommandProcessIO
polling that is triggered by the commandtest.
A test that triggers EPIPE in commandtest (named "test20") hungs
forever on FreeBSD.
Apparently, this happens because FreeBSD sets POLLHUP flag on revents
when stdin in closed. And as the current implementation only checks for
POLLOUT and POLLERR, it ends up looping forever inside
virCommandProcessIO and not trying to do one more write() that would
trigger EPIPE.
To fix that check for the POLLHUP flag along with POLLOUT and POLLERR.
When a user would specify a backing chain index that is above the start
point libvirt would report a rather unhelpful error:
invalid argument: could not find backing store 1 in chain for 'sub/link2'
This patch adds an explicit check that the index is below start point in
the backing store and reports the following error if not:
invalid argument: requested backing store index 1 is above 'sub/../qcow2' in chain for 'sub/link2'
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1177062
Some storage protocols allow to have the @path field in struct
virStorageSource set to NULL. Add NULLSTR() wrappers to handle this
possibility until I finish the storage source error formatter.
Build fails on non-Linux systems with this error:
CC util/libvirt_util_la-virnetdev.lo
util/virnetdev.c:364:1: error: unused function 'virNetDevReplaceMacAddress' [-Werror,-Wunused-function]
virNetDevReplaceMacAddress(const char *linkdev,
^
util/virnetdev.c:406:1: error: unused function 'virNetDevRestoreMacAddress' [-Werror,-Wunused-function]
virNetDevRestoreMacAddress(const char *linkdev,
^
2 errors generated.
The virNetDev{Restore,Replace}MacAddress() functions are only used
by VF-related routines that are available on Linux only. So move these
functions under the same #ifdef.
Because packets going through the egress from a bridge (where our
bandwidth limiting takes place) have no information about which
interface they came from, the QoS rules that we create instead
use the source MAC address of the packets to make their decisions
about which QDisc the packet should be in.
One flaw in this is that when a guest changed the MAC address it
used, packets from the guest would no longer be put into the
correct QDisc, but would instead be put in an "unprivileged"
class, resulting in the bandwidth "floor" (minimum guaranteed)
being no longer honored.
Now that libvirt has infrastructure to capture and respond to
RX_FILTER_CHANGE events from qemu (sent whenever a guest
interface modifies its MAC address, among other things), we can
notice when a guest MAC address changes, and update the QoS rules
accordingly, so that bandwidth floor is honored even after a
guest MAC address change.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In one of my previous patches (b68a56bcfe) I made class_id to
format more frequently. Well, now it's formatting way too
frequent - even for regular active XML. Users don't need to see
it, so lets format it only for the status XML where it's really
needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemuDomainSetMemoryFlags() would allow to set the initial memory greater
than the <maxMemory> field. While the configuration would not work as
memory hotplug requires NUMA to be enabled and the
qemuDomainSetMemoryFlags() API does not work on NUMA guests this just
fixes a corner case.
The fix is still worth though as it allows to induce an invalid
configuration and make the VM vanish on libvirt restart.
Additionally this tweaks error message to be more accurate.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Introduce libxl.conf configuration file, adding the 'autoballoon'
setting as the first knob for controlling the libxl driver.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
A further fix for:
https://bugzilla.redhat.com/show_bug.cgi?id=1113474
Since there is no possibility that any type of macvtap will work if
the parent physdev it's attached to is offline, we should bring the
physdev online at the same time as the macvtap. When taking the
macvtap offline, it's also necessary to take the physdev offline for
macvtap passthrough mode (because the physdev has the same MAC address
as the macvtap device, so could potentially cause problems with
misdirected packets during migration, as outlined in commits 829770
and 879c13). We can't set the physdev offline for other macvtap modes
1) because there may be other macvtap devices attached to the same
physdev (and/or the host itself may be using the device) in the other
modes whereas passthrough mode is exclusive to one macvtap at a time,
and 2) there's no practical reason to do so anyway.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1113474
When we set the MAC address of a network device as a part of setting
up macvtap "passthrough" mode (where the domain has an emulated netdev
connected to a host macvtap device that has exclusive use of the
physical device, and sets the device MAC address to match its own,
i.e. "<interface type='direct'> <source mode='passthrough' .../>"), we
use ioctl(SIOCSIFHWADDR) giving it the name of that device. This is
true even if it is an SRIOV Virtual Function (VF).
But, when we are setting the MAC address / vlan ID of a VF in
preparation for "hostdev network" passthrough (this is where we set
the MAC address and vlan id of the VF after detaching the host net
driver and before assigning the device to the domain with PCI
passthrough, i.e. "<interface type='hostdev'>", we do the setting via
a netlink RTM_SETLINK message for that VF's Physical Function (PF),
telling it the VF# we want to change. This sets an "administratively
changed MAC" flag for that VF in the PF's driver, and from that point
on (until the PF driver is reloaded, *not* merely the VF driver) that
VF's MAC address can't be changed using ioctl(SIOCSIFHWADDR) - the
only way to change it is via the PF with RTM_SETLINK.
This means that if a VF is used for hostdev passthrough, it will have
the admin flag set, and future attempts to use that VF for macvtap
passthrough will fail.
The solution to this problem is to check if the device being used for
macvtap passthrough is actually a VF; if so, we use the netlink
RTM_SETLINK message to the PF to set the VF's mac address instead of
ioctl(SIOCSIFHWADDR) directly to the VF; if not, behavior does not
change from previously.
There are three pieces to making this work:
1) virNetDevMacVLan(Create|Delete)WithVPortProfile() now call
virNetDev(Replace|Restore)NetConfig() rather than
virNetDev(Replace|Restore)MacAddress() (simply passing -1 for VF#
and vlanid).
2) virNetDev(Replace|Restore)NetConfig() check to see if the device is
a VF. If so, they find the PF's name and VF#, allowing them to call
virNetDev(Replace|Restore)VfConfig().
3) To prevent mixups when detaching a macvtap passthrough device that
had been attached while running an older version of libvirt,
virNetDevRestoreVfConfig() is potentially given the preserved name
of the VF, and if the proper statefile for a VF can't be found in
the stateDir (${stateDir}/${pfname}_vf${vfid}),
virNetDevRestoreMacAddress() is called instead (which will look in
the file named ${stateDir}/${vfname}).
This problem has existed in every version of libvirt that has both
macvtap passthrough and interface type='hostdev'. Fortunately people
seem to use one or the other though, so it hasn't caused any real
world problem reports.
- Remove all qemu emulators
- Restart libvirtd
- Install qemu emulators
- Call 'virsh version' -> errors
The only thing that will force the qemu driver to refresh it's cached
capablities info is an explict API call to GetCapabilities.
However in the case when the initial caps lookup at driver connect didn't
find a single qemu emulator to poll, the driver is effectively useless
and really can't do anything until it's populated some qemu capabilities
info.
With the above steps, the user would have to either know about the
magic refresh capabilities call, or restart libvirtd to pick up the
changes.
Instead, this patch changes things so that every time a part of th
driver requests access to capabilities info, check to see if
we've previously seen any emulators. If not, force a refresh.
In the case of 'still no emulators found', this is still very quick, so
I can't think of a downside.
https://bugzilla.redhat.com/show_bug.cgi?id=1000116
https://bugzilla.redhat.com/show_bug.cgi?id=1171933
Adjust the processLU error returns to be a bit more logical. Currently,
the calling code cannot determine the difference between a non disk/lun
volume and a processed/found disk/lun. It can also not differentiate
between perhaps real/fatal error and one that won't necessarily stop
the code from finding other volumes.
After this patch virStorageBackendSCSIFindLUsInternal will stop processing
as soon as a "fatal" message occurs rather than continuting processing
for no apparent reason. It will also only set the *found value when
at least one of the processLU's was successful.
With the failed return, if the reason for the stop was that the pool
target path did not exist, was /dev, was /dev/, or did not start with
/dev, then iSCSI pool startup and refresh will fail.
Rather than passing/returning a pointer to a boolean to indicate that
perhaps we should try again - adjust the return of the call to return
the count of LU's found during processing, then let the caller decide
what to do with that value.
Use virStorageBackendPoolUseDevPath API to determine whether creation of
stable target path is possible for the volume.
This will differentiate a failed virStorageBackendStablePath which won't
need to be fatal. Thus, we'll add a -2 return value to differentiate that
the failure was a result of either the inability to find the symlink for
the device or failure to open the target path directory
For virStorageBackendStablePath, in order to make decisions in other code
split out the checks regarding whether the pool's target is empty, using /dev,
using /dev/, or doesn't start with /dev
Commit 70f446631f (from 2008) introduced
some functions for testing whether xend was returning correct sound
models. Those functions have long gone, but the function prototypes
remain. This commit removes the unused prototypes.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
This needs to specified in way too many places for a simple validation
check. The ostype/arch/virttype validation checks later in
DomainDefParseXML should catch most of the cases that this was covering.
This revealed that GuestDefaultEmulator was a bit buggy, capable
of returning an emulator that didn't match the passed domain type. Fix
up the test suite input to continue to pass.
This is a helper function to look up all capabilities data for all
the OS bits that are relevant to <domain>. This is
- os type
- arch
- domain type
- emulator
- machine type
This will be used to replace several functions in later commits.
But the internal API stays the same, and we just convert the value as
needed. Not useful yet, but this is the beginning step of using an enum
for ostype throughout the code.
When parsing XML, we validate the passed ostype + arch combo against
the detected hypervisor capabilities. This has led to the following
problem:
- Define x86 qemu guest
- qemu is inadvertently removed from the host
- libvirtd is restarted. fails to parse VM config since arch is removed
- 'virsh list --all' is now empty, user is wondering where their VMs went
Add a new internal flag VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS. Use
it when loading VM and snapshot configs from disk.
https://bugzilla.redhat.com/show_bug.cgi?id=1043572
If no <os><type> was specified:
before: unknown OS type no OS type
after : xml error: an os <type> must be specified
If an <os><type> is specified that's not in our capabiliities data:
before: unknown OS type: $type
after : unsupported configuration: no support found for os <type> '$type'
VIR_ERR_OS_TYPE is now unused (as it should be frankly) so drop its strings
as well to save our translators some effort.
In Parallels we do not support device name hints
aka <target dev=../> option and full-fledged device
disk device addressing through
<address type=.. controller=.. bus=.. target=.. unit=../>
and have only one index instead.
In this situation to be consistent we can only take
one-to-one mapping from some reasonable subset
of full address. Values outside this subset are
invalid to create Parallels VMs.
Reasonable mapping is default one defined in virDomainDiskDefAssignAddress.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@parallels.com>
We should return VIR_DRV_OPEN_ERROR in case
if we handle scheme in query but some
error occur. Previously we sometimes
return VIR_DRV_OPEN_DECLINE.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@parallels.com>
# virsh -c lxc:/// start helloworld
error: Failed to start domain helloworld
error: internal error: guest failed to start: Unknown
failure in libvirt_lxc startup
Return success when there are no cpuset.mems to be set,
instead of failing without setting an error.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
# virsh -c lxc:/// start helloworld
error: Failed to start domain helloworld
error: internal error: guest failed to start: Invalid value '1-3'
for 'cpuset.mems': Invalid argument
Free the cpu mask to avoid reusing it as a mem mask
in virCgroupSetCpusetMems
if virDomainNumatuneMaybeFormatNodeset does not set a mask.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1209948
So we have this bug. The virConnectGetDomainCapabilities() API
performs a couple of checks before it produces any result. One of
the checks is if the architecture requested by user can be run by
the binary (again user provided). However, the check is pretty
dumb. It merely compares if the default binary architecture
matches the one provided by user. However, a qemu binary can run
multiple architectures. For instance: qemu-system-ppc64 can run:
ppc, ppcle, ppc64, ppc64le and ppcemb. The default is ppc64, so
if user requested something else, like ppc64le, the check would
have failed without obvious reason.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When a qemu domain is to be rebooted, from outside, at libvirt
level it looks like regular shutdown. To really restart the
domain, libvirt needs to issue reset command on the monitor once
SHUTDOWN event appeared. So, in order to differentiate bare
shutdown and reboot libvirt uses a variable within domain private
data. It's called fakeReboot. When the reboot API is called, the
variable is set, but when the shutdown API is called it must be
cleared out. But it was not for every possible case. So if user
called virDomainReboot(), and there was no ACPI daemon running
inside the guest (so guest didn't initiated shutdown sequence)
and then virDomainShutdown(mode=agent) was called bad thing
happened. We remembered the fakeReboot and instead of shutting
the domain down, we just rebooted it.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is a simple wrapper around virNetDevBandwidthManipulateFilter() that
will update the desired filter on an interface (usually a network bridge)
with a new MAC address. Although, the MAC address in question usually
refers to some other interface - the one that the filter is constructed
for. Yeah, hard to parse. Thing is, our NATed network has a bridge where
some part of QoS takes place. And vNICs from guests are plugged into
the bridge. However, if a guest decides to change the MAC of its vNIC,
the corresponding qemu process emits an event which we can use to
update the QoS configuration based on the new MAC address.. However,
our QoS hierarchy is currently not notified, therefore it falls apart.
This function (when called in response to the aforementioned event)
will update our QoS hierarchy and duct tape it together again.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Not only this simplifies the code a bit, it prepares the
environment for upcoming patches. The new
virNetDevBandwidthManipulateFilter() function is capable of both
removing a filter and adding a new one. At the same time! Yeah,
this is not currently used anywhere but look at the next commit
where you'll see it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently, when constructing traffic shaping rules, the ingress
filter is created without any priority specified on the command
line. This makes kernel to make up one. While this works, it
simplifies things a bit if we provide the filter priority. In
this case, since it's the root filter lets give it the highest
priority of number 1.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
After a360912179 the formatting of virDomainActualNetDefPtr was
changed a bit. However, during the function rewrite, iface's class_id
is not formatted as frequently as it could be. In fact, after rewrite
it's formatted only for iface of type VIR_DOMAIN_NET_TYPE_DIRECT where
it makes no sense and is unused. While where needed (_TYPE_NETWORK) is
not formatted at all. This makes the daemon forget it upon daemon
restart resulting in bad behaviour.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1211436
This reverts commit b7829f959b.
The previous fix was not correct. Like everywhere else, a driver is a
global variable allocated in stateInitialize function (or something
similar for stateless drivers). Later, when a driver API is called,
it's possible that the global variable is accessed and dereferenced.
Now, some drivers require root privileges because they undertake some
actions reserved only for the system admin (e.g. manipulating host
firewall). And here's the trouble, the NWFilter state initializer
exited too early when finding out it's running unprivileged, leaving
the global NWFilter driver variable uninitialized. Any subsequent
API call that tried to lock the driver resulted in dereferencing the
driver and thus crash.
On the other hand, in order to not resurrect the bug the original
commit was fixing, Let's forbid the nwfilter define in session mode.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Conflicts:
src/nwfilter/nwfilter_driver.c: Context. Code changed a bit
since 2013.
There is a possibility that we jump onto error label with @lockpath
still initialized to NULL. Here, the @lockpath should be unlink()-ed,
but passing there a NULL is not a good idea. Don't do that. In fact,
we should call unlink() only if we created the lock file successfully.
Reported-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The 802.11 interfaces can not be moved by themselves, their Phy has to move too.
If there are other interfaces, they have to move too -- hopefully it's not too
confusing. This is a less-invasive alternative to defining a new hostdev type
for PHYs.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
A destroy operation can take considerable time on large memory
domains due to scrubbing the domain's memory. Unlock the
virDomainObj while libxl_domain_destroy is executing.
Implement libxlDomainDestroyInternal wrapper to handle unlocking,
calling destroy, and locking. Change all callers of
libxl_domain_destroy to use the wrapper.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
A job should be acquired at the beginning of a domain destroy operation,
not at the end when cleaning up the domain. Fix two occurrences of this
late job acquisition in the libxl driver. Doing so renders
libxlDomainCleanupJob unused, so it is removed.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Let callers of libxlDomainStart decide when it is appropriate to
acquire a job on the associated virDomainObj.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Add support for HVM direct kernel boot in libxl. Also add a
test to verify domXML <-> native conversions.
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
In xl config, hvmloader is implied for hvm guests. It is not
specified with the "kernel" option like xm config. The "kernel"
option, along with "ramdisk" and "extra", is used for HVM direct
kernel boot. Instead of using "kernel" option to populate
virDomainDef object's os.loader->path, use hvmloader discovered
when gathering capabilities.
This change required fixing initialization of capabilities in
the test utils and removing 'kernel = "/usr/lib/xen/boot/hvmloader"'
from the test config files.
xl and xm differ a bit in how <os> configuration is represented.
E.g. xl config supports <os><nvram .../></os> via its "bios"
setting.
Move the xenParseOS and xenFormatOS functions from xen_common.c
and copy to xen_xl.c and xen_xm.c so they can be customized for
xm vs xl config. An unfortunate fallout is reordering of entries
in the test config files.
device_model is parsed in xenParseOS(), then later in
xenParseConfigCommon(). <emulator> is not part of <os>,
so makes sense to remove the parsing in xenParseOS().
On rhel-6 is broken gcc that reports this warning:
util/virbuffer.c:500: error: logical '&&' with non-zero constant will
always evaluate as true [-Wlogical-op]
Move the pragma directive before function virBufferEscapeString because
since commit aeb5262e this function uses 'strchr' too.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
ts.tv_nsec was off by a factor of 1000, making timeouts less than a
second in the future often expiring immediately.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Among all the monitor APIs some where checking if mon is NULL and some
were not. Since it's possible to have mon equal to NULL in case a second
call is attempted once entered the monitor. This requires that every
single API checks for the monitor.
This patch adds a macro that helps checking the state of the monitor and
either refactors existing checking code to use the macro or adds it in
case it was missing.
Rather than erroring out make the best attempt to retrieve other data if
disks are inaccessible or missing. The failure will still be logged
though.
Since the bulk stats API is called on multiple domains an error like
this makes the API unusable. This regression was introduced by commit
596a137134
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1209394
The comment is describing arguments passed to the function.
However, there's no @ifmac argument. In 955af4d4 it was replaced
with @ifmac_ptr. Unfortunately, the comment wasn't updated.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add virStringHasControlChars that checks if the string has
any control characters other than \t\r\n,
and virStringStripControlChars that removes them in-place.
Throughout the code, we have several places need to construct a path
somewhere in /sys/class/net/... They are not consistent and nearly
each code piece invents its own way how to do it. So unify this by:
1) use virNetDevSysfsFile() wherever possible
2) At least use common macro SYSFS_NET_DIR declared in virnetdev.h at
the rest of places which can't go with 1)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If a virAsprintf() within the function fails, we call VIR_FREE()
over @rundir variable and jump onto cleanup label, where it is
freed again. It doesn't hurt, but not make much sense too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit f6563bc3 introduced HMP impl of the function (so that a different
uglier function could be removed). Before the HMP code is called there's
a leftover check that the monitor is JSON which inhibits the code from
working.
https://bugzilla.redhat.com/show_bug.cgi?id=1200149
Even though we have a mutex mechanism so that two clients don't spawn
two daemons, it's not strong enough. It can happen that while one
client is spawning the daemon, the other one fails to connect.
Basically two possible errors can happen:
error: Failed to connect socket to '/home/mprivozn/.cache/libvirt/libvirt-sock': Connection refused
or:
error: Failed to connect socket to '/home/mprivozn/.cache/libvirt/libvirt-sock': No such file or directory
The problem in both cases is, the daemon is only starting up, while we
are trying to connect (and fail). We should postpone the connecting
phase until the daemon is started (by the other thread that is
spawning it). In order to do that, create a file lock 'libvirt-lock'
in the directory where session daemon would create its socket. So even
when called from multiple processes, spawning a daemon will serialize
on the file lock. So only the first to come will spawn the daemon.
Tested-by: Richard W. M. Jones <rjones@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Two non-static functions in virjson.c were missing their export info in
libvirt_private.syms, so they couldn't be used anywhere it the code (and
that's about to get changed).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Luckily we are allocating structs as clean memory and
PTHREAD_MUTEX_INITIALIZER is "{ 0 }", so nothing happened, but it should
still be created as lockable object.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Check the proposed pool source host XML definition against existing gluster
pools to ensure the incoming definition doesn't use the same source dir and
soure host XML definition as an existing pool.
Check the proposed pool source host XML definition against existing sheepdog
pools to ensure the incoming definition doesn't use the same source host XML
definition as an existing pool.
Rather than have duplicate code doing the same check, have the netfs
matching processing code use the new virStoragePoolSourceMatchSingleHost.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create a separate iSCSI Source matching subroutine. Makes the calling
code a bit cleaner as well as sets up for future patches which need to
do better source hosts[0].name processing/checking.
As part of the effort the logic will be inverted from a multi-level
if statement to a series of single level checks for better readability
and further separation
Signed-off-by: John Ferlan <jferlan@redhat.com>
When acquiring resource via sanlock fails, we would report it as
VIR_ERR_INTERNAL_ERROR, which is not very friendly to applications using
libvirt. Moreover, the lockd driver would report the same failure as
VIR_ERR_RESOURCE_BUSY, which looks better.
Unfortunately, in sanlock driver we don't really know if acquiring the
resource failed because it was already locked or there was another
reason behind. But the end result is the same and I think using
VIR_ERR_RESOURCE_BUSY reason for all acquire failures is still better
than what we have now.
https://bugzilla.redhat.com/show_bug.cgi?id=1165119
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Commit 49ed6cff is broken on mingw and other non-linux platforms:
CCLD libvirt.la
Cannot export virNetDevSysfsFile: symbol not defined
collect2: error: ld returned 1 exit status
* src/util/virnetdev.c: Provide virNetDevSysfsFile fallback.
Signed-off-by: Eric Blake <eblake@redhat.com>
Found by ./autobuild.sh during a mingw cross-compile:
Commit 8a96e87 was not innocuous - glibc happens to leak the
definition of time() through other headers, so that even without
<sys/select.h>, virrandom.c compiled just fine. But on mingw,
we were not so lucky; <sys/select.h> was important for its side
effect of dragging in <time.h>, and we now have nothing providing
the declaration of time():
../../src/util/virrandom.c: In function 'virRandomOnceInit':
../../src/util/virrandom.c:65:5: error: implicit declaration of function 'time' [-Werror=implicit-function-declaration]
unsigned int seed = time(NULL) ^ getpid();
^
../../src/util/virrandom.c:65:5: error: nested extern declaration of 'time' [-Werror=nested-externs]
Signed-off-by: Eric Blake <eblake@redhat.com>
Changing the prototype to not have "int *index" since we'll soon be
disallowing index as a name. Curiously the original commit (a4504ac)
for the function used 'int idx' in the function - so they didn't match.
Now they do.
It is there even with -nodefaults and -no-user-config, so count with
that so we can start sparc domains.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
The variable 'last_processed_hostdev_vf' indicates index of the last
successfully configed vf. When resetvfnetconfig because of failure,
hostdevs[last_processed_hostdev_vf] should also be reset.
Signed-off-by: Huanle Han <hanxueluo@gmail.com>
1. 'last_good_net' indicates the index of last successfully configured
net. so def->nets[last_good_net] should also be clean up if error occurs.
2. if error occurs in 'virNetDevMacVLanVPortProfileRegisterCallback'
(second 'goto err_exit' in loop), we should also do
'virNetDevVPortProfileDisassociate' cleanup for the
'virNetDevVPortProfileAssociate'(first code block in loop). So we should
consider the net is successfully configured after first code block in
loop finishes.
Signed-off-by: Huanle Han <hanxueluo@gmail.com>
After set memory parameters for running domain, save the change to live
xml is needed otherwise it will disappear after restart libvirtd.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1211548
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Apparently for Xen-devel 'index' is a global and causes a build failure,
so just use the shortened 'idx' instead to avoid the conflict.
Signed-off-by: John Ferlan <jferlan@redhat.com>
QEMU does not abandon the mirror. The job carries on in the synchronised
phase and it might be either pivoted again or cancelled. The commit
hints that the described behavior was happening in a downstream version.
If the command returns false there are two possible options:
1) qemu did not reach the point where it would ask the block job to
pivot
2) pivotting failed in the actual qemu coroutine
If either of those would happen we return failure and reset the
condition that waits for the block job to complete. This makes the API
fail but in case where qemu would actually abandon the mirror the fact
is notified via the event and handled asynchronously.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1202704
qemuDomainBlockJobImpl become an unmaintainable mess over the years of
adding new stuff to it. This patch starts splitting up individual
functions from it until it can be killed entirely.
In bulk this will add lines of code rather than delete them but it will
be traded for maintainability.
My intention is to split qemuMonitorJSONBlockJob() into simpler separate
functions for every block job type. Since the error handling code is the
same for all block jobs, this patch extracts the code into a separate
function that will later be reused in more places.
With the new helper qemuMonitorJSONErrorIsClass we can save a few
function calls as we can extract the error object once.
Split out the function that checks the actual error class string into a
separate helper as it will be useful later and refactor
qemuMonitorJSONHasError to return bool type and remove few useless
checks.
Basically virJSONValueObjectHasKey are useless here since the next call
to virJSONValueObjectGet is checking the return value again (which can't
fail at that point). By removing the first check we save a function
call.
Previously we checked that the vcpu we are trying to set is in range of
the number of threads presented by qemu. The problem is that if the VM
is offline the count is 0. Since the condition subtracted 1 from the
count the number would overflow and the check would never trigger.
Change the condition for more sensible ones with specific error
messages.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1208434
Refactor the code to parse the vcpupin in a similar way the iothreadpin
code is now structured. This allows to get rid of some very strange
conditions and error messages.
Additionally since a existing bug
( https://bugzilla.redhat.com/show_bug.cgi?id=1208434 ) allows to add
vcpupin definitions for vcpus that don't exist, this patch makes the
parser to ignore all vcpupins that don't have a matching vCPU in the
definition rather than just offlined ones.
Defining a domain with the following config:
<domain ...>
...
<iothreads>1</iothreads>
<cputune>
<iothreadpin cpuset='1'/>
will result in the following config formatted back:
<domain type='kvm'>
...
<iothreads>1</iothreads>
<cputune>
<iothreadpin iothread='0' cpuset='1'/>
After restart the VM would vanish. Since our schema requires the
@iothread field to be present in <iothreadpin> make it required by the
code too.
When creating a snapshot with _REUSE_EXTERNAL when the pre-created image
does not directly link to the current active layer libvirt would
re-detect the backing chain incorrectly and it would not match with
qemu's view. Since the configuration is an operator mistake, document
that only the top layer image gets inserted.
Parallels Cloud Server supports block devices and virtual NIC
live attachment. I implemented that function for block devices so
OpenStack volume attachment is now works.
Signed-off-by: Alexander Burluka <aburluka@parallels.com>
OpenStack needs disk serial number setup because
nova boot --block-device-mapping command generates that param in
libvirt xml. I took QEMU libvirt driver behavior as a base.
QEMU driver skips inability to set serial and continues work.
So Parallels driver will ignore this param too and let domain
boot.
This patch adds checks for empty bitmaps right after the calls of
virBitmapParse. These only include spots where set API's are called and
where domain's XML is parsed.
Also, it partially reverts commit 983f5a which added a check for
invalid nodeset "0,^0" into virBitmapParse function. This change broke
the logic, as an empty bitmap should not cause an error.
https://bugzilla.redhat.com/show_bug.cgi?id=1210545
On arm, we probe for virtio-*-pci devices, but use their
virtio-*-device variants.
Set the capabilities based on the -device variants as well,
to make them work with qemus with the PCI devices compiled out.
When pre-creating storage for domains, we need to find corresponding
disk in the XML on the destination (domain XML may differ there, e.g.
disk is accessible under different path). For better debugging, I'm
printing all info I received on a disk. But there was a typo when
printing the disk capacity: "%lluu" instead of "%llu".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The problem with the previous implementation is,
even when qemuMigrationUpdateJobStatus() detects a migration job
has completed, it will do a sleep for 50 ms (which is unnecessary
and only adds up to the VM pause time).
Signed-off-by: Xing Lin <xinglin@cs.utah.edu>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Future IOThread setting patches would copy the code anyway, so create
and generalize the adding of pindef for the vcpu and the pinning of the
thread into their own APIs.
Add static virNetDevGetifaddrsAddress to attempt to get the interface
IP address. If getifaddrs is not supported, fall back to
virNetDevGetIPv4AddressIoctl to get the IP address.
This allows IPv6 addresses to be used for <listen type='network>
with device-backed networks.
https://bugzilla.redhat.com/show_bug.cgi?id=1192318
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Rename it to virNetDevGetIPv4AddressIoctl and make
virNetDevGetIPAddress a wrapper around it, allowing
other ways of getting the address to be implemented,
and still falling back to the old method.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
PCS doesn't store domain config in managed save state file.
It's forbidden to change config for VMs in this state.
It's possible to change config for containers, but after
restoring domain will have that new config, not a config,
which domain had at the moment of virDomainManagedSave.
So we need to handle this case differently from other states.
Let's forbid this operation, if config is changed and if it's
not changed - just do nothing.
Openstack/nova calls virDomainDefineXML on resume with
current domain config, so we can't forbid this operation
in managed save state.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
We should add input devices with proper bus,
not VIR_DOMAIN_INPUT_BUS_XEN.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Handle input devices in virDomainDefParseXML properly
in case of parallels containers and VMs.
Parallels containers support only
VIR_DOMAIN_INPUT_BUS_PARALLELS. And if VNC is enabled
we should add implicit mouse and keyboard.
For VMs we should add implicit PS/2 mouse and
keyboard.
BTW, is it worth to refactor code and move
all this code to drivers, to *DomainDefPostParse
functions?
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add VIR_DOMAIN_INPUT_BUS_PARALLELS device type
to handle domain configuration properly for
parallels containers, when VNC is enabled.
When domain configuration has at least one
'graphics', there should be mouse and keyboard.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Fix function virDomainVideoDefaultType for
parallels VMs and containers. It should return
VGA for VMs and VIR_DOMAIN_VIDEO_TYPE_PARALLELS
for containers.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
We support VNC for containers to have the same
interface with VMs. At this moment it just renders
linux text console.
Of course we don't pass any physical devices and
don't emulate virtual devices. Our VNC server
renders text from terminal master and sends
input events from VNC client to terminal.
So add special video type VIR_DOMAIN_VIDEO_TYPE_PARALLELS
for these pseudo-devices.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Network adapter model has no sense for container,
so we shouldn't set it to e1000 in
parallelsDomainDeviceDefPostParse.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
We handle this parameter for VMs while defining
domains, so let's get this property from PCS and
set corresponding field of virDomainNetDef in
prlsdkLoadDomains function.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Call virDomainDefAddImplicitControllers to add disk
controllers, so virDomainDef, filled by this function
will look exactly like the one returned by virDomainDefParseString.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Implement virDomainManagedSave api function. In PCS
this feature called "suspend". You can suspend VM or
CT while it is in running or paused state. And after
resuming (or starting) it will have the same state, as
before suspend.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Split function prlsdkDomainChangeState into
prlsdkDomainChangeStateLocked and prlsdkDomainChangeState.
So it can be used from places, where virDomainObj already
found and locked.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Return value of functions prlsdkStart/Kill/Stop e.t.c.
is PRL_RESULT in parallels_sdk.c and int in parallels_sdk.h.
PRL_RESULT is int, so compiler didn't report errors.
Let's fix the difference.
Signed-off-by: Dmitry Guryanov <dguryanov@parallels.com>
Future IOThread setting patches would copy the code anyway, so create
and generalize a delete cgroup and pindef for the vcpu into its own API.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Future IOThread setting patches would copy the code anyway, so create
and generalize the add the vcpu to a cgroup into its own API.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Create a new common API to replace the virCgroupNew{Vcpu|Emulator|IOThread}
API's using an emum to generate the cgroup name
Signed-off-by: John Ferlan <jferlan@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1206521
If the backend driver updates the pool available and/or allocation values,
then the storage_driver VolCreateXML, VolCreateXMLFrom, and VolDelete APIs
should not change the value; otherwise, it will appear as if the values
were "doubled" for each change. Additionally since unsigned arithmetic will
be used depending on the size and operation, either or both values could be
appear to be much larger than they should be (in the EiB range).
Currently only the disk pool updates the values, but other pools could.
Assume a "fresh" disk pool of 500 MiB using /dev/sde:
$ virsh pool-info disk-pool
...
Capacity: 509.88 MiB
Allocation: 0.00 B
Available: 509.84 MiB
$ virsh vol-create-as disk-pool sde1 --capacity 300M
$ virsh pool-info disk-pool
...
Capacity: 509.88 MiB
Allocation: 600.47 MiB
Available: 16.00 EiB
Following assumes disk backend updated to refresh the disk pool at deletion
of primary partition as well as extended partition:
$ virsh vol-delete --pool disk-pool sde1
Vol sde1 deleted
$ virsh pool-info disk-pool
...
Capacity: 509.88 MiB
Allocation: 9.73 EiB
Available: 6.27 EiB
This patch will check if the backend updated the pool values and honor that
update.
Commit id '471e1c4e' only considered updating the pool if the extended
partition was removed. As it turns out removing a primary partition
would also need to update the freeExtent list otherwise the following
sequence would fail (assuming a "fresh" disk pool for /dev/sde of 500M):
$ virsh pool-info disk-pool
...
Capacity: 509.88 MiB
Allocation: 0.00 B
Available: 509.84 MiB
$ virsh vol-create-as disk-pool sde1 --capacity 300M
$ virsh vol-delete --pool disk-pool sde1
$ virsh vol-create-as disk-pool sde1 --capacity 300M
error: Failed to create vol sde1
error: internal error: no large enough free extent
$
This patch will refresh the pool, rereading the partitions, and
return
https://bugzilla.redhat.com/show_bug.cgi?id=1073305
When creating a volume in a pool, the creation allows the 'capacity'
value to be larger than the available space in the pool. As long as
the 'allocation' value will fit in the space, the volume will be created.
However, resizing the volume checks were made with the new absolute
capacity value against existing capacity + the available space without
regard for whether the new absolute capacity was actually allocating
space or not. For example, a pool with 75G of available space creates
a volume of 10G using a capacity of 100G and allocation of 10G will succeed;
however, if the allocation used a capacity of 10G instead and then tried
to resize the allocation to 100G the code would fail to allow the backend
to try the resize.
Furthermore, when updating the pool "available" and "allocation" values,
the resize code would just "blindly" adjust them regardless of whether
space was "allocated" or just "capacity" was being adjusted. This left
a scenario whereby a resize to 100G would fail; however, a resize to 50G
followed by one to 100G would both succeed. Again, neither was adjusting
the allocation value, just the "capacity" value.
This patch adds more logic to the resize code to understand whether the
new capacity value is actually "allocating" space as well and whether it
shrinking or expanding. Since unsigned arithmatic is involved, the possibility
that we adjust the pool size values incorrectly is probable.
This patch also ensures that updates to the pool values only occur if we
actually performed the allocation.
NB: The storageVolDelete, storageVolCreateXML, and storageVolCreateXMLFrom
each only updates the pool allocation/availability values by the target
volume allocation value.
Support for drive-reopen was never present in the upstream code so we
don't need to pause the VM when doing the block pivot. Kill all the
code related to this semi-upstream artifact.
lxc-enter-namespace stopped working on recent kernels (at least 3.19+)
due to /proc/PID/ns/* file descriptors being opened RW. From outside
the namespace these can only be opened RO.
/var/run may reside on a tmpfs and we fail to create the PID file if
/var/run/lxc does not exist.
Since commit 0a8addc1, the lxc driver's state directory isn't
automatically created before starting a domain. Now, the lxc driver
makes sure the state directory exists when it initializes.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
rfc3986 states that the separator in URI path is a single slash.
Multiple slashes may potentially lead to different resources and thus we
should not remove them.
Like we are doing in qemu driver (ea576ee543), lets call
virNumaSetupMemoryPolicy() only if really needed. Problem is, if
we numa_set_membind() child, there's no way to change it from the
daemon afterwards. So any later attempts to change the pinning
will fail. But in very weird way - CGroups will be set, but due
to membind child will not allocate memory from any other node.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
131,088 bytes in 16 blocks are definitely lost in loss record 2,174 of 2,176
at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x4C2BACB: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x52A026F: virReallocN (viralloc.c:245)
by 0x52BFCB5: saferead_lim (virfile.c:1268)
by 0x52C00EF: virFileReadLimFD (virfile.c:1328)
by 0x52C019A: virFileReadAll (virfile.c:1351)
by 0x52A5D4F: virCgroupGetValueStr (vircgroup.c:763)
by 0x1DDA0DA3: qemuRestoreCgroupState (qemu_cgroup.c:805)
by 0x1DDA0DA3: qemuConnectCgroup (qemu_cgroup.c:857)
by 0x1DDB7BA1: qemuProcessReconnect (qemu_process.c:3694)
by 0x52FD171: virThreadHelper (virthread.c:206)
by 0x82B8DF4: start_thread (pthread_create.c:308)
by 0x85C31AC: clone (clone.S:113)
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Since the holdtime is not supported by VBOX SDK, it's being simulated
by sleeping before sending the key-up codes. The key-up codes are
auto-generated based on XT codeset rules (adding of 0x80 to key-down)
which results in the same behavior as for QEMU implementation.
https://bugzilla.redhat.com/show_bug.cgi?id=1198645
Once upon a time, there was a little domain. And the domain was pinned
onto a NUMA node and hasn't fully allocated its memory:
<memory unit='KiB'>2355200</memory>
<currentMemory unit='KiB'>1048576</currentMemory>
<numatune>
<memory mode='strict' nodeset='0'/>
</numatune>
Oh little me, said the domain, what will I do with so little memory.
If I only had a few megabytes more. But the old admin noticed the
whimpering, barely audible to untrained human ear. And good admin he
was, he gave the domain yet more memory. But the old NUMA topology
witch forbade to allocate more memory on the node zero. So he
decided to allocate it on a different node:
virsh # numatune little_domain --nodeset 0-1
virsh # setmem little_domain 2355200
The little domain was happy. For a while. Until bad, sharp teeth
shaped creature came. Every process in the system was afraid of him.
The OOM Killer they called him. Oh no, he's after the little domain.
There's no escape.
Do you kids know why? Because when the little domain was born, her
father, Libvirt, called numa_set_membind(). So even if the admin
allowed her to allocate memory from other nodes in the cgroups, the
membind() forbid it.
So what's the lesson? Libvirt should rely on cgroups, whenever
possible and use numa_set_membind() as the last ditch effort.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This new internal API checks if given CGroup controller is
available. It is going to be needed later when we need to make a
decision whether pin domain memory onto NUMA nodes using cpuset
CGroup controller or using numa_set_membind().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Currently we check qemuCaps before starting the block job. But qemuCaps
isn't available on a stopped domain, which means we get a misleading
error message in this case:
# virsh domstate example
shut off
# virsh blockjob example vda
error: unsupported configuration: block jobs not supported with this QEMU binary
Move the qemuCaps check into the block job so that we are guaranteed the
domain is running.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
qemuMigrationCookieAddNBD is usually called from within an async
MIGRATION_OUT or MIGRATION_IN job, so it needs to start a nested job.
(The one exception is during the Begin phase when change protection
isn't enabled, but qemuDomainObjEnterMonitorAsync will behave the same
as qemuDomainObjEnterMonitor in this case.)
This bug was encountered with a libvirt client that repeatedly queries
the disk mirroring block job info during a migration. If one of these
queries occurs just as the Perform migration cookie is baked, libvirt
crashes.
Relevant logs are as follows:
6701: warning : qemuDomainObjEnterMonitorInternal:1544 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous
[1] 6701: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block","id":"libvirt-629"}
[2] 6699: info : qemuMonitorIOWrite:503 : QEMU_MONITOR_IO_WRITE: mon=0x7fefdc004700 buf={"execute":"query-block","id":"libvirt-629"}
[3] 6704: info : qemuMonitorSend:972 : QEMU_MONITOR_SEND_MSG: mon=0x7fefdc004700 msg={"execute":"query-block-jobs","id":"libvirt-630"}
[4] 6699: info : qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_REPLY: mon=0x7fefdc004700 reply={"return": [...], "id": "libvirt-629"}
6699: error : qemuMonitorJSONIOProcessLine:211 : internal error: Unexpected JSON reply '{"return": [...], "id": "libvirt-629"}'
At [1] qemuMonitorBlockStatsUpdateCapacity sends its request, then waits
on mon->notify. At [2] the request is written out to the monitor socket.
At [3] qemuMonitorBlockJobInfo sends its request, and also waits on
mon->notify. The reply from the first request is received at [4].
However, qemuMonitorJSONIOProcessLine is not expecting this reply since
the second request hadn't completed sending. The reply is dropped and an
error is returned.
qemuMonitorIO signals mon->notify twice during its error handling,
waking up both of the threads waiting on it. One of them clears mon->msg
as it exits qemuMonitorSend; the other crashes:
qemuMonitorSend (mon=0x7fefdc004700, msg=<value optimized out>) at qemu/qemu_monitor.c:975
975 while (!mon->msg->finished) {
(gdb) print mon->msg
$1 = (qemuMonitorMessagePtr) 0x0
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
In order to change an existing domain we delete all existing devices and add
new from scratch. In case of network devices we should also delete corresponding
virtual networks (if any) before removing actual devices from xml. In the patch,
we do it by extending prlsdkDoApplyConfig with a new parameter, which stands for
old xml, and calling prlsdkDelNet every time old xml is specified.
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The close callbacks hash are keyed by a UUID-string, but
virCloseCallbacksRun was attempting to remove them by raw UUID. This
patch ensures the callback entries are removed by UUID-string as well.
This bug caused problems when guest migrations were abnormally aborted:
# timeout --signal KILL 1 \
virsh migrate example qemu+tls://remote/system \
--verbose --compressed --live --auto-converge \
--abort-on-error --unsafe --persistent \
--undefinesource --copy-storage-all --xml example.xml
Killed
# virsh migrate example qemu+tls://remote/system \
--verbose --compressed --live --auto-converge \
--abort-on-error --unsafe --persistent \
--undefinesource --copy-storage-all --xml example.xml
error: Requested operation is not valid: domain 'example' is not being migrated
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
If a VM migration is aborted, a disk mirror may be failed by QEMU before
libvirt has a chance to cancel it. The disk->mirrorState remains at
_ABORT in this case, and this breaks subsequent mirrorings of that disk.
We should instead check the mirrorState directly and transition to _NONE
if it is already aborted. Do the check *after* aborting the block job in
QEMU to avoid a race.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
If virCloseCallbacksSet fails, qemuMigrationBegin must return NULL to
indicate an error occurred.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The destination libvirt daemon in a migration may segfault if the client
disconnects immediately after the migration has begun:
# virsh -c qemu+tls://remote/system list --all
Id Name State
----------------------------------------------------
...
# timeout --signal KILL 1 \
virsh migrate example qemu+tls://remote/system \
--verbose --compressed --live --auto-converge \
--abort-on-error --unsafe --persistent \
--undefinesource --copy-storage-all --xml example.xml
Killed
# virsh -c qemu+tls://remote/system list --all
error: failed to connect to the hypervisor
error: unable to connect to server at 'remote:16514': Connection refused
The crash is in:
1531 void
1532 qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
1533 {
1534 qemuDomainObjPrivatePtr priv = obj->privateData;
1535 qemuDomainJob job = priv->job.active;
1536
1537 priv->jobs_queued--;
Backtrace:
#0 at qemuDomainObjEndJob at qemu/qemu_domain.c:1537
#1 in qemuDomainRemoveInactive at qemu/qemu_domain.c:2497
#2 in qemuProcessAutoDestroy at qemu/qemu_process.c:5646
#3 in virCloseCallbacksRun at util/virclosecallbacks.c:350
#4 in qemuConnectClose at qemu/qemu_driver.c:1154
...
qemuDomainRemoveInactive calls virDomainObjListRemove, which in this
case is holding the last remaining reference to the domain.
qemuDomainRemoveInactive then calls qemuDomainObjEndJob, but the domain
object has been freed and poisoned by then.
This patch bumps the domain's refcount until qemuDomainRemoveInactive
has completed. We also ensure qemuProcessAutoDestroy does not return the
domain to virCloseCallbacksRun to be unlocked in this case. There is
similar logic in bhyveProcessAutoDestroy and lxcProcessAutoDestroy
(which call virDomainObjListRemove directly).
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
==19015== 968 (416 direct, 552 indirect) bytes in 1 blocks are definitely lost in loss record 999 of 1,049
==19015== at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x52ADF14: virAllocVar (viralloc.c:560)
==19015== by 0x5302FD1: virObjectNew (virobject.c:193)
==19015== by 0x1DD9401E: virQEMUDriverConfigNew (qemu_conf.c:164)
==19015== by 0x1DDDF65D: qemuStateInitialize (qemu_driver.c:666)
==19015== by 0x53E0823: virStateInitialize (libvirt.c:777)
==19015== by 0x11E067: daemonRunStateInit (libvirtd.c:905)
==19015== by 0x53201AD: virThreadHelper (virthread.c:206)
==19015== by 0xA1EE1F2: start_thread (in /lib64/libpthread-2.19.so)
==19015== by 0xA4EFC8C: clone (in /lib64/libc-2.19.so)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
==19015== 8 bytes in 1 blocks are definitely lost in loss record 34 of 1,049
==19015== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x4C2C32F: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x52AD888: virReallocN (viralloc.c:245)
==19015== by 0x52AD97E: virExpandN (viralloc.c:294)
==19015== by 0x52ADC51: virInsertElementsN (viralloc.c:436)
==19015== by 0x5335864: virDomainVirtioSerialAddrSetAddController (domain_addr.c:816)
==19015== by 0x53358E0: virDomainVirtioSerialAddrSetAddControllers (domain_addr.c:839)
==19015== by 0x1DD5513B: qemuDomainAssignVirtioSerialAddresses (qemu_command.c:1422)
==19015== by 0x1DD55A6E: qemuDomainAssignAddresses (qemu_command.c:1711)
==19015== by 0x1DDA5818: qemuProcessStart (qemu_process.c:4616)
==19015== by 0x1DDF1807: qemuDomainObjStart (qemu_driver.c:7265)
==19015== by 0x1DDF1A66: qemuDomainCreateWithFlags (qemu_driver.c:7320)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
==19015== 1,064 (656 direct, 408 indirect) bytes in 2 blocks are definitely lost in loss record 1,002 of 1,049
==19015== at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015== by 0x52AD74B: virAlloc (viralloc.c:144)
==19015== by 0x52B47CA: virCgroupNew (vircgroup.c:1057)
==19015== by 0x52B53E5: virCgroupNewVcpu (vircgroup.c:1451)
==19015== by 0x1DD85A40: qemuSetupCgroupForVcpu (qemu_cgroup.c:1013)
==19015== by 0x1DDA66EA: qemuProcessStart (qemu_process.c:4844)
==19015== by 0x1DDF1807: qemuDomainObjStart (qemu_driver.c:7265)
==19015== by 0x1DDF1A66: qemuDomainCreateWithFlags (qemu_driver.c:7320)
==19015== by 0x1DDF1ACD: qemuDomainCreate (qemu_driver.c:7337)
==19015== by 0x53F87EA: virDomainCreate (libvirt-domain.c:6820)
==19015== by 0x12690A: remoteDispatchDomainCreate (remote_dispatch.h:3481)
==19015== by 0x126827: remoteDispatchDomainCreateHelper (remote_dispatch.h:3457)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The 'checkPool' callback was originally part of the storageDriverAutostart function,
but the pools need to be checked earlier during initialization phase,
otherwise we can't start a domain which mounts a volume after the
libvirtd daemon restarted. This is because qemuProcessReconnect is called
earlier than storageDriverAutostart. Therefore the 'checkPool' logic has been
moved to storagePoolUpdateAllState which is called inside storageDriverInitialize.
We also need a valid 'conn' reference to be able to execute 'refreshPool'
during initialization phase. Though it isn't available until storageDriverAutostart
all of our storage backends do ignore 'conn' pointer, except for RBD,
but RBD doesn't support 'checkPool' callback, so it's safe to pass
conn = NULL in this case.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1177733
This patch introduces new virStorageDriverState element stateDir.
Also adds necessary changes to storageStateInitialize, so that
directories initialization becomes more generic.
The nodedev-detach can report the name of the domain using the device
just the way nodedev-reattach does it.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
The error output of snapshot-revert should be more friendly.
There is no need to show virDomainRevertToSnapshot to user.
virReportError already includes __FUNCTION__ information in a
separate member of the struct, so repeating it in the message is
redundant and leads to situations where higher level code ends up
reporting the lower level name. We correctly converted the error
output making it more succinct and user-friendly.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1086726
Before this patch, when connected via vCenter, the free memory returned
was from the resorcePool (usually a cluster). This is in conflict with
e.g esxNodeGetInfo which always pulls info from the ESX host.
Since libvirt ESX driver works primarily with ESX hosts, this patch
changes esxNodeGetFreeMemory to pull that information from ESX host so
it's consistent with behavior of esxNodeGetInfo.
Introduce virStoragePoolSaveState to properly format the state XML in
the same manner as virStoragePoolDefFormat, except for adding a
<poolstate> ... </poolstate> around the definition. This is similar to
virNetworkObjFormat used to save the live/active network information.
When modifying config/status XML, it might be handy to include some
additional XML elements (e.g. <poolstate>). In order to do so,
introduce new formatting function virStoragePoolDefFormatBuf and make
virStoragePoolDefFormat call it.
Recent testing on large memory systems revealed a bug in the Xen xl
tool's freemem() function. When autoballooning is enabled, freemem()
is used to ensure enough memory is available to start a domain,
ballooning dom0 if necessary. When ballooning large amounts of memory
from dom0, freemem() would exceed its self-imposed wait time and
return an error. Meanwhile, dom0 continued to balloon. Starting the
domain later, after sufficient memory was ballooned from dom0, would
succeed. The libvirt implementation in libxlDomainFreeMem() suffers
the same bug since it is modeled after freemem().
In the end, the best place to fix the bug on the Xen side was to
slightly change the behavior of libxl_wait_for_memory_target().
Instead of failing after caller-provided wait_sec, the function now
blocks as long as dom0 memory ballooning is progressing. It will return
failure only when more memory is needed to reach the target and wait_sec
have expired with no progress being made. See xen.git commit fd3aa246.
There was a dicussion on how this would affect other libxl apps like
libvirt
http://lists.xen.org/archives/html/xen-devel/2015-03/msg00739.html
If libvirt containing this patch was build against a Xen containing
the old libxl_wait_for_memory_target() behavior, libxlDomainFreeMem()
will fail after 30 sec and domain creation will be terminated.
Without this patch and with old libxl_wait_for_memory_target() behavior,
libxlDomainFreeMem() does not succeed after 30 sec, but returns success
anyway. Domain creation continues resulting in all sorts of fun stuff
like cpu soft lockups in the guest OS. It was decided to properly fix
libxl_wait_for_memory_target(), and if anything improve the default
behavior of apps using the freemem reference impl in xl.
xl was patched to accommodate the change in libxl_wait_for_memory_target()
with xen.git commit 883b30a0. This patch does the same in the libxl
driver. While at it, I changed the logic to essentially match
freemem() in $xensrc/tools/libxl/xl_cmdimpl.c. It was a bit cleaner
IMO and will make it easier to spot future, potentially interesting
divergences.
Dependant is flagged as wrong in US dictionary (only valid in UK
dictionary, and even then, it has only the financial sense and not the
inter-relatedness sense that we are more prone to be wanting throughout
code).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
'virPCIDeviceList' is actually an array. Removing one element makes the
rest of the element move.
Use while loop, increase index only when not virPCIDeviceListDel(pcidevs, dev)
Signed-off-by: Huanle Han <hanxueluo@gmail.com>
Instead of always using controller 0 and incrementing port number,
respect the maximum port numbers of controllers and use all of them.
Ports for virtio consoles are quietly reserved, but not formatted
(neither in XML nor on QEMU command line).
Also rejects duplicate virtio-serial addresses.
https://bugzilla.redhat.com/show_bug.cgi?id=890606https://bugzilla.redhat.com/show_bug.cgi?id=1076708
Test changes:
* virtio-auto.args
Filling out the port when just the controller is specified.
switched from using
maxport + 1
to:
first free port on the controller
* virtio-autoassign.args
Filling out the address when no <address> is specified.
Started using all the controllers instead of 0, also discards
the bus value.
* xml -> xml output of virtio-auto
The port assignment is no longer done as a part of XML parsing,
so the unspecified values stay 0.
Create a sorted array of virtio-serial controllers.
Each of the elements contains the controller index
and a bitmap of available ports.
Buses are not tracked, because they aren't supported by QEMU.
If the call to virStorageBackendISCSIGetHostNumber failed, we set
retval = -1, but yet still called virStorageBackendSCSIFindLUs.
Need to add a goto cleanup - while at it, adjust the logic to
initialize retval to -1 and only changed to 0 (zero) on success.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Don't supercede the error message virStorageBackendSCSIFindLUs as the
message such as "error: Failed to find LUs on host 60: ..." is not overly
clear as to what the real problem might be.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Make XML definition saving more generic by moving the common code into
virStoragePoolSaveXML and leave case specific code to
PoolSave{Status,Config,...} functions.
In order to be able to use 'checkPool' inside functions which do not
have any connection reference, 'conn' attribute needs to be discarded
from the checkPool's signature, since it's not used by any storage backend
anyway.
https://bugzilla.redhat.com/show_bug.cgi?id=1206479
As described in virDomainBlockCopy() parameters description, the
VIR_DOMAIN_BLOCK_COPY_GRANULARITY parameter may require the value to
have some specific attributes (e.g. be a power of two or fall within a
certain range). And in qemu, a power of two is required. However, our
code does not check that and let qemu operation fail. Moreover, the
virsh man page is not as exact as it could be in this respect.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When we shutdown/reboot a guest using agent-mode, if the guest itself blocks infinitely,
libvirt would block in qemuAgentShutdown() forever.
Thus, we set a timeout for shutdown/reboot, from our experience, 60 seconds would be fine.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
virDomainHasDiskMirror() currently detects only jobs that add the mirror
elements. Since some operations like migration are interlocked by
existing block jobs on the given domain the check needs to be
instrumented to check regular jobs too.
This patch renames virDomainHasDiskMirror to virDomainHasDiskBlockjob
and adds an argument that allows to select that it returns true only for
block copy jobs as those interlock making the domain persistent.
Other two uses trigger on any block job type.
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
If any disk of a VM was involved in a (copy) block job we refused to do
a snapshot. As not only copy jobs interlock snapshots and the
interlocking is applicable to individual disks only we can make the
check in a more individual fashion and interlock all block job types
supported by libvirt.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1203628
In the order of appearance:
* MAX_LISTEN - never used
added by 23ad665c (qemud) and addec57 (lock daemon)
* NEXT_FREE_CLASS_ID - never used, added by 07d1b6b
* virLockError - never used, added by eb8268a4
* OPENVZ_MAX_ARG, CMDBUF_LEN, CMDOP_LEN
unused since the removal of ADD_ARG_LIT in d8b31306
* QEMU_NB_PER_CPU_STAT_PARAM - unused since 897808e
* QEMU_CMD_PROMPT, QEMU_PASSWD_PROMPT - unused since 1dc10a7
* TEST_MODEL_WORDSIZE - unused since c25c18f7
* TEMPDIR - never used, added by 714bef5
* NSIG - workaround around old headers
added by commit 60ed1d2
unused since virExec was moved by commit 02e8691
* DO_TEST_PARSE - never used, added by 9afa006
* DIFF_MSEC, GETTIMEOFDAY - unused since eee6eb6
Two places would call to qemuPrepareCpumap() with priv->autoNodeset to
convert it to a cpuset. Remove the function and use the prepared cpuset
automatically.
When the default cpuset or automatic numa placement is used libvirt
would place the whole parent cgroup in the specified cpuset. This then
disallowed to re-pin the vcpus to a different cpu.
This patch pins only the vcpu threads to the default cpuset and thus
allows to re-pin them later.
The following config would fail to start:
<domain type='kvm'>
...
<vcpu placement='static' cpuset='0-1' current='2'>4</vcpu>
<cputune>
<vcpupin vcpu='0' cpuset='2-3'/>
...
This is a regression since a39f69d2b.
When the synchronous pivot option is selected, libvirt would not update
the backing chain until the job was exitted. Some applications then
received invalid data as their job serialized first.
This patch removes polling to wait for the ABORT/PIVOT job completion
and replaces it with a condition. If a synchronous operation is
requested the update of the XML is executed in the job of the caller of
the synchronous request. Otherwise the monitor event callback uses a
separate worker to update the backing chain with a new job.
This is a regression since 1a92c71910
When the ABORT job is finished synchronously you get the following call
stack:
#0 qemuBlockJobEventProcess
#1 qemuDomainBlockJobImpl
#2 qemuDomainBlockJobAbort
#3 virDomainBlockJobAbort
While previously or while using the _ASYNC flag you'd get:
#0 qemuBlockJobEventProcess
#1 processBlockJobEvent
#2 qemuProcessEventHandler
#3 virThreadPoolWorker
Later on I'll be adding a condition that will allow to synchronise a
SYNC block job abort. The approach will require this code to be called
from two different places so it has to be extracted into a helper.
Commit 1a92c719 moved code to handle block job events to a different
function that is executed in a separate thread. The caller of
processBlockJob handles locking and unlocking of @vm, so the we should
not do it in the function itself.
The block copy API takes the speed in bytes/s rather than MiB/s that was
the prior approach in virDomainBlockRebase. We correctly converted the
speed to bytes/s in the old API but we still called the common helper
virDomainBlockCopyCommon with the unadjusted variable.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1207122
When getting info on NUMA parameters for domain,
virCgroupGetCpusetMems() may be called. However, as of 43b67f2e
the call is guarded by check if memory controller is present.
Even though it may be not obvious instantly, NUMA parameters are
stored under cpuset controller. Therefore the check needs to look
like this:
if (!virCgroupHasController(priv->cgroup,
VIR_CGROUP_CONTROLLER_CPUSET) ||
virCgroupGetCpusetMems(priv->cgroup, &nodeset) < 0) {
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Throughout our code, the virCgroupController enum is used in two ways.
First as an index to an array of cgroup controllers:
struct virCgroup {
char *path;
struct virCgroupController controllers[VIR_CGROUP_CONTROLLER_LAST];
};
Second way is that when calling virCgroupNew() a bitmask of the enum
items can be passed to selectively detect only some controllers. For
instance:
int
virCgroupNewVcpu(virCgroupPtr domain,
int vcpuid,
bool create,
virCgroupPtr *group)
{
...
controllers = ((1 << VIR_CGROUP_CONTROLLER_CPU) |
(1 << VIR_CGROUP_CONTROLLER_CPUACCT) |
(1 << VIR_CGROUP_CONTROLLER_CPUSET));
if (virCgroupNew(-1, name, domain, controllers, group) < 0)
goto cleanup;
}
Even though it's highly unlikely that so many new controllers will be
invented so that we would overflow when constructing the bitmask, it
doesn't hurt to check at compile time either.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When creating new internal representation of cgroups, all passed
arguments are logged. Well, except for two: pid and pointer for
return value. Lets log them too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The function has no argument named @name rather than @path
instead. The comment is, however, referring to @name while it
should have been referring to @path really.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit id '2dbfa716' exposed virCgroupDetectMountsFromFile, but did not
add the corresponding entry in the "#else /* !VIR_CGROUP_SUPPORTED */"
section of the module.
Commit id 'ba1dfc5' added virCgroupSetCpusetMemoryMigrate and
virCgroupGetCpusetMemoryMigrate, but did not add the corresponding
entry points into the "#else /* !VIR_CGROUP_SUPPORTED */" section
Commint 0473b45cc introduced new function virNetlinkDelLink, but in
it's counterpart for non-linux platform there should be ATTRIBUTE_UNUSED
instead of ATTRIBUTE_UNSUPPORTED.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Blockcopy to non-file destination is not supported according the code,
but a 'goto endjob' is missed after checking the destination.
This leads to calling drive-mirror with wrong parameters.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1206406
Signed-off-by: Shanzhi Yu <shyu@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Current libvirt can only handle up to 1023 bytes when it
reads Linux sysfs topology/thread_siblings. This isn't enough for
Linux distributions that support a large value. This patch fixes
the problem by using VIR_ALLOC()/VIR_FREE(), instead of using a
fixed-size (1024) local char array. In the meanwhile
SYSFS_THREAD_SIBLINGS_LIST_LENGTH_MAX is increased to 8192 which
should be large enough for a foreseeable future.
Signed-off-by: Wei Huang <wei@redhat.com>
Just as it is possible to delete a bridge device with the netlink
RTM_DELLINK message, one can be created with the RTM_NEWLINK
message. Because of differences in the format of the message, it's not
as straightforward as with virNetlinkDelLink() to create a single
utility function that can be used to create any type of interface, so
the new netlink version of virNetDevBridgeCreate() does its own
construction of the netlink message and calls virNetlinkCommand()
itself.
This doesn't provide any extra functionality, just provides symmetry
with the previous commit.
NB: We *could* alter the API of virNetDevBridgeCreate() to take a MAC
address, and directly program that mac address into the bridge (by
adding an IFLA_ADDRESS attribute, as is done in
virNetDevMacVLanCreate()) rather than separately creating the "dummy
tap" (e.g. virbr0-nic) to maintain a fixed mac address on the bridge,
but the commit history of virnetdevbridge.c shows that the presence of
this dummy tap is essential in some older versions of the kernel
(between 2.6.39 and 3.1 or 3.2, possibly?) to proper operation of IPv6
DAD, and I don't want to take the chance of breaking something that I
don't have the time/setup to test (my RHEL6 box is at kernel
2.6.32-544, and the next lowest kernel I have is 3.17)
https://bugzilla.redhat.com/show_bug.cgi?id=1125755
reported that a stray bridge device was left on the system when a
libvirt network failed to start due to an illegal iptables rule caused
by bad config. Apparently the reason this was happening was that
NetworkManager was noticing immediately when the bridge device was
created and automatically setting it IFF_UP. libvirt would then try to
setup the iptables rules, get an error back, and since libvirt had
never IFF_UPed the bridge, it didn't expect that it needed to set it
~IFF_UP before deleting it during the cleanup process. But the
ioctl(SIOCBRDELBR) ioctl will fail to delete a bridge if it is IFF_UP.
Since that bug was reported, NetworkManager has gotten a bit more
polite in this respect, but just in case something similar happens in
the future, this patch switches to using the netlink RTM_DELLINK
message to delete the bridge - unlike SIOCBRDELBR, it will delete the
requested bridge no matter what the setting of IFF_UP.
These two functions are identical, so no sense in having the
duplication. I resisted the temptation to replace calls to
virNetDevMacVLanDelete() with calls to virNetlinkDelLink() just in
case some mythical future platform has macvtap devices that aren't
managed with netlink (or in case we some day need to do more than just
tell the kernel to delete the device).
libvirt has always used the netlink RTM_DELLINK message to delete
macvtap/macvlan devices, but it can actually be used to delete other
types of network devices, such as bonds and bridges. This patch makes
virNetDevMacVLanDelete() available as a generic function so it can
intelligibly be called to delete these other types of interfaces.
Starting a qemu VM with a memory module that has the base address
specified results in the following error:
error: internal error: early end of file from monitor: possible problem:
2015-03-26T03:45:52.338891Z qemu-kvm: -device pc-dimm,node=0,memdev=memdimm0,
id=dimm0,slot=0,base=4294967296: Property '.base' not found
The correct property name for the base address is 'addr'.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Because of the microcode update to Haswell/Broadwell CPUs, existing
domains using these CPUs may fail to start even though they used to run
just fine. To help users solve this issue we try to suggest switching to
-noTSX variant of the CPU model:
virsh # start cd
error: Failed to start domain cd
error: unsupported configuration: guest and host CPU are not
compatible: Host CPU does not provide required features: rtm, hle;
try using 'Haswell-noTSX' CPU model
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
QEMU 2.3 adds these new models to cover Haswell and Broadwell CPUs with
updated microcode. Luckily, they also reverted former the machine type
specific changes to existing models. And since these changes were never
released, we don't need to hack around them in libvirt.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Use the virStorageSourceIsEmpty helper to determine whether the drive
source is empty rather than checking for src->path. This will fix start
of VM with empty network cdrom that would not report any error.
The function that formats the string for network drives would return
error code but did not set the error message when called on storage
source with VIR_STORAGE_NET_PROTOCOL_LAST or _NONE.
Report an error in this case if it would ever be called in that way.
In some circumstances where the build tree differs from the source,
libvirt's compile will try to create the symlink for cpu_map.xml before
creating the directory $(abs_builddir)/cpu:
'src/cpu/cpu_map.xml': No such file or directory'
Do not create the symlink, it is no longer needed after
commit e562e82f
Load CPU map from builddir when run uninstalled
Signed-off-by: Amy Fong <amy.fong@windriver.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Recently we've fixed a bug where the status XML could not be parsed as
the parser used absolute path XPath queries. This test enhancement tests
all XML files used in the qemu-xml-2-xml test as a part of a status XML
snippet to see whether they are parsed correctly. The status XML-2-XML is
currently tested in 223 cases with this patch.
The current auto-indentation buffer code applies indentation only on
complete strings. To allow adding a string containing newlines and
having it properly indented this patch adds virBufferAddStr.
While this thread is cleaning up the client and connection objects:
#2 virFileReadAll (path=0x7f28780012b0 "/proc/1319/stat", maxlen=maxlen@entry=1024, buf=buf@entry=0x7f289c60fc40) at util/virfile.c:1287
#3 0x00007f28adbb1539 in virProcessGetStartTime (pid=<optimized out>, timestamp=timestamp@entry=0x7f289c60fc98) at util/virprocess.c:838
#4 0x00007f28adb91981 in virIdentityGetSystem () at util/viridentity.c:151
#5 0x00007f28ae73f17c in remoteClientFreeFunc (data=<optimized out>) at remote.c:1131
#6 0x00007f28adcb7f33 in virNetServerClientDispose (obj=0x7f28aecad180) at rpc/virnetserverclient.c:858
#7 0x00007f28adba8eeb in virObjectUnref (anyobj=<optimized out>) at util/virobject.c:265
#8 0x00007f28ae74ad05 in virNetServerHandleJob (jobOpaque=<optimized out>, opaque=0x7f28aec93ff0) at rpc/virnetserver.c:205
#9 0x00007f28adbbef4e in virThreadPoolWorker (opaque=opaque@entry=0x7f28aec88030) at util/virthreadpool.c:145
In stack frame #6 the client->identity object got unref'd, but the code
that removes the event callbacks in frame #5 did not run yet as we are
trying to obtain the system identity (frames #4, #3, #2).
In other thead:
#0 virObjectUnref (anyobj=anyobj@entry=0x7f288c162c60) at util/virobject.c:264
klass = 0xdeadbeef
obj = 0x7f288c162c60
#1 0x00007f28ae71c709 in remoteRelayDomainEventCheckACL (client=<optimized out>, conn=<optimized out>, dom=dom@entry=0x7f28aecaafc0) at remote.c:164
#2 0x00007f28ae71fc83 in remoteRelayDomainEventTrayChange (conn=<optimized out>, dom=0x7f28aecaafc0, ... ) at remote.c:717
#3 0x00007f28adc04e53 in virDomainEventDispatchDefaultFunc (conn=0x7f287c0009a0, event=0x7f28aecab1a0, ...) at conf/domain_event.c:1455
#4 0x00007f28adc03831 in virObjectEventStateDispatchCallbacks (callbacks=<optimized out>, ....) at conf/object_event.c:724
#5 virObjectEventStateQueueDispatch (callbacks=0x7f288c083730, queue=0x7fff51f90030, state=0x7f288c18da20) at conf/object_event.c:738
#6 virObjectEventStateFlush (state=0x7f288c18da20) at conf/object_event.c:816
#7 virObjectEventTimer (timer=<optimized out>, opaque=0x7f288c18da20) at conf/object_event.c:562
#8 0x00007f28adb859cd in virEventPollDispatchTimeouts () at util/vireventpoll.c:459
Frame #0 is unrefing an invalid identity object while frame #2 hints
that the client is still dispatching the event.
For untrimmed backtrace see the bugzilla attachment.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1203030
Don't unref the old identity unless we set the new one correctly and
unref the new one on failure to set it so that we don't leak any
references or use invalid pointers.
While adding tests for status XML parsing and formatting I've noticed
that the device alias list is leaked.
==763001== 81 (48 direct, 33 indirect) bytes in 1 blocks are definitely lost in loss record 414 of 514
==763001== at 0x4C2B8F0: calloc (vg_replace_malloc.c:623)
==763001== by 0x6ACF70F: virAllocN (viralloc.c:191)
==763001== by 0x447B64: qemuDomainObjPrivateXMLParse (qemu_domain.c:727)
==763001== by 0x6B848F9: virDomainObjParseXML (domain_conf.c:15491)
==763001== by 0x6B84CAC: virDomainObjParseNode (domain_conf.c:15608)
When starting a VM with hotpluggable memory devices the user may specify
an invalid source NUMA node. Libvirt would pass through the error from
qemu:
# virsh start test3
error: Failed to start domain test3
error: internal error: process exited while connecting to monitor:
2015-03-25T01:12:17.205913Z qemu-kvm: -object memory-backend-ram,id=memdimm0
,size=536870912,host-nodes=1-3,policy=bind: cannot bind memory to host NUMA nodes:
Invalid argument
This patch adds a check that allows to report better error:
# virsh start test3
error: Failed to start domain test3
error: configuration unsupported: NUMA node 1 is unavailable
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Commit 95695388 introduced new util/virthreadjob.c/h files but the
makefile has type that breaks rpm build.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
This is very helpful when we want to log and report why we could not
acquire a state change lock. Reporting what job keeps it locked helps
with understanding the issue. Moreover, after calling
virDomainGetControlInfo, it's possible to tell whether libvirt is just
stuck somewhere within the API (or it just forgot to cleanup the job) or
whether libvirt is waiting for QEMU to reply.
The error message will look like the following:
# virsh resume cd
error: Failed to resume domain cd
error: Timed out during operation: cannot acquire state change lock
(held by remoteDispatchDomainSuspend)
https://bugzilla.redhat.com/show_bug.cgi?id=853839
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Every thread created as a worker thread within a pool gets a name
according to virThreadPoolJobFunc name.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Automatically assign a job to every thread created by virThreadCreate.
The name of the virThreadFunc function passed to virThreadCreate is used
as the job or worker name in case no name is explicitly passed.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
We want all threads to be set as workers or to have a job assigned to
them, which can easily be achieved in virThreadCreate wrapper to
pthread_create. Let's make sure we always use the wrapper.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Each thread can use a thread local variable to keep the name of a job
which is currently running in the job.
The virThreadJobSetWorker API is supposed to be called once by any
thread which is used as a worker, i.e., it is waiting in a pool, woken
up to do a job, and returned back to the pool.
The virThreadJobSet/virThreadJobClear APIs are to be called at the
beginning/end of each job.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Although needed in the Xen 4.1 libxl days, there is no longer any
benefit to having per-domain libxl_ctx. On the contrary, their use
makes the code unecessarily complicated and prone to deadlocks under
load. As suggested by the libxl maintainers, use a single libxl_ctx
as a handle to libxl instead of per-domain ctx's.
One downside to using a single libxl_ctx is there are no longer
per-domain log files for log messages emitted by libxl. Messages
for all domains will be sent to /var/log/libvirt/libxl/libxl-driver.log.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
libxlDomainFreeMem() is only used in libxl_domain.c and thus should
be declared static. While at it, change the signature to take a
libxl_ctx instead of libxlDomainObjPrivatePtr, since only the
libxl_ctx is needed.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
This function now only enables domain death events. Simply call
libxl_evenable_domain_death() instead of an unnecessary wrapper.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Informing libxl how to handle its child proceses should be done once
during driver initialization, not once for each domain-specific
libxl_ctx object. The related libxl documentation in
$xen-src/tools/libxl/libxl_event.h even mentions that "it is best to
call this at initialisation".
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Long ago I incorrectly associated libxl fd and timer registrations
with per-domain libxl_ctx objects. When creating a libxlDomainObjPrivate,
a libxl_ctx is allocated, and libxl_osevent_register_hooks is called
passing a pointer to the libxlDomainObjPrivate. When an fd or timer
registration occurred, the registration callback received the
libxlDomainObjPrivate, containing the per-domain libxl_ctx. This
libxl_ctx was then used when informing libxl about fd events or
timer expirations.
The problem with this approach is that fd and timer registrations do not
share the same lifespan as libxlDomainObjPrivate, and hence the per-domain
libxl_ctx ojects. The result is races between per-domain libxl_ctx's being
destoryed and events firing on associated fds/timers, typically manifesting
as an assert in libxl
libxl_internal.h:2788: libxl__ctx_unlock: Assertion `!r' failed
There is no need to associate libxlDomainObjPrivate objects with libxl's
desire to use libvirt's event loop. Instead, the driver-wide libxl_ctx can
be used for the fd and timer registrations.
This patch moves the fd and timer handling code away from the
domain-specific code in libxl_domain.c into libxl_driver.c. While at it,
function names were changed a bit to better describe their purpose.
The unnecessary locking was also removed since the code simply provides a
wrapper over the event loop interface. Indeed the locks may have been
causing some deadlocks when repeatedly creating/destroying muliple domains.
There have also been rumors about such deadlocks during parallel OpenStack
Tempest runs.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
We don't have to modify cpuset.mems on hosts without NUMA. It also
fixes an error message that you get instead of success if you trying
update vcpus of a guest on a host without NUMA.
error: internal error: NUMA isn't available on this host
Signer-off-by: Pavel Hrdina <phrdina@redhat.com>
We should call virDomainLiveConfigHelperMethod ASAP because this
function transfers VIR_DOMAIN_AFFECT_CURRENT to VIR_DOMAIN_AFFECT_LIVE
or VIR_DOMAIN_AFFECT_CONFIG. All other additional checks for those two
flags should consider that the user give us VIR_DOMAIN_AFFECT_CURRENT.
Remove the unnecessary check whether the domain is live in case of
VIR_DOMAIN_VCPU_GUEST because this check is done by
virDomainLiveConfigHelperMethod.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
by rewriting it completely from:
error: unsupported configuration: virtio only support device address
type 'PCI'
to:
error: unsupported configuration: virtio disk cannot have an address of type
drive
Since we now support CCW addresses as well.
While debugging the support for responding to qemu RX_FILTER_CHANGED
events, I had changed the "ignoring this event" log message from
VIR_DEBUG to VIR_WARN, but forgot to change it back before
pushing. Since many guest OSes make enough changes to multicast lists
and/or promiscuous mode settings to trigger this message, it's
starting to show up as a red herring in bug reports.
Commit 5bba61f changed the XPath strings to be absolute when parsing
the VM NUMA configuration. Unfortunately the <domain> element is not a
top level element when parsing the domain status XML thus the absolute
XPath string doesn't match.
Use the relative string so that the <numa> settings are not lost.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Add a few helpers that allow to operate with memory device definitions
on the domain config and use them to implement memory device coldplug in
the qemu driver.
Add support to start qemu instance with 'pc-dimm' device. Thanks to the
refactors we are able to reuse the existing function to determine the
parameters.
Make sure that libvirt has all vital information needed to reliably
represent configuration of guest's memory devices in case of a
migration.
This patch forbids migration in case the required slot number and module
base address are not present (failed to be loaded from qemu via
monitor).
When using 'dimm' memory devices with qemu, some of the information
like the slot number and base address need to be reloaded from qemu
after process start so that it reflects the actual state. The state then
allows to use memory devices across migrations.
This patch adds code that parses and formats configuration for memory
devices.
A simple configuration would be:
<memory model='dimm'>
<target>
<size unit='KiB'>524287</size>
<node>0</node>
</target>
</memory>
A complete configuration of a memory device:
<memory model='dimm'>
<source>
<pagesize unit='KiB'>4096</pagesize>
<nodemask>1-3</nodemask>
</source>
<target>
<size unit='KiB'>524287</size>
<node>1</node>
</target>
</memory>
This patch preemptively forbids use of the <memory> device in individual
drivers so the users are warned right away that the device is not
supported.
To enable memory hotplug the maximum memory size and slot count need to
be specified. As qemu supports now other units than mebibytes when
specifying memory, use the new interface in this case.
Add a XML element that will allow to specify maximum supportable memory
and the count of memory slots to use with memory hotplug.
To avoid possible confusion and misuse of the new element this patch
also explicitly forbids the use of the maxMemory setting in individual
drivers's post parse callbacks. This limitation will be lifted when the
support is implemented.
With the current control flow the post parse callback returned success
right away for fully virtualized VMs. To allow adding additional checks
into the post parse callback tweak the conditions so that the function
doesn't return early except for error cases.
To clarify the original piece of code borrow the wording from the commit
message for the patch that introduced the code.
In the last section if the function determines that the config is
invalid when QEMU doesn't support the memory device the JSON config
object would be returned even if it doesn't make sense.
Assign the object to be returned only on success.
When no model is specified in the domain definition for
a scsi controller and the architectur is s390 than virtio-scsi
is set as default model.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Reviewed-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com>
Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Always add udev linker flags when WITH_UDEV is enabled to avoid
underlinking.
See commit 43dbcb15 (interface: always build all available backends)
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Commit cf54c60699 introduced the ability
to create missing storage volumes during migration. For network disks,
however, we may not necessarily be able to detect whether they already
exist -- there is no straight-forward way to map the disk to a storage
volume, and even if there were it's possible no configured storage pool
actually contains the disk.
It is better to assume the network disk exists in this case, rather than
aborting the migration completely. If the volume really is missing, QEMU
will generate an appropriate error later in the migration.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
This function does not make any sense now, that network driver is
(almost) dropped. I mean, previously, when threads were
serialized, this function was there to check, if no other network
with the same name or UUID exists. However, nowadays that threads
can run more in parallel, this function is useless, in fact it
gives misleading return values. Consider the following scenario.
Two threads, both trying to define networks with same name but
different UUID (e.g. because it was generated during XML parsing
phase, whatever). Lets assume that both threads are about to call
networkValidate() which immediately calls
virNetworkObjIsDuplicate().
T1: calls virNetworkObjIsDuplicate() and since no network with
given name or UUID exist, success is returned.
T2: calls virNetworkObjIsDuplicate() and since no network with
given name or UUID exist, success is returned.
T1: calls virNetworkAssignDef() and successfully places its
network into the virNetworkObjList.
T2: calls virNetworkAssignDef() and since network with the same
name exists, the network definition is replaced.
Okay, this is mainly because virNetworkAssignDef() does not check
whether name and UUID matches. Well, lets make it so! And drop
useless function too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
There's no need to lock the network driver, as network driver
initialization is done prior accepting any client. There's nobody
to hop in and do something over partially initialized driver. Nor
qemu driver is doing that.
==30532== Observed (incorrect) order is: acquisition of lock at 0x1439EF50
==30532== at 0x4C31A26: pthread_mutex_lock (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0x5324895: virMutexLock (virthread.c:88)
==30532== by 0x5307E86: virObjectLock (virobject.c:323)
==30532== by 0x5396440: virNetworkObjListForEach (network_conf.c:4511)
==30532== by 0x19B29308: networkStateInitialize (bridge_driver.c:686)
==30532== by 0x53E1CCC: virStateInitialize (libvirt.c:777)
==30532== by 0x11DEB7: daemonRunStateInit (libvirtd.c:906)
==30532== by 0x5324B6A: virThreadHelper (virthread.c:197)
==30532== by 0x4C30456: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0xA1EC1F2: start_thread (in /lib64/libpthread-2.19.so)
==30532== by 0xA4EDC8C: clone (in /lib64/libc-2.19.so)
==30532==
==30532== followed by a later acquisition of lock at 0x1439CD60
==30532== at 0x4C31A26: pthread_mutex_lock (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0x5324895: virMutexLock (virthread.c:88)
==30532== by 0x19B27B2C: networkDriverLock (bridge_driver.c:102)
==30532== by 0x19B27B60: networkGetDnsmasqCaps (bridge_driver.c:113)
==30532== by 0x19B2856A: networkUpdateState (bridge_driver.c:389)
==30532== by 0x53963E9: virNetworkObjListForEachHelper (network_conf.c:4488)
==30532== by 0x52E2224: virHashForEach (virhash.c:521)
==30532== by 0x539645B: virNetworkObjListForEach (network_conf.c:4512)
==30532== by 0x19B29308: networkStateInitialize (bridge_driver.c:686)
==30532== by 0x53E1CCC: virStateInitialize (libvirt.c:777)
==30532== by 0x11DEB7: daemonRunStateInit (libvirtd.c:906)
==30532== by 0x5324B6A: virThreadHelper (virthread.c:197)
==30532==
==30532== Required order was established by acquisition of lock at 0x1439CD60
==30532== at 0x4C31A26: pthread_mutex_lock (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0x5324895: virMutexLock (virthread.c:88)
==30532== by 0x19B27B2C: networkDriverLock (bridge_driver.c:102)
==30532== by 0x19B28DF9: networkStateInitialize (bridge_driver.c:609)
==30532== by 0x53E1CCC: virStateInitialize (libvirt.c:777)
==30532== by 0x11DEB7: daemonRunStateInit (libvirtd.c:906)
==30532== by 0x5324B6A: virThreadHelper (virthread.c:197)
==30532== by 0x4C30456: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0xA1EC1F2: start_thread (in /lib64/libpthread-2.19.so)
==30532== by 0xA4EDC8C: clone (in /lib64/libc-2.19.so)
==30532==
==30532== followed by a later acquisition of lock at 0x1439EF50
==30532== at 0x4C31A26: pthread_mutex_lock (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0x5324895: virMutexLock (virthread.c:88)
==30532== by 0x5307E86: virObjectLock (virobject.c:323)
==30532== by 0x538A09C: virNetworkAssignDef (network_conf.c:527)
==30532== by 0x5391EB2: virNetworkLoadState (network_conf.c:3008)
==30532== by 0x53922D4: virNetworkLoadAllState (network_conf.c:3128)
==30532== by 0x19B2929A: networkStateInitialize (bridge_driver.c:671)
==30532== by 0x53E1CCC: virStateInitialize (libvirt.c:777)
==30532== by 0x11DEB7: daemonRunStateInit (libvirtd.c:906)
==30532== by 0x5324B6A: virThreadHelper (virthread.c:197)
==30532== by 0x4C30456: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0xA1EC1F2: start_thread (in /lib64/libpthread-2.19.so)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Wikipedia's list of common misspellings [1] has a machine-readable
version. This patch fixes those misspellings mentioned in the list
which don't have multiple right variants (as e.g. "accension", which can
be both "accession" and "ascension"), such misspellings are left
untouched. The list of changes was manually re-checked for false
positives.
[1] https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
We've never set the cpuset.memory_migrate value to anything, keeping it
on default. However, we allow changing cpuset.mems on live domain.
That setting, however, don't have any consequence on a domain unless
it's going to allocate new memory.
I managed to make 'virsh numatune' move all the memory to any node I
wanted even without disabling libnuma's numa_set_membind(), so this
should be safe to use with it as well.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1198497
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
xen.git commit babeca32 added a pkgconfig file for libxenlight,
allowing libxl apps to determine the location of Xen binaries
such as firmware blobs, device emulator, etc.
This patch adds support for xenlight.pc in the libxl driver, falling
back to the previous configure logic if not found. It introduces
LIBXL_FIRMWARE_DIR and LIBXL_EXECBIN_DIR to define the firmware and
libexec_bin locations. If xenlight.pc does not exist, the defines
are set to the current hardcoded paths. The capabilities'
<emulator> and <loader> elements are updated to use the paths.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
If, by any reason, parallelsNetworkOpen fails it dereferences
newly allocated privconn->networks via virObjectUnref, which in
turn deallocates its memory.
Subsequent call of parallelsNetworkClose calls virObjectUnref
that leads to double memory free. To prevent this we should zero
privconn->networks to make all subsequent virObjectUnref be safe.
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1196934
When qemu exits during startup, libvirt includes the error from
/var/log/libvirt/qemu/vm.log in the error message:
$ virsh start test3
error: Failed to start domain test3
error: internal error: early end of file from monitor: possible problem:
2015-02-27T03:03:16.985494Z qemu-kvm: -numa memdev is not supported by
machine rhel6.5.0
The check for domain liveness added to qemuDomainObjExitMonitor
in commit dc2fd51f sometimes overwrites this error:
$ virsh start test3
error: Failed to start domain test3
error: operation failed: domain is no longer running
Fix the check to only report an error if there is none set.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
When converting domXML from native, the libxl driver was overwriting
useful errors from the xenconfig parsing code with a useless, generic
error. E.g. "internal error: parsing xm config failed" vs
"internal error: config value usbdevice was malformed". Remove the
redundant (and useless) error reporting in the libxl driver.
Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Issue #1 - A call to virBitmapNew did not check if the allocation
failed which could lead to a NULL dereference
Issue #2 - When deleting the pin entries from the config file, the
code loops from the number of elements down to the "new" vcpu count;
however, the pin id values are numbered 0..n-1 not 1..n, so the "first"
pin attempt would never work. Luckily the check was for whether the
incoming 'n' (vcpu id) matched the entry in the array from 0..arraysize
rather than a dereference of the 'n' entry
In qemu 2.3, the migration status will include 'cancelling' in the
window between when an asynchronous cancel has been requested and
when the migration is actually halted. Previously, qemu hid this
state and reported 'active'. Libvirt manages the sequence okay
even when the string is unrecognized (that is, it will report an
unknown state:
Migration: [ 69 %]^Cerror: internal error: unexpected migration status in cancelling.
but the migration is still cancelled), but recognizing the string
makes for a smoother user experience.
* src/qemu/qemu_monitor.h
(QEMU_MONITOR_MIGRATION_STATUS_CANCELLING): Add enum.
* src/qemu/qemu_monitor.c (qemuMonitorMigrationStatus): Map it.
* src/qemu/qemu_migration.c (qemuMigrationUpdateJobStatus): Adjust
clients.
* src/qemu/qemu_monitor_json.c
(qemuMonitorJSONGetMigrationStatusReply): Likewise.
Signed-off-by: Eric Blake <eblake@redhat.com>
Investigation of a problem with creating passthrough macvtap devices
(https://bugzilla.redhat.com/show_bug.cgi?id=1185501) has shown that
this slightly more verbose failure message is useful. In particular,
the mac address can be used to determine the domain. You could also
figure this out by looking at preceding messages in a debug log, but
this gets it in a single place.
virnetdevopenvswitch.h declares a few functions that can be called to
add ports to and remove them from OVS bridges, and retrieve the
migration data for a port. It does not contain any data definitions
that are used by domain_conf.h. But for some reason, domain_conf.h
virnetdevopenvswitch.h should be directly #including it. This adds a
few lines to the project, but saves all the files that don't need it
from the extra computing, and makes the dependencies more clear cut.
Problem Description:
When we set boot order for a vhost-user network interface, we found the boot index
doesn't work.
Cause of the Problem:
In the function qemuBuildVhostuserCommandLine(), it forcely set the arg bootindex of
function qemuBuildNicDevStr() to 0. Thus, the bootindex parameter got missing.
Solution:
Trans the arg bootindex down.
Signed-off-by: Gao Haifeng <gaohaifeng.gao@huawei.com>
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
which is on by default when a new VM/CT is created.
We should do this because this feature can't be controlled
by libvirt now and it sets up some iptables rules. So it's
better to do this to avoid potential conflict of different
set of rules or to avoid unexpected behavior.
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
In order to support 'bridge' network adapters in parallels
driver we need to plug our veth devices into corresponding
linux bridges.
We are going to do this by reusing our abstraction of
Virtual Networks in terms of PCS. On a domain creation, we
create a new Virtual Network naming it with the same name
as a source bridge for each network interface.
Having done this, we plug PCS veth interfaces created with names of
target dev into specified bridges using our standard PCS procedures
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Don't fail initialization of parallels driver if
parallelsLoadNetwork fails for optional networks.
This can happen when some of them are added manually
and configured incompletely. PCS requires only two networks
created automatically (named Host-Only and Bridged), others
are optional and their incompleteness can be ignored.
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The following is a long winded way to say this patch is avoiding a
false positive.
Coverity complains that calling networkPlugBandwidth() could eventually
end up with a NULL dereference on iface->bandwidth because in the
networkAllocateActualDevice there's a check of 'iface->bandwidth'
before deciding to try to use the 'portgroup' if it exists or to not
perferm the virNetDevBandwidthCopy if 'bandwidth' is not NULL.
Later in networkPlugBandwidth the 'iface->bandwidth' is sourced from
virDomainNetGetActualBandwidth - which would be either iface->bandwidth
or (preferably) iface->data.network.actual->bandwidth which would have
been filled in from either 'iface->bandwidth' or 'portgroup->bandwidth'
back in networkAllocateActualDevice
There *is* a check in networkCheckBandwidth for the result of the
virDomainNetGetActualBandwidth being NULL and a return 1 based on
that which would cause networkPlugBandwidth to exit properly and thus
never hit the condition that Coverity complains about.
However, since Coverity checks all paths - it somehow believes that
a return of 0 by networkCheckBandwidth in this condition would end
up causing the possible NULL dereference. The "fix" to silence Coverity
is to not have networkCheckBandwidth also call virDomainNetGetActualBandwidth
in order to get the ifaceBand, but rather have it accept it as an argument
which causes Coverity to "see" that it's the exit condition of 1 that won't
have the possible NULL dereference. Since we're passing that, I added the
passing of iface->mac rather than passing iface as well. This just hopefully
makes sure someone doesn't undo this in the future...
When libvirt is starting a domain, it reports the state as SHUTOFF until
it's RUNNING. This is not ideal because domain startup may take a long
time (usually because of some configuration issues, firewalls blocking
access to network disks, etc.) and domain lists provided by libvirt look
awkward. One can see weird shutoff domains with IDs in a list of active
domains or even shutoff transient domains. In any case, it looks more
like a bug in libvirt than a normal state a domain goes through.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
The function needs a pointer to the network to get list of DHCP
leases. The pointer is obtained via virNetworkLookupByName() which
requires callers to free the returned network once no longer needed.
Otherwise it's leaked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we allow HW address to be not present on our RPC layer,
don't error out if qemu-ga hasn't provided any.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Not all NICs (esp. the virtual ones like TUN) must have a hardware
address. Teach our RPC that it's possible.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1199182 documents that
after a series of disk snapshots into existing destination images,
followed by active commits of the top image, it is possible for
qemu 2.2 and earlier to end up tracking a different name for the
image than what it would have had when opening the chain afresh.
That is, when starting with the chain 'a <- b <- c', the name
associated with 'b' is how it was spelled in the metadata of 'c',
but when starting with 'a', taking two snapshots into 'a <- b <- c',
then committing 'c' back into 'b', the name associated with 'b' is
now the name used when taking the first snapshot.
Sadly, older qemu doesn't know how to treat different spellings of
the same filename as identical files (it uses strcmp() instead of
checking for the same inode), which means libvirt's attempt to
commit an image using solely the names learned from qcow2 metadata
fails with a cryptic:
error: internal error: unable to execute QEMU command 'block-commit': Top image file /tmp/images/c/../b/b not found
even though the file exists. Trying to teach libvirt the rules on
which name qemu will expect is not worth the effort (besides, we'd
have to remember it across libvirtd restarts, and track whether a
file was opened via metadata or via snapshot creation for a given
qemu process); it is easier to just always directly ask qemu what
string it expects to see in the first place.
As a safety valve, we validate that any name returned by qemu
still maps to the same local file as we have tracked it, so that
a compromised qemu cannot accidentally cause us to act on an
incorrect file.
* src/qemu/qemu_monitor.h (qemuMonitorDiskNameLookup): New
prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskNameLookup):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskNameLookup): New function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskNameLookup)
(qemuMonitorJSONDiskNameLookupOne): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit)
(qemuDomainBlockJobImpl): Use it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Use the utilities introduced in the previous patches so the qemu
driver is able to create tap devices that are bound (and unbound
on domain destroyal) to Midonet virtual ports.
Signed-off-by: Antoni Segura Puimedon <toni+libvirt@midokura.com>
Midonet is an opensource virtual networking that over lays the IP
network between hypervisors. Currently, such networks can be made
with the openvswitch virtualport type.
This patch, defines the schema and documentation that will serve
as basis for the follow up patches that will add support to libvirt
for using Midonet virtual ports for its interfaces. The schema
definition requires that the port profile expresses its interfaceid
as part of the port profile. For that reason, this is part of the
patch too.
Signed-off-by: Antoni Segura Puimedon <toni+libvirt@midokura.com>
Adds the port type definitions and methods that will be used to bind
interfaces to the Midonet virtual ports.
virtnetdevmidonet.c adds the way to bind and unbind the ports by
calling into the Midonet Host Agent control command line (installed
with the midolman package).
Signed-off-by: Antoni Segura Puimedon <toni+libvirt@midokura.com>
Previously we had to check for 3 fields to see if the source was filled.
Repurpose one of the variables as a boolean flag and use it instead of
combining multiple sources.
For the condition that checks that only CDROM/FLOPPY drives can be empty
we can use the virStorageSourceIsEmpty() helper.
If the storage device type is parsed as network our parser still allows
it to omit the <source> element. The empty drive check would not trigger
on such device as it expects that every network storage source is valid.
Use VIR_STORAGE_NET_PROTOCOL_NONE as a marker that the storage source is
empty.
Only selected fields from the disk source were copied when cold updating
source in a CDROM drive. When such drive was backed by a network file
this resulted into corruption of the definition:
<disk type='network' device='cdrom'>
<driver name='qemu' type='raw' cache='none'/>
<source protocol='gluster' name='gluster-vol1(null)'>
<host name='localhost'/>
</source>
<target dev='vdc' bus='virtio'/>
<readonly/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
</disk>
Update the whole source instead of cherry-picking elements.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1166024
By querying the qemu guest agent with the QMP command
"guest-network-get-interfaces" and converting the received JSON
output to structured objects.
Although "ifconfig" is deprecated, IP aliases created by "ifconfig"
are supported by this API. The legacy syntax of an IP alias is:
"<ifname>:<alias-name>". Since we want all aliases to be clubbed
under parent interface, simply stripping ":<alias-name>" suffices.
Note that IP aliases formed by "ip" aren't visible to "ifconfig",
and aliases created by "ip" do not have any specific name. But
we are lucky, as qemu guest agent detects aliases created by both.
src/qemu/qemu_agent.h:
* Define qemuAgentGetInterfaces
src/qemu/qemu_agent.c:
* Implement qemuAgentGetInterface
src/qemu/qemu_driver.c:
* New function qemuGetDHCPInterfaces
* New function qemuDomainInterfaceAddresses
src/remote_protocol-sructs:
* Define new structs
tests/qemuagenttest.c:
* Add new test: testQemuAgentGetInterfaces
Test cases for IP aliases, 0 or multiple ipv4/ipv6 address(es)
Signed-off-by: Nehal J Wani <nehaljw.kkd1@gmail.com>
daemon/remote.c
* Define remoteSerializeDomainInterface, remoteDispatchDomainInterfaceAddresses
src/remote/remote_driver.c
* Define remoteDomainInterfaceAddresses
src/remote/remote_protocol.x
* New RPC procedure: REMOTE_PROC_DOMAIN_INTERFACE_ADDRESSES
* Define structs remote_domain_ip_addr, remote_domain_interface,
remote_domain_interfaces_addresse_args, remote_domain_interface_addresses_ret
* Introduce upper bounds (to handle DoS attacks):
REMOTE_DOMAIN_INTERFACE_MAX = 2048
REMOTE_DOMAIN_IP_ADDR_MAX = 2048
Restrictions on the maximum number of aliases per interface were
removed after kernel v2.0, and theoretically, at present, there
are no upper limits on number of interfaces per virtual machine
and on the number of IP addresses per interface.
src/remote_protocol-structs
* New structs added
Signed-off-by: Nehal J Wani <nehaljw.kkd1@gmail.com>
Define helper function virDomainInterfaceFree, which allows
the upper layer application to free the domain interface object
conveniently.
The API is going to provide multiple methods by flags, e.g.
* Query guest agent
* Parse DHCP lease file
include/libvirt/libvirt-domain.h
* Define virDomainInterfaceAddresses, virDomainInterfaceFree
* Define structs virDomainInterface, virDomainIPAddress
src/driver-hypervisor.h:
* Define domainInterfaceAddresses
src/libvirt-domain.c:
* Implement virDomainInterfaceAddresses
* Implement virDomainInterfaceFree
src/libvirt_public.syms:
* Export the new symbols
Signed-off-by: Nehal J Wani <nehaljw.kkd1@gmail.com>
Failures of parallelsStorageOpen occured because we incorrectly treated
path to VM' configuration file as a directory. Now initialization of
parallels VM domains home directory is fixed.
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
Otherwise exporting existing domain config and defining a new one like this:
virsh -c parallels:///system dumpxml instance01 > my.xml
virsh -c parallels:///system define my.xml
leads to an error because PCS default x64 mode turns to x32.
Thus, we need to set correct cpuMode in prlsdkDoApplyConfig() explicitly.
Signed-off-by: Mikhail Feoktistov <mfeoktistov@parallels.com>
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We're parsing memballoon status period as unsigned int, but when we're
trying to set it, both we and qemu use signed int. That means large
values will get wrapped around to negative one resulting in error.
Basically the same problem as commit e3a7b874 was dealing with when
updating live domain.
QEMU changed the accepted value to int64 in commit 1f9296b5, but even
values as INT_MAX don't make sense since the value passed means seconds.
Hence adding capability flag for this change isn't worth it.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1140958
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
In order not to leave old error messages set, this patch refactors the
code so the error is reported only when acted upon. The only such place
already rewrites any error, so cleaning up all the error reporting in
qemuMonitorSetMemoryStatsPeriod() is enough.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
All the devices we have format their address as its last sub-element, so
let's change memballoon to follow suit. Also adjust RNG to allow any
order of them so 'virsh edit' doesn't shout at us.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Commit 4ab8cd77 added a check requiring input devices to have
a bus type of VIR_DOMAIN_INPUT_BUS_USB, failing to start the
domain otherwise. But virDomainDefParseXML adds implicit mouse
and keyboard if a graphics device is configured. See calls to
virDomainDefMaybeAddInput.
The regression is fixed by removing the check requiring USB input
devices, and skipping non-USB input devices when populating USB
'usbdevice' in libxl_domain_build_info struct.
Patch 51f9f03a4c introduces a regression
where if a blockCommit operation fails the disk is still marked as being
part of a block job but can't be unmarked later.
As pointed out by jtomko in his review of the IOThreads pinning code:
http://www.redhat.com/archives/libvir-list/2015-March/msg00495.html
there are some comments sprinkled in indicating IOThreads were using
the same structure as the VcpuPin code...
This is the first patch of a few that will change the virDomainVcpuPin*
structures and code to just virDomainPin* - starting with the data
structure naming...
During his review of the iothreads pin setting code, Pavel noted that
there was a potential memory leak with respect to how the newVcpuPin
is handled and the goto endjob's in failure paths which would not free
the memory. For reference, See:
http://www.redhat.com/archives/libvir-list/2015-March/msg00415.html
Now that the size of guest's memory can be inferred from the NUMA
configuration (if present) make it optional to specify <memory>
explicitly.
To make sure that memory is specified add a check that some form of
memory size was specified. One side effect of this change is that it is
no longer possible to specify 0KiB as memory size for the VM, but I
don't think it would be any useful to do so. (I can imagine embedded
systems without memory, just registers, but that's far from what libvirt
is usually doing).
Forbidding 0 memory for guests also fixes a few corner cases where 0 was
not interpreted correctly and caused failures. (Arguments for numad when
using automatic placement, size of the balloon). This fixes problems
described in https://bugzilla.redhat.com/show_bug.cgi?id=1161461
Test case changes are added to verify that the schema change and code
behave correctly.
Use the NUMA total instead of the configured size both in XML and for
uses in the code once NUMA is enabled for a domain.
One test case change is necessary as the rounding of the individual cell
sizes was not matching the rounding of the total size.
The memory sizes in qemu are aligned up to 1 MiB boundaries. There are
two places where this was done once for the total size and then for
individual NUMA cell sizes.
Add a function that will align the sizes in one place so that it's clear
where the sizes are aligned.
As there are two possible approaches to define a domain's memory size -
one used with legacy, non-NUMA VMs configured in the <memory> element
and per-node based approach on NUMA machines - the user needs to make
sure that both are specified correctly in the NUMA case.
To avoid this burden on the user I'd like to replace the NUMA case with
automatic totaling of the memory size. To achieve this I need to replace
direct access to the virDomainMemtune's 'max_balloon' field with
two separate getters depending on the desired size.
The two sizes are needed as:
1) Startup memory size doesn't include memory modules in some
hypervisors.
2) After startup these count as the usable memory size.
Note that the comments for the functions are future aware and document
state that will be present after a few later patches.
Surprisingly we did not grab a VM job when a block job finished and we'd
happily rewrite the backing chain data. This made it possible to crash
libvirt when queueing two backing chains tightly and other badness.
To fix it, add yet another handler to the helper thread that handles
monitor events that require a job.
We interpret port values as signed int (convert them from char *),
so if a negative value is provided in network disk's configuration,
we accept it as valid, however there's an 'unknown cause' error raised later.
This error is only accidental because we return the port value in the return code.
This patch adds just a minor tweak to the already existing check so we
reject negative values the same way as we reject non-numerical strings.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1163553
Valgrind detected a leak:
==17820== 102 (56 direct, 46 indirect) bytes in 1 blocks are definitely lost in loss record 479 of 646
==17820== at 0x4A08946: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17820== by 0x508521A: virAllocVar (viralloc.c:560)
==17820== by 0x50D9FCA: virObjectNew (virobject.c:193)
==17820== by 0x50A4FD9: dnsmasqCapsNewEmpty (virdnsmasq.c:784)
==17820== by 0x50A514E: dnsmasqCapsNewFromBinary (virdnsmasq.c:830)
==17820== by 0x1B508287: networkStateInitialize (bridge_driver.c:666)
It looks like commit 172acef introduced the problem, because
networkGetDnsmasqCaps() increments the reference count but an
early exit never does a matching decrement.
* src/network/bridge_driver.c (networkStateCleanup): Plug leak.
Signed-off-by: Eric Blake <eblake@redhat.com>
Valgrind complained:
==3770== Syscall param ioctl(SIOCETHTOOL) points to uninitialised byte(s)
==3770== at 0x919D407: ioctl (syscall-template.S:81)
==3770== by 0x530FE7E: rpl_ioctl (ioctl.c:42)
==3770== by 0x50CB433: virNetDevFeatureAvailable (virnetdev.c:2764)
==3770== by 0x50CB6A7: virNetDevGetFeatures (virnetdev.c:2830)
==3770== by 0x1F0E5347: udevProcessNetworkInterface (node_device_udev.c:722)
==3770== by 0x1F0E689F: udevGetDeviceDetails (node_device_udev.c:1300)
==3770== by 0x1F0E6E06: udevAddOneDevice (node_device_udev.c:1422)
==3770== by 0x1F0E6FB8: udevProcessDeviceListEntry (node_device_udev.c:1464)
==3770== by 0x1F0E70CF: udevEnumerateDevices (node_device_udev.c:1494)
==3770== by 0x1F0E7BB4: nodeStateInitialize (node_device_udev.c:1806)
==3770== by 0x51B4303: virStateInitialize (libvirt.c:777)
==3770== by 0x11DEE7: daemonRunStateInit (libvirtd.c:906)
==3770== Address 0x228e38d4 is on thread 12's stack
==3770== in frame #2, created by virNetDevFeatureAvailable (virnetdev.c:2750)
* src/util/virnetdev.c (virNetDevFeatureAvailable): Initialize all
bytes of ifr.
Signed-off-by: Eric Blake <eblake@redhat.com>
Introduced by f6a2f97e
Problem Description:
After multiple times of migrating a domain, which has an ovs interface with no portData set,
with non-shared disk, nbd ports got overflowed.
The steps to reproduce the problem:
1 define and start a domain with its network configured as:
<interface type='bridge'>
<source bridge='br0'/>
<virtualport type='openvswitch'>
</virtualport>
<model type='virtio'/>
<driver name='vhost' queues='4'/>
</interface>
2 do not set the network's portData.
3 migrate(ToURI2) it with flag 91(1011011), which means:
VIR_MIGRATE_LIVE
VIR_MIGRATE_PEER2PEER
VIR_MIGRATE_PERSIST_DEST
VIR_MIGRATE_UNDEFINE_SOURCE
VIR_MIGRATE_NON_SHARED_DISK
4 migrate success, but we got an error log in libvirtd.log:
error : virCommandWait:2423 : internal error: Child process (ovs-vsctl --timeout=5 get Interface
vnet1 external_ids:PortData) unexpected exit status 1: ovs-vsctl: no key "PortData" in Interface
record "vnet1" column external_ids
5 migrate it back, migrate it , migrate it back, .......
6 nbd port got overflowed.
The reasons for the problem is :
1 virNetDevOpenvswitchGetMigrateData() takes it as wrong if no portData is available for the ovs
interface of a domain. (We think it's not appropriate, as portData is just OPTIONAL)
2 in func qemuMigrationBakeCookie(), it fails in qemuMigrationCookieAddNetwork(), and returns with -1.
qemuMigrationCookieAddNBD() is not called thereafter, and mig->nbd is still NULL.
3 However, qemuMigrationRun() just *WARN* if qemuMigrationBakeCookie() fails, migration still successes.
cookie is NULL, it's not baked on the src side.
4 On the destination side, it would alloc a port first and then free the nbd port in COOKIE.
But the cookie is NULL due to qemuMigrationCookieAddNetwork() failure at src side. thus the nbd port
is not freed.
In this patch, we add "--if-exists" option to make ovs-vsctl not raise error if there's no portData available.
Further more, because portData may be NULL in the cookie at the dest side, check it before setting portData.
Signed-off-by: Zhou Yimin <zhouyimin@huawei.com>
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
It will not be possible to detach such device later. Also improve
logging in such cases.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
virDomainNetFindIdx no longer returns info whether device was not found,
or there was multiple matches. Additionally it already handle error
reporting. Introduce virDomainHasNet which does a simple task, without
implicit error reporting.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
In Xen>=4.3, libxl supports new syntax for USB devices:
usbdevice=[ "DEVICE", "DEVICE", ... ]
Add support for that in xenconfig driver. When only one device is
defined, keep using old syntax for backward compatibility.
Adjust tests for changed options order.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Now that the network driver lock is ash heap of history,
we can use more of networkObjFromNetwork().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While in previous commits there were some places that relied on
the big lock, in this file there's no such place and the big
driver lock can be dropped completely. Yay!
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Well, if 'everywhere' is defined as that part of the driver code
that serves virNetwork* APIs. Again, we lower layers already have
their locks, so there's no point doing big lock.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that we have fine grained locks, there's no need to
lock the whole driver. We can rely on self-locking APIs.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In order to drop network driver lock, lets annotate which
structure items are immutable, which have self-locking
APIs and so on.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is not an immutable pointer and can change during lifetime.
Therefore, in order to drop network driver lock, we must use an
internal accessor which does not lock the network driver yet, but
it will soon. Now it merely returns an referenced object.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Well, network driver code has the driver accessible as a global
variable. This makes any rework hard, as it's unclear where the
variable is accessed and/or modified. Lets just pass the driver
as a parameter to all functions where needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
A helper that never returns an error and treats bits out of bitmap range
as false.
Use it everywhere we use ignore_value on virBitmapGetBit, or loop over
the bitmap size.
For VBOX it's most likely that the connection is vbox:///session and it
runs with local non-root account. This caused permission denied when
LOCALSTATEDIR was used to create temp file. This patch makes use of the
virGetUserCacheDirectory to address this problem for non-root users.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Commit c9027d8f added a detection of NIC HW features, but some of them
are not available in old kernel. Very old kernels lack enum
ethtool_flags and even if this enum is present, not all values are
available for all kernels. To be sure that we have everything in kernel
that we need, we must check for existence of most of that flags, because
only few of them were defined at first.
Also to successfully build libvirt with older kernel we need to include
<linux/types.h> before <linux/ethtool.h> to have __u32 and friends
defined.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Coverity notes in xenapiDomainGetXMLDesc that 'vms' is dereferenced
a few times before a "if (vms) xen_vm_set_free(vms);" call is made.
Since we'd exit out much sooner if the fetch of the vms failed, just
remove the unnecessary "if (vms)" check.
Coverity complains that "net_set" is compared to NULL before calling
xen_network_set_free, but used rather liberally before that. While
I was looking at the code I also noted that if the virAsprintfQuiet
fails, then we leak our structures - so I added those too.
Coverity points out that the return from virDomainDefParseString is
not checked in xenapiDomainCreateXML like it should be which could
end up in a NULL pointer dereference
Since inception. Coverity complains that the code checks "(record ==
NULL && !session->ok)", but doesn't check (record != NULL) before
dereferencing at "record->is_a_template"
https://bugzilla.redhat.com/show_bug.cgi?id=1135491
More or less a virtual copy of the existing virDomainVcpuPin{Add|Del} API's.
NB: The IOThreads implementation "reused" the virDomainVcpuPinDefPtr
since it provided everything necessary - an "id" and a "map" for each
thread id configured.
Add virDomainPinIOThread to allow setting the CPU affinity for a specific
IOThread based on the output generated from virDomainGetIOThreadsInfo
The API supports updating both the --live domain and the --config data
This patch turns both virNetworkObjFindByUUID() and
virNetworkObjFindByName() to return an referenced object so that
even if caller unlocks it, it's for sure that object won't
disappear meanwhile. Especially if the object (in general) is
locked and unlocked during the caller run.
Moreover, this commit is nicely small, since the object unrefing
can be done in virNetworkObjEndAPI().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Every API that touches internal structure of the object must lock
the object first. Not every API that has the object as an
argument needs to do that though. Some APIs just pass the object
to lower layers which, however, must lock the object then. Look
at the code, you'll get my meaning soon.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is going to be needed later, when some functions already
have the virNetworkObjList object already locked and need to
lookup a object to work on. As an example of such function is
virNetworkAssignDef(). The other use case might be in
virNetworkObjListForEach() callback.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Later we can turn APIs to lock the object if needed instead of
relying on caller to mutually exclude itself (probably done by
locking a big lock anyway).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far, this is pure code replacement. But once we introduce
reference counting to virNetworkObj this will be more handy as
there'll be only one function to change: virNetworkObjEndAPI().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far, this is pure code replacement. But once we introduce
reference counting to virNetworkObj this will be more handy as
there'll be only one function to change: virNetworkObjEndAPI().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far, this is pure code replacement. But once we introduce
reference counting to virNetworkObj this will be more handy as
there'll be only one function to change: virNetworkObjEndAPI().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
This is practically copy of qemuDomObjEndAPI. The reason why is
it so widely available is to avoid code duplication, since the
function is going to be called from our bridge driver, test
driver and parallels driver too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
So far it's just a structure which happens to have 'Obj' in its
name, but otherwise it not related to virObject at all. No
reference counting, not virObjectLock(), nothing.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Now that qemuDomainBlocksStatsGather provides functions of both
qemuMonitorGetBlockStatsParamsNumber and qemuMonitorGetBlockStatsInfo we
can reuse it and kill a lot of code.
Additionally as a bonus qemuDomainBlockStatsFlags will now support
summary statistics so add a statement to the virsh man page about that.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1142636
In the LXC driver, if the disk path is not provided the API returns
total statistics for all disks of the domain. With the new text monitor
implementation this can be now done in the qemu driver too.
Add code that wil total the stats for all disks if the path is not
provided.
Extract the code to look up the disk alias and return the block stats
struct so that it can be reused later in qemuDomainBlockStatsFlags.
The function uses qemuMonitorGetAllBlockStatsInfo instead of
qemuMonitorGetBlockStatsInfo.
Our virDomainBlockStatsFlags API uses the old approach where, when it's
called without the typed parameter array, returns the count of parameters
supported by qemu.
The supported parameter count is obtained via separate monitor calls
which is a waste since we can calculate it when gathering the data.
This patch adds code to the qemuMonitorGetAllBlockStatsInfo workers that
allows to track the count of supported fields reported by qemu and will
allow to remove the old duplicate code.
The function that is extracting block stats data from the QMP monitor
reply contains a lot of repeated code. Since I'd be changing each of the
copies in the next patch, lets convert it to a macro right away.
Add a different version of parser for "info blockstats" that basically
parses the same information as the existing copy of the function.
This will allow us to remove the single device version
qemuMonitorGetBlockStatsInfo in the future.
The new implementation uses few new helpers so it should be more
understandable and provides a test case to verify that it works.
Allocate the hash table in the monitor wrapper function instead of the
worker itself so that the text monitor impl that will be added in the
next patch doesn't have to duplicate it.
If a domain object is being removed and looked up concurrently we must
ensure we unlock the object before unreferencing it, since the latter
might free the object.
The flaw was introduced in commit feb1a4d792.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Undefining a running, autostarted domain removes the autostart link, but
dom->autostart is not cleared. If the domain is subsequently redefined,
libvirt thinks it is already autostarted and will not create the link
even if requested:
# virsh dominfo example | grep Autostart
Autostart: enable
# ls /etc/libvirt/qemu/autostart/example.xml
/etc/libvirt/qemu/autostart/example.xml
# virsh undefine example
Domain example has been undefined
# virsh define example.xml
Domain example defined from example.xml
# virsh dominfo example | grep Autostart
Autostart: enable
# virsh autostart example
Domain example marked as autostarted
# ls /etc/libvirt/qemu/autostart/example.xml
ls: cannot access /etc/libvirt/qemu/autostart/example.xml: No such file or directory
This commit ensures dom->autostart is cleared whenever the config and
autostart link (if present) are removed.
The bridge network driver cleared this flag itself in networkUndefine.
This commit moves this into virNetworkDeleteConfig for symmetry with
virDomainDeleteConfig, and to ensure it is not missed in future network
drivers.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Error messages are already set in all code paths returning -1 from
networkGetNetworkAddress, so we don't want to overwrite them.
Signed-off-by: Luyao Huang <lhuang@redhat.com>
Signed-off-by: John Ferlan <jferlan@redhat.com>
When creating qemu capabilities, a dummy virDomainObj is created just
because our monitor code expects that. However, the object is created
locked already. Then, under cleanup label, we simply unref the object
which results in whole domain object to be disposed. The object lock
is destroyed subsequently, but hey - it's still locked:
==24845== Thread #14's call to pthread_mutex_destroy failed
==24845== with error code 16 (EBUSY: Device or resource busy)
==24845== at 0x4C3024E: pthread_mutex_destroy (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==24845== by 0x531F72E: virMutexDestroy (virthread.c:83)
==24845== by 0x5302977: virObjectLockableDispose (virobject.c:237)
==24845== by 0x5302A89: virObjectUnref (virobject.c:265)
==24845== by 0x1DD37866: virQEMUCapsInitQMP (qemu_capabilities.c:3397)
==24845== by 0x1DD37CC6: virQEMUCapsNewForBinary (qemu_capabilities.c:3481)
==24845== by 0x1DD381E2: virQEMUCapsCacheLookup (qemu_capabilities.c:3609)
==24845== by 0x1DD30F8A: virQEMUCapsInitGuest (qemu_capabilities.c:744)
==24845== by 0x1DD31889: virQEMUCapsInit (qemu_capabilities.c:1020)
==24845== by 0x1DD7DD36: virQEMUDriverCreateCapabilities (qemu_conf.c:888)
==24845== by 0x1DDC57C0: qemuStateInitialize (qemu_driver.c:803)
==24845== by 0x53DC743: virStateInitialize (libvirt.c:777)
==24845==
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
in virDomainFSInfoFree(), don't free the virDomainFSInfo data.
==10670== 80 bytes in 2 blocks are definitely lost in loss record 576 of 793
==10670== at 0x4A06BC3: calloc (vg_replace_malloc.c:618)
==10670== by 0x509DEBD: virAlloc (viralloc.c:144)
==10670== by 0x19FBD558: qemuAgentGetFSInfo (qemu_agent.c:1837)
==10670== by 0x1A03CF91: qemuDomainGetFSInfo (qemu_driver.c:19238)
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Commit 4bbe1029f fixed a problem in commit f7afeddc by moving the call
to virNetDevGetIndex() to a location common to all interface types (so
that the nicindex array would be filled in for macvtap as well as tap
interfaces), but the location was *too* common, as the original call
to virNetDevGetIndex() had been in a section qualified by "if
(cfg->privileged)". The result was that the "fixed" libvirtd would try
to call virNetDevGetIndex() even for session mode libvirtd, and end up
failing with the log message:
Unable to open control socket: Operation not permitted
To remedy that, this patch qualifies the call to virNetDevGetIndex()
in its new location with cfg->privileged.
This resolves https://bugzilla.redhat.com/show_bug.cgi?id=1198244
As of bba93d40 all of our RPC objects are derived from
virObjectLockable. However, during rewrite some errors sneaked
in. For instance, the dispose functions to virNetClient and
virNetServerClient objects were not only freeing allocated
memory, but unlocking themselves. This is wrong. Object should
never disappear while locked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Well, one day this will be self-locking object, but not today.
But lets prepare the code for that! Moreover,
virNetworkObjListFree() is no longer needed, so turn it into
virNetworkObjListDispose().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>