Commit Graph

276 Commits

Author SHA1 Message Date
Peter Krempa
f428ff8ad4 qemu: Add missing 'p' to qemuCgrouEmulatorAllNodesRestore 2016-09-13 12:24:02 +02:00
Peter Krempa
eb5dee3534 qemu: cgroup: Extract temporary relaxing of cgroup setting for vcpu hotplug
When hot-adding vcpus qemu needs to allocate some structures in the DMA
zone which may be outside of the numa pinning. Extract the code doing
this in a set of helpers so that it can be reused.
2016-09-07 16:05:01 +02:00
Peter Krempa
c7d5dd3974 conf: Rename virDomainVcpuInfoPtr to virDomainVcpuDefPtr 2016-07-11 09:06:09 +02:00
Ján Tomko
d033d4762f Revert "qemu_cgroup: allow access to /dev/dri for virtio-vga"
This reverts commit 3943bdd60c.
2016-05-23 10:48:27 +02:00
Ján Tomko
3943bdd60c qemu_cgroup: allow access to /dev/dri for virtio-vga
QEMU needs access to the /dev/dri/render* device for
virgl to work.

Allow access to all /dev/dri/* devices for domains with
<video>
  <model type='virtio' heads='1' primary='yes'>
    <acceleration accel3d='yes'/>
  </model>
</video>

https://bugzilla.redhat.com/show_bug.cgi?id=1337290
2016-05-19 10:52:50 +02:00
Martin Kletzander
16b41728b5 qemu: Free priv->machineName
Commit c3bd0019c0 forgot to clean up after itself.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1325043

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-04-11 11:46:09 +02:00
Alexander Burluka
ef1fa55e46 Implement qemuSetupGlobalCpuCgroup
This function sets up per-domain CPU bandwidth parameters.

Signed-off-by: Alexander Burluka <aburluka@virtuozzo.com>
2016-03-01 14:30:11 +00:00
Peter Krempa
a06ef20782 qemu: process: Move emulator thread setting code into one function
Similarly to the refactors to iothreads and vcpus, move the code that
initializes the emulator thread settings into a single function.
2016-03-01 14:07:27 +00:00
Bjoern Walk
65c4c7d850 qemu: cgroup: fix cgroup permission logic
Fix logic error introduced in commit d6c91b3c which essentially broke
starting any domain.

Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
2016-02-18 10:32:46 +01:00
Peter Krempa
d1242ba24a qemu: cgroup: Setup cgroups for bios/firmware images
oVirt wants to use OVMF images on top of lvm for their 'logical'
storage thus we should set up device ACLs for them so it will actually
work.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1305922
2016-02-17 12:29:00 +01:00
Peter Krempa
d6c91b3c03 qemu: cgroup: Extract guts of qemuSetupImageCgroupInternal
They will later be reused for setting cgroup for other image backed
devices.
2016-02-17 10:54:05 +01:00
Peter Krempa
2b15f2a196 qemu: cgroup: Split up qemuSetImageCgroupInternal
Separate the Teardown and Setup code paths into separate helpers.
2016-02-17 10:54:05 +01:00
Peter Krempa
5dd610d01d qemu: cgroup: Switch to qemu(Setup|Teardown)ImageCgroup
For other objects we use the two functions rather than one with a bool.
Convert qemuSetImageCgroup to the same approach.
2016-02-17 10:54:05 +01:00
Peter Krempa
4e22355ee1 qemu: cgroup: Avoid reporting errors from inaccessible NFS volumes
Rather than reporting it and then resetting the error, don't report it in
the first place.
2016-02-17 10:54:05 +01:00
Peter Krempa
cf113e8d54 util: cgroup: Allow ignoring EACCES in virCgroup(Allow|Deny)DevicePath
When adding disk images to the ACL we may call those functions on NFS
shares. In that case we might get an EACCES, which isn't really relevant
since NFS would not hold a block device. This patch adds a flag that
suppresses the error report on EACCES to avoid spamming logs.

Currently there's no functional change.
2016-02-17 10:54:05 +01:00
Peter Krempa
9cd5da710e util: cgroup: Drop virCgroup(Allow|Deny)DeviceMajor
Since commit 47e5b5ae virCgroupAllowDevice allows passing -1 as either
the minor or major device number and automatically uses '*' in place
of it. Reuse the new approach throughout the code and drop the duplicated
functions.
2016-02-17 10:54:05 +01:00
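The wildcard convention described in 9cd5da710e can be sketched as follows. This is an illustration in Python, not libvirt's C code: `format_device_rule` is a hypothetical helper, and the `type major:minor access` layout is the cgroup v1 devices controller entry syntax.

```python
def format_device_rule(dev_type: str, major: int, minor: int, perms: str) -> str:
    """Build one devices.allow / devices.deny entry (hypothetical helper).

    A value of -1 for major or minor is rendered as the '*' wildcard,
    mirroring the convention the commit message describes.
    """
    def part(number: int) -> str:
        return "*" if number == -1 else str(number)

    return f"{dev_type} {part(major)}:{part(minor)} {perms}"


# One major with every minor, which previously needed a separate *DeviceMajor call.
print(format_device_rule("c", 136, -1, "rw"))
```

With a single function accepting -1 for either number, a separate "major only" variant is no longer needed.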
Peter Krempa
21212fca13 qemu: cgroup: Remove abandoned function qemuAddToCgroup
This function doesn't do anything useful since 2049ef9942.
2016-02-17 10:28:34 +01:00
Peter Krempa
1dcc4c7ffd qemu: iothread: Aggregate code to set IOThread tuning
Rather than iterating 3 times for various settings this function
aggregates all the code into a single place. One of the other advantages
is that it can then be reused for properly setting IOThread info on
hotplug.
2016-02-08 17:05:00 +01:00
Peter Krempa
56971667ee qemu: vcpu: Aggregate code to set vCPU tuning
Rather than iterating 3 times for various settings this function
aggregates all the code into a single place. One of the other advantages
is that it can then be reused for properly setting vCPU info on hotplug.

With this approach autoCpuset is also used when setting the process
affinity rather than just via cgroups.
2016-02-08 17:05:00 +01:00
Peter Krempa
d2a6fc79e3 conf: Store cpu pinning data in def->vcpus
Now with the new struct the data can be stored in a much saner place.
2016-02-08 09:51:34 +01:00
Martin Kletzander
c3bd0019c0 systemd: Modernize machine naming
So, systemd-machined has this philosophy that machine names are like
hostnames and hence should follow the same rules.  But we always allowed
international characters in domain names.  Thus we need to modify the
machine name we are passing to systemd.

In order to change some machine names that we will be passing to systemd,
we also need to call TerminateMachine at the end of a lifetime of a
domain.  Even for domains that were started with older libvirt.  That
can be achieved thanks to virSystemdGetMachineNameByPID().  And because
we can change machine names, we can get rid of the inconsistent and
pointless escaping of domain names when creating machine names.

So this patch modifies the naming in the following way.  It creates the
name as <drivername>-<id>-<name> where invalid hostname characters are
stripped out of the name and, if the resulting name is longer than 64
characters, it is truncated to 64 characters.  That way we can start domains we
couldn't start before.  Well, at least on systemd.

To make it work all together, the machineName (which is needed only with
systemd) is saved in domain's private data.  That way the generation is
moved to the driver and we don't need to pass various unnecessary
arguments to cgroup functions.

The only thing this complicates a bit is the scope generation when
validating a cgroup where we must check both old and new naming, so a
slight modification was needed there.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1282846

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2016-02-05 16:11:50 +01:00
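The naming scheme from c3bd0019c0 can be sketched roughly as below. This is a Python illustration under stated assumptions, not the libvirt implementation: `make_machine_name` is a hypothetical name, and the set of "valid hostname characters" (ASCII letters, digits, '-' and '.') is an assumption that may differ from what libvirt actually accepts.

```python
import re

MAX_NAME_LEN = 64  # limit stated in the commit message


def make_machine_name(driver: str, dom_id: int, name: str) -> str:
    """Build <drivername>-<id>-<name> with invalid characters stripped."""
    # Assumption: only ASCII letters, digits, '-' and '.' are kept.
    cleaned = re.sub(r"[^A-Za-z0-9.-]", "", name)
    # Truncate the whole result if it is longer than the limit.
    return f"{driver}-{dom_id}-{cleaned}"[:MAX_NAME_LEN]


print(make_machine_name("qemu", 7, "tëst_vm"))
```

Including the domain id keeps names distinct even when two domain names collapse to the same string after stripping.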
John Ferlan
d6d7e2885b cgroup: Fix possible bug as a result of code motion for vcpu cgroup setup
Commit id '90b721e43' moved the virCgroupAddTask call to after the
vcpupin checks. However, in doing so it missed a case: if the cpumap
didn't exist, the code would continue back to the top of the current
vcpu loop. The result was that virCgroupAddTask wouldn't be called.

Signed-off-by: John Ferlan <jferlan@redhat.com>
2016-01-14 11:02:53 -05:00
John Ferlan
d41bd09596 Revert "util: cgroups do not implicitly add task to new machine cgroup"
This reverts commit 71ce475967.

Since commit id 'a41c00b47' has been reverted, this no longer is
necessary
2016-01-14 11:00:25 -05:00
John Ferlan
f8f6907284 Revert "qemu: do not put a task into machine cgroup"
This reverts commit a41c00b472.

After much testing and upstream discussion this has been deemed to be
the incorrect operation since it means we no longer have any guarantee
about which resource controllers the QEMU processes in general are in.
2016-01-14 10:56:53 -05:00
Henning Schild
90b721e43e qemu cgroups: move new threads to new cgroup after cpuset is set up
Moving tasks to cgroups implied sched_setaffinity. Changing the cpus in
a set implies the same for all tasks in the group.
The old code put the thread into the cpuset inherited from the
machine cgroup, which allowed it to run outside of vcpupin for a short
while.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
2015-12-14 15:58:05 -05:00
Henning Schild
a41c00b472 qemu: do not put a task into machine cgroup
The machine cgroup is a superset, a parent to the emulator and vcpuX
cgroups. The parent cgroup should never have any tasks directly in it.
In fact the parent cpuset might contain way more cpus than the sum of
emulatorpin and vcpupins. So putting tasks in the superset will allow
them to run outside of <cputune>.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
2015-12-14 15:48:05 -05:00
Henning Schild
71ce475967 util: cgroups do not implicitly add task to new machine cgroup
virCgroupNewMachine used to add the pidleader to the newly created
machine cgroup. Do not do this implicitly anymore.

Signed-off-by: Henning Schild <henning.schild@siemens.com>
2015-12-14 15:43:29 -05:00
Peter Krempa
8715120e4d qemu: cgroup: Don't use priv->ncpupids to iterate domain vCPUs
Use the proper data structures for the iteration since ncpupids will be
made private later.
2015-12-09 14:57:12 +01:00
Peter Krempa
e6b36736a8 qemu: Add helper to retrieve vCPU pid
Instead of directly accessing the array add a helper to do this.
2015-12-09 14:57:12 +01:00
Peter Krempa
220a2d51de qemu: Replace checking for vcpu<->pid mapping availability with a helper
Add qemuDomainHasVCpuPids to do the checking and replace in place checks
with it.

We no longer need checking whether the thread contains fake data
(vcpupids[0] == vm->pid) as in b07f3d821d
and 65686e5a81 this was removed.
2015-12-09 14:57:12 +01:00
Peter Krempa
6ba02c21ac qemu: cgroup: Remove now unreachable check
Since commit 0c04906fa the check for priv->cgroup doesn't make sense as
the calls to virCgroupHasController return the same information. Remove
it and move its comment partially to the new check.

The already spurious check was also later copied to the iothreads code.
2015-12-09 14:57:12 +01:00
Ján Tomko
1c00dcd665 qemu: add passed-through input devs to cgroup ACL
https://bugzilla.redhat.com/show_bug.cgi?id=1231114
2015-11-30 12:59:10 +01:00
Ján Tomko
eebe58adeb qemuSetupChrSourceCgroup: rename dev to source
We do not have a pointer to the device here, just its source.
2015-11-23 13:52:18 +01:00
Ján Tomko
b8286f0666 Simplify qemuSetupChrSourceCgroup and its callers
The domain definition is not needed in any of these functions.
Only pass it to qemuSetupChardevCgroup, which is used as a callback
for virDomainChrDefForeach.

Use the right type for passing virDomainObjPtr instead of
void* where possible.
2015-11-23 13:52:18 +01:00
Ján Tomko
b57ce788a7 rename qemuSetupHostdevCGroup to qemuSetupHostdevCgroup
Change CGroup to Cgroup to match other functions in the file.
2015-11-23 13:52:18 +01:00
John Ferlan
10604cb8c5 qemu: Check for niothreads == 0 in qemuSetupCgroupForIOThreads
If there are no IOThreads defined, no sense making other checks
2015-10-16 06:49:19 -04:00
Jiri Denemark
cda2afac79 qemuDomainEventQueue: Check if event is non-NULL
Every single call to qemuDomainEventQueue() uses the following pattern:

    if (event)
        qemuDomainEventQueue(driver, event);

Let's move the check for valid event to qemuDomainEventQueue and
simplify all callers.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
2015-09-18 13:50:03 +02:00
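The refactoring in cda2afac79 moves a guard from every caller into the callee. A minimal Python sketch of the same idea, with hypothetical `Driver` and `queue_event` names standing in for the C code:

```python
from collections import deque
from typing import Optional


class Driver:
    """Hypothetical stand-in for the driver object that owns the event queue."""

    def __init__(self) -> None:
        self.events: deque = deque()


def queue_event(driver: Driver, event: Optional[str]) -> None:
    # The check lives here, so callers can drop their own "if event:" guard.
    if event is None:
        return
    driver.events.append(event)


driver = Driver()
queue_event(driver, None)        # silently ignored
queue_event(driver, "lifecycle")  # queued
print(list(driver.events))
```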
Martin Kletzander
7b5acf9461 qemu: Sync BlkioDevice values when setting them in cgroups
The problem here is that there are some values that the kernel accepts but
does not set, for example 18446744073709551615, which acts the same
way as zero.  Let's do the same thing we do with other tuning options
and re-read them right after they are set in order to keep our internal
structures up-to-date.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1165580

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-08-18 16:27:43 -07:00
Luyao Huang
1439eb32af qemu: fix some api cannot work when disable cpuset in conf
If cpuset is disabled or not available, libvirt must not use it.
Mainly for actions that do not need it and can use sched_setaffinity()
or numa_membind() instead, because they will fail without good reason.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1244664

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-08-03 13:08:00 +02:00
Peter Krempa
88f6c007c3 cgroup: Drop resource partition from virSystemdMakeScopeName
The scope name, even according to our docs, is
"machine-$DRIVER\x2d$VMNAME.scope". virSystemdMakeScopeName would use the
resource partition name instead of "machine-" if it was specified, thus
creating invalid scope paths.

This makes libvirt drop cgroups for a VM that uses custom resource
partition upon reconnecting since the detected scope name would not
match the expected name generated by virSystemdMakeScopeName.

The error is exposed by the following log entry:

debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope'

for a "/machine/test" resource and "testvm" vm.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1238570
2015-07-22 07:12:56 +02:00
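The scope name format quoted in 88f6c007c3 can be reproduced with a small Python sketch. `make_scope_name` is a hypothetical helper, and it only escapes '-' as `\x2d`; real systemd unit-name escaping covers more characters.

```python
def make_scope_name(driver: str, vm_name: str) -> str:
    """Return machine-$DRIVER\\x2d$VMNAME.scope, independent of any resource partition."""
    # Simplification (assumption): only '-' is escaped.
    unit = f"{driver}-{vm_name}".replace("-", "\\x2d")
    return f"machine-{unit}.scope"


print(make_scope_name("qemu", "testvm"))
```

Note that the resource partition is not an input at all, which is the point of the fix: the prefix is always "machine-".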
Peter Krempa
0b416434f8 qemu: 'privileged' flag is not really configuration
The privileged flag will not change while the configuration might
change. Make the 'privileged' flag a member of the driver again and mark
it immutable. Should that ever change add an accessor that will group
reads of the state.
2015-06-18 15:13:45 +02:00
Peter Krempa
ee3da892f2 conf: Refactor emulatorpin handling
Store the emulator pinning cpu mask as a pure virBitmap rather than the
virDomainPinDef since it stores only the bitmap and refactor
qemuDomainPinEmulator to do the same operations in a much saner way.

As a side effect virDomainEmulatorPinAdd and virDomainEmulatorPinDel can
be removed since they don't add any value.
2015-06-03 09:42:07 +02:00
Michal Privoznik
bcd9a564b6 virDomainNumatuneGetMode: Report if numatune was defined
So far, we are not reporting if numatune was even defined. The
value of zero is blindly returned (which maps onto
VIR_DOMAIN_NUMATUNE_MEM_STRICT). Unfortunately, we are making
decisions based on this value. Instead, we should not only return
the correct value, but report to the caller if the value is valid
at all.

For better viewing of this patch use '-w'.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-05-20 14:02:25 +02:00
John Ferlan
b266486fb9 Move iothreadspin information into iothreadids
Remove the iothreadspin array from cputune and replace with a cpumask
to be stored in the iothreadids list.

Adjust the test output because our printing goes in order of the iothreadids
list now.
2015-04-27 12:36:35 -04:00
John Ferlan
8d4614a512 qemu: Use domain iothreadids to IOThread's 'thread_id'
Add 'thread_id' to the virDomainIOThreadIDDef as a means to store the
'thread_id' as returned from the live qemu monitor data.

Remove the iothreadpids list from _qemuDomainObjPrivate and replace with
the new iothreadids 'thread_id' element.

Rather than use the default numbering scheme of 1..number of iothreads
defined for the domain, use the iothreadid's list for the iothread_id

Since iothreadids list keeps track of the iothread_id's, these are
now used in place of the many places where a for loop would "know"
that the ID was "+ 1" from the array element.

The new tests ensure usage of the <iothreadid> values for an exact number
of iothreads and the usage of a smaller number of <iothreadid> values than
iothreads that exist (and usage of the default numbering scheme).
2015-04-27 12:36:35 -04:00
Peter Krempa
5a35b2e599 qemu: cgroup: Fix priorities when setting emulatorpin
Use the custom emulator pin setting with the highest priority same as
with vcpupin.
2015-04-24 09:59:38 +02:00
John Ferlan
0456eda317 cgroup: Use virCgroupNewThread
Replace the virCgroupNew{Vcpu|Emulator|IOThread} calls with the common
virCgroupNewThread API

Signed-off-by: John Ferlan <jferlan@redhat.com>
2015-04-09 19:27:08 -04:00
Luyao Huang
7cd0cf05f7 fix memleak in qemuRestoreCgroupState
131,088 bytes in 16 blocks are definitely lost in loss record 2,174 of 2,176
    at 0x4C29BFD: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
    by 0x4C2BACB: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
    by 0x52A026F: virReallocN (viralloc.c:245)
    by 0x52BFCB5: saferead_lim (virfile.c:1268)
    by 0x52C00EF: virFileReadLimFD (virfile.c:1328)
    by 0x52C019A: virFileReadAll (virfile.c:1351)
    by 0x52A5D4F: virCgroupGetValueStr (vircgroup.c:763)
    by 0x1DDA0DA3: qemuRestoreCgroupState (qemu_cgroup.c:805)
    by 0x1DDA0DA3: qemuConnectCgroup (qemu_cgroup.c:857)
    by 0x1DDB7BA1: qemuProcessReconnect (qemu_process.c:3694)
    by 0x52FD171: virThreadHelper (virthread.c:206)
    by 0x82B8DF4: start_thread (pthread_create.c:308)
    by 0x85C31AC: clone (clone.S:113)

Signed-off-by: Luyao Huang <lhuang@redhat.com>
2015-04-08 11:56:30 +02:00
Michal Privoznik
225aa80246 virQEMUDriverGetConfig: Fix memleak
==19015== 968 (416 direct, 552 indirect) bytes in 1 blocks are definitely lost in loss record 999 of 1,049
==19015==    at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015==    by 0x52ADF14: virAllocVar (viralloc.c:560)
==19015==    by 0x5302FD1: virObjectNew (virobject.c:193)
==19015==    by 0x1DD9401E: virQEMUDriverConfigNew (qemu_conf.c:164)
==19015==    by 0x1DDDF65D: qemuStateInitialize (qemu_driver.c:666)
==19015==    by 0x53E0823: virStateInitialize (libvirt.c:777)
==19015==    by 0x11E067: daemonRunStateInit (libvirtd.c:905)
==19015==    by 0x53201AD: virThreadHelper (virthread.c:206)
==19015==    by 0xA1EE1F2: start_thread (in /lib64/libpthread-2.19.so)
==19015==    by 0xA4EFC8C: clone (in /lib64/libc-2.19.so)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-04-07 18:52:27 +02:00
Michal Privoznik
9dbe6f3151 qemuSetupCgroupForVcpu: Fix memleak
==19015== 1,064 (656 direct, 408 indirect) bytes in 2 blocks are definitely lost in loss record 1,002 of 1,049
==19015==    at 0x4C2C070: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==19015==    by 0x52AD74B: virAlloc (viralloc.c:144)
==19015==    by 0x52B47CA: virCgroupNew (vircgroup.c:1057)
==19015==    by 0x52B53E5: virCgroupNewVcpu (vircgroup.c:1451)
==19015==    by 0x1DD85A40: qemuSetupCgroupForVcpu (qemu_cgroup.c:1013)
==19015==    by 0x1DDA66EA: qemuProcessStart (qemu_process.c:4844)
==19015==    by 0x1DDF1807: qemuDomainObjStart (qemu_driver.c:7265)
==19015==    by 0x1DDF1A66: qemuDomainCreateWithFlags (qemu_driver.c:7320)
==19015==    by 0x1DDF1ACD: qemuDomainCreate (qemu_driver.c:7337)
==19015==    by 0x53F87EA: virDomainCreate (libvirt-domain.c:6820)
==19015==    by 0x12690A: remoteDispatchDomainCreate (remote_dispatch.h:3481)
==19015==    by 0x126827: remoteDispatchDomainCreateHelper (remote_dispatch.h:3457)

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2015-04-07 18:52:26 +02:00
Peter Krempa
6afb0d04fe qemu: cgroup: Kill qemuSetupCgroupVcpuPin()
The function doesn't make sense. There's a simpler way to achieve the
same.
2015-04-02 10:12:08 +02:00
Peter Krempa
8a81264b18 qemu: cgroup: Kill qemuSetupCgroupIOThreadsPin()
The function doesn't make sense. There's a simpler way to achieve the
same.
2015-04-02 10:12:08 +02:00
Peter Krempa
55072593d8 qemu: cgroup: Rename qemuSetupCgroupEmulatorPin to qemuSetupCgroupCpusetCpus
The function is used to set cpuset.cpus in various other helpers.
2015-04-02 10:12:08 +02:00
Peter Krempa
98f08aba8e qemu: cgroup: Use priv->autoCpuset instead of using qemuPrepareCpumap()
Two places would call to qemuPrepareCpumap() with priv->autoNodeset to
convert it to a cpuset. Remove the function and use the prepared cpuset
automatically.
2015-04-02 10:12:08 +02:00
Peter Krempa
f0fa9080d4 qemu: cgroup: Properly set up vcpu pinning
When the default cpuset or automatic numa placement is used libvirt
would place the whole parent cgroup in the specified cpuset. This then
made it impossible to re-pin the vcpus to a different cpu.

This patch pins only the vcpu threads to the default cpuset and thus
allows re-pinning them later.

The following config would fail to start:
<domain type='kvm'>
  ...
  <vcpu placement='static' cpuset='0-1' current='2'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='2-3'/>
    ...

This is a regression since a39f69d2b.
2015-04-02 10:12:08 +02:00
Peter Krempa
7095006921 qemu: cgroup: Refactor setup for IOThread cgroups
Use the default or auto cpuset if they are provided for IOThreads.
2015-04-02 10:12:08 +02:00
Peter Krempa
c9f9fa25d3 qemu: cgroup: Store auto cpuset instead of re-creating it on demand
The automatic cpuset can be stored along with automatic nodeset and it
does not have to be recreated when used.
2015-04-02 10:12:08 +02:00
Martin Kletzander
3a0e5b0c20 qemu: Migrate memory on numatune change
We've never set the cpuset.memory_migrate value to anything, keeping it
at the default.  However, we allow changing cpuset.mems on a live domain.
That setting, however, doesn't have any effect on a domain unless
it's going to allocate new memory.

I managed to make 'virsh numatune' move all the memory to any node I
wanted even without disabling libnuma's numa_set_membind(), so this
should be safe to use with it as well.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1198497

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-03-20 13:40:02 +01:00
John Ferlan
a9f528ab29 Convert virDomainPinDefPtr->vcpuid to virDomainPinDefPtr->id
Since we're not specifically a vcpu related structure anymore...
2015-03-16 11:54:57 -04:00
John Ferlan
59ba70237a Convert virDomainVcpuPinDefPtr to virDomainPinDefPtr
As pointed out by jtomko in his review of the IOThreads pinning code:

http://www.redhat.com/archives/libvir-list/2015-March/msg00495.html

there are some comments sprinkled in indicating IOThreads were using
the same structure as the VcpuPin code...

This is the first patch of a few that will change the virDomainVcpuPin*
structures and code to just virDomainPin* - starting with the data
structure naming...
2015-03-16 11:54:56 -04:00
Pavel Hrdina
cf521fc8ba memtune: change the way how we store unlimited value
There was a mess in how we stored the unlimited value for memory
limits and how we handled values provided by the user.  Internally there
were two possible ways to store the unlimited value: as 0 or as
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED.  Because we chose to store memory
limits as unsigned long long, we cannot use -1 to represent unlimited.
It's much easier for us to say that everything greater than
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED means unlimited and leave 0 as a valid
value even though it makes no sense to set a limit to 0.

Remove the unnecessary function virCompareLimitUlong.  The test update
is to prevent 0 from being misused as unlimited in the future.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1146539

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-03-06 11:52:24 +01:00
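The convention adopted in cf521fc8ba can be sketched as below. This is a Python illustration with hypothetical helper names; the numeric value of the constant is an assumption taken from libvirt's public header and should be checked there.

```python
# Assumption: value of VIR_DOMAIN_MEMORY_PARAM_UNLIMITED in libvirt's public header.
VIR_DOMAIN_MEMORY_PARAM_UNLIMITED = 9007199254740991


def limit_is_set(value: int) -> bool:
    """A limit counts as set only when it is below the unlimited marker; 0 is valid."""
    return value < VIR_DOMAIN_MEMORY_PARAM_UNLIMITED


def normalize_limit(value: int) -> int:
    """Clamp anything at or above the marker to the single 'unlimited' value."""
    return min(value, VIR_DOMAIN_MEMORY_PARAM_UNLIMITED)


print(limit_is_set(0), limit_is_set(VIR_DOMAIN_MEMORY_PARAM_UNLIMITED + 5))
```

Using one threshold instead of two sentinel values means 0 never needs special handling when comparing limits.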
Peter Krempa
6bc80fa86d conf: numa: Rename virDomainNumatune to virDomainNuma
The structure will gradually become the only place for NUMA related
config, thus rename it appropriately.
2015-02-20 17:43:04 +01:00
Pavel Hrdina
77a9dc0b8d qemu_cgroup: initialize mem_mask to NULL
If 'virNumaGetHostNodeset()' fails then the error path will try to free
the uninitialized pointer mem_mask. Introduced by commit af2a1f058.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2015-02-17 14:22:50 +01:00
Daniel P. Berrange
f7afeddce9 qemu: report TAP device indexes to systemd
Record the index of each TAP device created and report them to
systemd, so they show up in machinectl status for the VM.
2015-01-27 13:57:02 +00:00
Daniel P. Berrange
318df5a05f Add support for systemd-machined CreateMachineWithNetwork
systemd-machined introduced a new method CreateMachineWithNetwork
that obsoletes CreateMachine. It expects to be given a list of
VETH/TAP device indexes for the host side device(s) associated
with a container/machine.

This falls back to the old CreateMachine method when the new
one is not supported.
2015-01-15 11:07:07 +00:00
Martin Kletzander
86759ec61a qemu: Add missing goto error in qemuRestoreCgroupState
Commit af2a1f05 tried clearly separating each condition in
qemuRestoreCgroupState() for the sake of readability, however somehow
one condition body was missing.  That means that the body of the next
condition got executed only if both of them were true, which is
impossible, thus resulting in dead code and a logic error.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 20:44:33 +01:00
Martin Kletzander
af2a1f0587 qemu: Leave cpuset.mems in parent cgroup alone
Instead of setting the value of cpuset.mems once when the domain starts
and then re-calculating the value every time we need to change the child
cgroup values, leave the cgroup alone and rather set the child data
every time a new cgroup is created.  We don't leave any task in the
parent group anyway.  This will ease both current and future code.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 11:15:27 +01:00
Martin Kletzander
c74d58ad47 qemu: Save numad advice into qemuDomainObjPrivate
Thanks to that we don't need to drag the pointer everywhere and future
code will get cleaner.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 11:15:27 +01:00
Martin Kletzander
f801a81208 qemu: Remove unnecessary qemuSetupCgroupPostInit function
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-12-16 11:15:27 +01:00
Martin Kletzander
5cca4cd16f Remove unnecessary curly brackets in src/qemu/
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-11-14 17:13:01 +01:00
Wang Rui
c6e9024867 qemu: fix domain startup failing with 'strict' mode in numatune
If the memory mode is specified as 'strict' and with one node, we
get the following error when starting domain.

error: Unable to write to '$cgroup_path/cpuset.mems': Device or resource busy

XML is configured with numatune as follows:
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>

It's broken by Commit 411cea638f
which moved qemuSetupCgroupForEmulator() before setting cpuset.mems
in qemuSetupCgroupPostInit.

Directory '$cgroup_path/emulator/' is created in qemuSetupCgroupForEmulator.
But '$cgroup_path/emulator/cpuset.mems' is not set and has a default value
(all nodes, such as 0-1). Then we set '$cgroup_path/cpuset.mems' to the
nodemask (in this case it's '0') in qemuSetupCgroupPostInit. It must fail.

This patch sets '$cgroup_path/emulator/cpuset.mems' before
'$cgroup_path/cpuset.mems'. The action is similar to that in
qemuDomainSetNumaParamsLive.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-11-11 12:14:09 +01:00
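The ordering problem described in c6e9024867 can be reproduced with a toy model. This Python sketch is not kernel or libvirt code: the `Cpuset` class is hypothetical and models only one rule, namely that a parent cannot drop memory nodes a child still has.

```python
import errno


class Cpuset:
    """Toy model of a cpuset cgroup holding only a cpuset.mems value."""

    def __init__(self, mems, parent=None):
        self.mems = set(mems)
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def set_mems(self, mems):
        mems = set(mems)
        # Shrinking a parent below what a child uses fails with EBUSY.
        if any(not child.mems <= mems for child in self.children):
            raise OSError(errno.EBUSY, "Device or resource busy")
        self.mems = mems


machine = Cpuset({0, 1})
emulator = Cpuset({0, 1}, parent=machine)  # child starts with the default: all nodes

try:
    machine.set_mems({0})  # broken order: parent first
except OSError as exc:
    print("parent first fails:", exc.strerror)

emulator.set_mems({0})  # fixed order: child first ...
machine.set_mems({0})   # ... then the parent succeeds
print(sorted(machine.mems))
```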
Wang Rui
38a0f6df64 qemu: don't setup cpuset.mems if memory mode in numatune is not 'strict'
If the memory mode in numatune is specified as 'preferred' with one node
(such as nodeset='0'), the domain's memory is not guaranteed to all be in
node 0. Assuming that node 0 doesn't have enough memory, memory can be
allocated on node 1 when the qemu process starts up. If we then set
cpuset.mems to '0', it may trigger OOM.

Commit 1a7be8c600 changed the former logic of
checking memory mode in virDomainNumatuneGetNodeset. This patch adds the
check as before.

Signed-off-by: Wang Rui <moon.wangrui@huawei.com>
2014-11-11 12:14:09 +01:00
Martin Kletzander
9661ac2f46 qemu: unref cfg after TerminateMachine has been called
Commit 4882618ed1 added the code that
requests driver cfg, but forgot to unref it.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-10-21 13:54:09 +02:00
Guido Günther
4882618ed1 qemu: use systemd's TerminateMachine to kill all processes
If we don't properly clean up all processes in the
machine-<vmname>.scope systemd won't remove the cgroup and subsequent vm
starts fail with

  'CreateMachine: File exists'

Additional processes can e.g. be added via

  echo $PID > /sys/fs/cgroup/systemd/machine.slice/machine-${VMNAME}.scope/tasks

but there are other cases like

  http://bugs.debian.org/761521

Invoke TerminateMachine to be on the safe side since systemd tracks the
cgroup anyway. This is a noop if all processes have terminated already.
2014-10-01 20:17:46 +02:00
Ján Tomko
e26bbf49cc Fix crash cpu_shares change event crash on domain startup
Introduced by commit 0dce260.

qemuDomainEventQueue was called with qemuDomainObjPrivatePtr instead
of virQEMUDriverPtr.

https://bugzilla.redhat.com/show_bug.cgi?id=1147494
2014-09-29 13:58:43 +02:00
Daniel P. Berrange
0778c0be8d Rename tunable event constants
For the new VIR_DOMAIN_EVENT_ID_TUNABLE event we have a bunch of
constants added

   VIR_DOMAIN_EVENT_CPUTUNE_<blah>
   VIR_DOMAIN_EVENT_BLKDEVIOTUNE_<blah>

This naming convention is bad for two reasons

  - There is no common prefix unique for the events to both
    relate them, and distinguish them from other event
    constants

  - The values associated with the constants were chosen
    to match the names used with virConnectGetAllDomainStats
    so having EVENT in the constant name is not applicable in
    that respect

This patch proposes renaming the constants to

    VIR_DOMAIN_TUNABLE_CPU_<blah>
    VIR_DOMAIN_TUNABLE_BLKDEV_<blah>

i.e., give them a common VIR_DOMAIN_TUNABLE prefix.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2014-09-26 10:58:15 +01:00
Pavel Hrdina
0dce260cc8 cputune_event: queue the event for cputune updates
Now we have a universal tunable event so we can use it for reporting
changes to the user. The cputune values will be prefixed with "cputune" to
distinguish them from other tunable events.

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
2014-09-23 21:58:09 +02:00
Ján Tomko
c1480871bb Fixes for domains with no iothreads
Plug a memory leak and silence a warning.
2014-09-18 14:49:01 +02:00
John Ferlan
500c91c57d qemu_cgroup: Adjust spacing around incrementor
Change "i+1" to "i + 1"
2014-09-15 21:05:46 -04:00
John Ferlan
5f6ad32c73 qemu_cgroup: Introduce cgroup functions for IOThreads
In order to support cpuset setting, introduce qemuSetupCgroupIOThreadsPin
and qemuSetupCgroupForIOThreads to mimic the existing Vcpu API's.

These will support having an 'iothreadpin' element in the 'cpuset' in
order to pin named IOThreads to specific CPUs. The IOThread pin names
will follow the IOThread naming scheme starting at 1 (e.g. "iothread1")
up through and including the def->iothreads value.
2014-09-15 13:18:56 -04:00
Peter Krempa
1c6999d340 conf: RNG: Always fill in default random source path for default backend
Libvirt documents that the default entropy source for the 'random'
backend of a RNG device is /dev/random. Instead of storing and
propagating NULL across our code and checking it in multiple places fill
the default in the post parse callback and use that in the other places.
2014-07-28 10:07:09 +02:00
Peter Krempa
bbddbefa2f virtio-rng: allow multiple RNG devices
qemu supports adding multiple RNG devices. This patch allows libvirt to
support this.
2014-07-25 09:34:53 +02:00
Peter Krempa
99ff49eed1 qemu: cgroup: Don't use NULL path on default backed RNGs
The "random" backend for virtio-rng can be started with no path
specified, which is equivalent to /dev/random. The cgroup code didn't consider
this and called a few of the functions with NULL, resulting in:

 $ virsh start rng-vm
 error: Failed to start domain rng-vm
 error: Path '(null)' is not accessible: Bad address

Problem introduced by commit c6320d3463
2014-07-25 09:34:53 +02:00
John Ferlan
17bddc46f4 hostdev: Introduce virDomainHostdevSubsysSCSIiSCSI
Create the structures and API's to hold and manage the iSCSI host device.
This extends the 'scsi_host' definitions added in commit id '5c811dce'.
A future patch will add the XML parsing, but that code requires some
infrastructure to be in place first in order to handle the differences
between a 'scsi_host' and an 'iSCSI host' device.
2014-07-24 07:04:44 -04:00
John Ferlan
42957661dc hostdev: Introduce virDomainHostdevSubsysSCSIHost
Split virDomainHostdevSubsysSCSI further. In preparation for having
either SCSI or iSCSI data, create a union in virDomainHostdevSubsysSCSI
to contain just a virDomainHostdevSubsysSCSIHost to describe the
'scsi_host' host device
2014-07-24 06:39:28 -04:00
John Ferlan
5805621cd9 hostdev: Introduce virDomainHostdevSubsysSCSI
Create a separate typedef for the hostdev union data describing SCSI.
Then adjust the code to use the new pointer
2014-07-24 06:39:27 -04:00
John Ferlan
1c8da0d44e hostdev: Introduce virDomainHostdevSubsysPCI
Create a separate typedef for the hostdev union data describing PCI.
Then adjust the code to use the new pointer
2014-07-24 06:39:27 -04:00
John Ferlan
7540d07f09 hostdev: Introduce virDomainHostdevSubsysUSB
Create a separate typedef for the hostdev union data describing USB.
Then adjust the code to use the new pointer
2014-07-24 06:39:27 -04:00
Martin Kletzander
7e72ac7878 qemu: leave restricting cpuset.mems after initialization
When a domain is started with numatune memory mode strict and the
nodeset does not include the host NUMA node with DMA and DMA32 zones, KVM
initialization fails.  This is because the cgroup restricts even kernel
allocations.  We are already doing numa_set_membind() which does the
same thing, only it does not restrict kernel allocations.

This patch leaves the userspace numa_set_membind() in place and moves
the cpuset.mems setting after the point where the monitor comes up, but
before vcpu and emulator sub-groups are created.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
aa668fccf0 qemu: split out cpuset.mems setting
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:46 +02:00
Martin Kletzander
1a7be8c600 numatune: add support for per-node memory bindings in private APIs
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
93e82727ec numatune: Encapsulate numatune configuration in order to unify results
There were numerous places where numatune configuration (and thus
domain config as well) was changed in different ways.  In some
places this even resulted in the persistent domain definition not being
stable (it would change with the daemon's restart).

In order to uniformly change how numatune config is dealt with, all
the internals are now accessible directly only in numatune_conf.c and
outside this file accessors must be used.

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
e764ec7ae3 numatune: unify numatune struct and enum names
Since there was already public virDomainNumatune*, I changed the
private virNumaTune to match the same, so all the uses are unified and
public API is kept:

s/vir\(Domain\)\?Numa[tT]une/virDomainNumatune/g

then shrunk long lines, and mainly functions, that were created after
that:

sed -i 's/virDomainNumatuneMemPlacementMode/virDomainNumatunePlacement/g'

And to cope with the enum name, I had to change the constants as
well:

s/VIR_NUMA_TUNE_MEM_PLACEMENT_MODE/VIR_DOMAIN_NUMATUNE_PLACEMENT/g

Last thing I did was at least a little shortening of already long
name:

s/virDomainNumatuneDef/virDomainNumatune/g

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-16 20:15:45 +02:00
Martin Kletzander
0c04906fa8 qemu: don't error out when cgroups don't exist
When creating cgroups for vcpu and emulator threads whilst starting a
domain, we explicitly skip creating those cgroups in case priv->cgroup
is NULL (cgroups not supported) because SetAffinity() serves the same
purpose.  If the host supports only some cgroups (the ones we need are
either unmounted or disabled in qemu.conf), we error out with a weird
message even though we could continue starting the domain.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1097028

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2014-07-09 15:09:54 +02:00
Peter Krempa
1ba14d6df2 qemu: cgroup: Setup only the top level disk image for read-write access
Only the top level gets writes, so the rest of the backing chain
requires only read-only access.
2014-07-09 10:38:55 +02:00
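
The access rule from the commit above (only the top-level image is written to, so the rest of the backing chain needs read-only access) can be sketched as a walk over the chain. The struct image type, the setup_chain_access function, and the disk_readonly parameter are hypothetical illustrations, not libvirt's virStorageSource API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one image in a disk's backing chain. */
struct image {
    const char *path;
    struct image *backing;   /* next image down the chain, or NULL */
    bool allow_write;        /* filled in by setup_chain_access() */
};

/* Walk the chain from the top image down. Only the top image may be
 * granted write access, and only when the disk itself is not read-only
 * (an assumption of this sketch); every backing image is read-only. */
static void
setup_chain_access(struct image *top, bool disk_readonly)
{
    struct image *img;

    for (img = top; img; img = img->backing)
        img->allow_write = (img == top) && !disk_readonly;
}
```

In a real cgroup setup, the allow_write flag would decide whether a device is allowed with read-write or read-only permissions.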
Peter Krempa
aa53c77e1d qemu: cgroup: Add functions to set cgroup image stuff on individual imgs
Add functions that allow setting all the required cgroup stuff on
individual images taking a virStorageSourcePtr. Also convert the functions
designed to set up the whole backing chain to take advantage of the change.
2014-07-09 10:38:55 +02:00
Peter Krempa
63834faadb storage: Move readonly and shared flags to disk source from disk def
In the future we might need to track the state of individual images. Move
the readonly and shared flags to the virStorageSource struct so that we
can keep them on a per-image basis.
2014-07-08 14:27:19 +02:00
Ján Tomko
d4edce5f1e Always report an error if virBitmapFormat fails
It already reports an error if STRDUP fails.
2014-06-06 14:35:19 +02:00
Michal Privoznik
4dae1eddde qemuSetupCgroupForVcpu: s/virProcessInfoSetAffinity/virProcessSetAffinity/
In f56c773bf we made the substitution but forgot to fix one
comment which still refers to the old name. This may be
misleading.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2014-05-22 12:30:20 +02:00
Nehal J Wani
3d5c29a17c Fix typos in src/*
Fix minor typos in source comments

Signed-off-by: Eric Blake <eblake@redhat.com>
2014-04-21 16:49:08 -06:00