Message-ID: <243a53c0-a988-4013-6d04-a3dfdce8e3f0@linux.ibm.com>
Date: Thu, 19 May 2022 11:23:13 +0200
From: Pierre Morel <pmorel@...ux.ibm.com>
To: Christian Borntraeger <borntraeger@...ibm.com>, kvm@...r.kernel.org
Cc: linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
frankja@...ux.ibm.com, cohuck@...hat.com, david@...hat.com,
thuth@...hat.com, imbrenda@...ux.ibm.com, hca@...ux.ibm.com,
gor@...ux.ibm.com, wintera@...ux.ibm.com, seiden@...ux.ibm.com,
nrb@...ux.ibm.com, Viktor Mihajlovski <mihajlov@...ux.ibm.com>
Subject: Re: [PATCH v9 2/3] s390x: KVM: guest support for topology function

On 5/19/22 11:01, Christian Borntraeger wrote:
>
>
> Am 06.05.22 um 11:24 schrieb Pierre Morel:
>> We let the userland hypervisor know if the machine supports the CPU
>> topology facility using a new KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>
>> The PTF instruction will report a topology change if there is any change
>> with a previous STSI_15_1_2 SYSIB.
>> Changes inside a STSI_15_1_2 SYSIB occur if CPU bits are set or cleared
>> inside the CPU Topology List Entry CPU mask field, which happens with
>> changes in CPU polarization, dedication, CPU types and adding or
>> removing CPUs in a socket.
>>
>> The reporting to the guest is done using the Multiprocessor
>> Topology-Change-Report (MTCR) bit of the utility entry of the guest's
>> SCA which will be cleared during the interpretation of PTF.
>>
>> To check if the topology has been modified we use a new field of the
>> arch vCPU to save the previous real CPU ID at the end of a schedule
>> and verify on next schedule that the CPU used is in the same socket.
>> We do not report polarization, CPU type or dedication changes.
>
> I think we should not do this. When PTF returns with "has changed" the
> guest Linux will rebuild its schedule domains. And this is a really expensive
> operation as far as I can tell. And the host Linux scheduler WILL schedule
> too often to other CPUs. So in essence this will result in Linux guests
> rebuilding their scheduler domains all the time.
> So remove the "previous CPU logic" for now and only trigger an MTCR when
> userspace says so (e.g. on config changes). The idea was to have
> user-defined schedule domains. Following host schedule decisions will be
> nearly impossible.
I guess you saw that the MTCR bit is set only if the previous and new
CPUs are on different sockets, as on the hardware, and not every time
the vCPU is scheduled onto another CPU.
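To make the distinction concrete, here is a minimal userspace sketch of
the "previous CPU" check as described in the commit message. It is not
the patch code: the struct and function names, and the fixed
CPUS_PER_SOCKET mapping, are assumptions made only for illustration.

```c
#include <stdbool.h>

/* Assumption: a fixed, flat CPU-to-socket mapping, for illustration only. */
#define CPUS_PER_SOCKET 8

/* Hypothetical stand-in for the new field of the arch vCPU. */
struct vcpu_model {
	int prev_cpu;	/* real CPU ID saved at the end of the last schedule, -1 if none */
};

/* Hypothetical stand-in for the utility entry of the guest's SCA. */
struct sca_model {
	bool mtcr;	/* Multiprocessor Topology-Change-Report bit */
};

static int socket_of(int cpu)
{
	return cpu / CPUS_PER_SOCKET;
}

/* On schedule-in: set MTCR only if the vCPU moved to a different socket. */
static void vcpu_load_model(struct vcpu_model *v, struct sca_model *sca, int cpu)
{
	if (v->prev_cpu >= 0 && socket_of(v->prev_cpu) != socket_of(cpu))
		sca->mtcr = true;
}

/* On schedule-out: remember which real CPU the vCPU was running on. */
static void vcpu_put_model(struct vcpu_model *v, int cpu)
{
	v->prev_cpu = cpu;
}

/* Model of PTF interpretation: report the change and clear the bit. */
static bool ptf_check_model(struct sca_model *sca)
{
	bool changed = sca->mtcr;

	sca->mtcr = false;
	return changed;
}
```

With this model, moving a vCPU from CPU 2 to CPU 5 (same socket) reports
nothing, while moving it from CPU 5 to CPU 9 (another socket) makes the
next PTF check report a change once.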
However, this can easily be added in a later enhancement, if ever, since
it has no implications for the UAPI.
I will change this for the next round.
Thanks,
Pierre
--
Pierre Morel
IBM Lab Boeblingen