[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7910990d-d396-7dd9-c4fe-7029aa03f751@linux.ibm.com>
Date: Thu, 15 Jul 2021 13:31:55 +0200
From: Pierre Morel <pmorel@...ux.ibm.com>
To: Cornelia Huck <cohuck@...hat.com>,
David Hildenbrand <david@...hat.com>, kvm@...r.kernel.org
Cc: linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
borntraeger@...ibm.com, frankja@...ux.ibm.com, thuth@...hat.com,
imbrenda@...ux.ibm.com, hca@...ux.ibm.com, gor@...ux.ibm.com
Subject: Re: [PATCH v1 1/2] s390x: KVM: accept STSI for CPU topology
information
On 7/15/21 11:30 AM, Cornelia Huck wrote:
> On Thu, Jul 15 2021, David Hildenbrand <david@...hat.com> wrote:
>
>> On 14.07.21 17:25, Pierre Morel wrote:
>>> STSI(15.1.x) gives information on the CPU configuration topology.
>>> Let's accept the interception of STSI with the function code 15 and
>>> let the userland part of the hypervisor handle it.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@...ux.ibm.com>
>>> ---
>>> arch/s390/kvm/priv.c | 11 ++++++++++-
>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>>> index 9928f785c677..4ab5f8b7780e 100644
>>> --- a/arch/s390/kvm/priv.c
>>> +++ b/arch/s390/kvm/priv.c
>>> @@ -856,7 +856,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>> if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>> return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>>
>>> - if (fc > 3) {
>>> + if (fc > 3 && fc != 15) {
>>> kvm_s390_set_psw_cc(vcpu, 3);
>>> return 0;
>>> }
>>> @@ -893,6 +893,15 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>>> goto out_no_data;
>>> handle_stsi_3_2_2(vcpu, (void *) mem);
>>> break;
>>> + case 15:
>>> + if (sel1 != 1 || sel2 < 2 || sel2 > 6)
>>> + goto out_no_data;
>>> + if (vcpu->kvm->arch.user_stsi) {
>>> + insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
>>> + return -EREMOTE;
>
> This bypasses the trace event further down.
>
Right, I can add a trace.
Note that we had no trace in the past.
>>> + }
>>> + kvm_s390_set_psw_cc(vcpu, 3);
>>> + return 0;
>>> }
>>> if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>>> memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>>>
>>
>> 1. Setting GPRS to 0
>>
>> I was wondering why we have the "vcpu->run->s.regs.gprs[0] = 0;"
>> for existing fc 1,2,3 in case we set cc=0.
>>
>> Looking at the doc, all I find is:
>>
>> "CC 0: Requested configuration-level number placed in
>> general register 0 or requested SYSIB informa-
>> tion stored"
>>
>> But I don't find where it states that we are supposed to set
>> general register 0 to 0. Wouldn't we also have to do it for
>> fc=15 or for none?
>>
>> If fc 1,2,3 and 15 are to be handled equally, I suggest the following:
>>
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index 9928f785c677..6eb86fa58b0b 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -893,17 +893,23 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
>> goto out_no_data;
>> handle_stsi_3_2_2(vcpu, (void *) mem);
>> break;
>> + case 15:
>> + if (sel1 != 1 || sel2 < 2 || sel2 > 6)
>> + goto out_no_data;
>> + break;
>> }
>> - if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>> - memcpy((void *)sida_origin(vcpu->arch.sie_block), (void *)mem,
>> - PAGE_SIZE);
>> - rc = 0;
>> - } else {
>> - rc = write_guest(vcpu, operand2, ar, (void *)mem, PAGE_SIZE);
>> - }
>> - if (rc) {
>> - rc = kvm_s390_inject_prog_cond(vcpu, rc);
>> - goto out;
>> + if (mem) {
>> + if (kvm_s390_pv_cpu_is_protected(vcpu)) {
>> + memcpy((void *)sida_origin(vcpu->arch.sie_block),
>> + (void *)mem, PAGE_SIZE);
>> + } else {
>> + rc = write_guest(vcpu, operand2, ar, (void *)mem,
>> + PAGE_SIZE);
>> + if (rc) {
>> + rc = kvm_s390_inject_prog_cond(vcpu, rc);
>> + goto out;
>> + }
>> + }
>> }
>> if (vcpu->kvm->arch.user_stsi) {
>> insert_stsi_usr_data(vcpu, operand2, ar, fc, sel1, sel2);
>
> Something like that sounds good, the code is getting a bit convoluted.
>
OK for me, in that case we can also suppress the check for FC=15 and let
that to user space as it was suggested in a previous comment
>>
>>
>> 2. maximum-MNest facility
>>
>> "
>> 1. If the maximum-MNest facility is installed and
>> selector 2 exceeds the nonzero model-depen-
>> dent maximum-selector-2 value."
>>
>> 2. If the maximum-MNest facility is not installed and
>> selector 2 is not specified as two.
>> "
>>
>> We will we be handling the presence/absence of the maximum-MNest facility
>> (for our guest?) in QEMU, corect?
>>
>> I do wonder if we should just let any fc=15 go to user space let the whole
>> sel1 / sel2 checking be handled there. I don't think it's a fast path after all.
>> But no strong opinion.
>
> If that makes handling easier, I think it would be a good idea.
OK too
>
>>
>> How do we identify availability of maximum-MNest facility?
>>
>>
>> 3. User space awareness
>>
>> How can user space identify that we actually forward these intercepts?
>> How can it enable them? The old KVM_CAP_S390_USER_STSI capability
>> is not sufficient.
>
> Why do you think that it is not sufficient? USER_STSI basically says
> "you may get an exit that tells you about a buffer to fill in some more
> data for a stsi command, and we also tell you which call". If userspace
> does not know what to add for a certain call, it is free to just do
> nothing, and if it does not get some calls it would support, that should
> not be a problem, either?
>
>>
>> I do wonder if we want KVM_CAP_S390_USER_STSI_15 or sth like that to change
>> the behavior once enabled by user space.
>>
>>
>> 4. Without vcpu->kvm->arch.user_stsi, we indicate cc=0 to our guest,
>> also for fc 1,2,3. Is that actually what we want? (or do we simply not care
>> because the guest is not supposed to use stsi?)
>
> If returning an empty buffer is ok, it should not be a problem, I
> guess. (I have not looked yet at the actual definitions.)
>
When user_stsi is 0 for fc 1,2,3 the buffer is filled in the kernel, for
15 the kernel can not do this.
--
Pierre Morel
IBM Lab Boeblingen
Powered by blists - more mailing lists