linux-kernel - Re: [PATCH v12 3/3] KVM: s390: resetting the Topology-Change-Report

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87c5514b-4971-b283-912c-573ab1b4d636@linux.ibm.com>
Date:   Tue, 12 Jul 2022 10:47:51 +0200
From:   Janis Schoetterl-Glausch <scgl@...ux.ibm.com>
To:     Pierre Morel <pmorel@...ux.ibm.com>, kvm@...r.kernel.org
Cc:     linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        borntraeger@...ibm.com, frankja@...ux.ibm.com, cohuck@...hat.com,
        david@...hat.com, thuth@...hat.com, imbrenda@...ux.ibm.com,
        hca@...ux.ibm.com, gor@...ux.ibm.com, wintera@...ux.ibm.com,
        seiden@...ux.ibm.com, nrb@...ux.ibm.com
Subject: Re: [PATCH v12 3/3] KVM: s390: resetting the Topology-Change-Report

On 7/12/22 09:24, Pierre Morel wrote:
> 
> 
> On 7/11/22 15:22, Janis Schoetterl-Glausch wrote:
>> On 7/11/22 10:41, Pierre Morel wrote:
>>> During a subsystem reset the Topology-Change-Report is cleared.
>>>
>>> Let's give userland the possibility to clear the MTCR in the case
>>> of a subsystem reset.
>>>
>>> To migrate the MTCR, we give userland the possibility to
>>> query the MTCR state.
>>>
>>> We indicate KVM support for the CPU topology facility with a new
>>> KVM capability: KVM_CAP_S390_CPU_TOPOLOGY.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@...ux.ibm.com>
>>
>> Reviewed-by: Janis Schoetterl-Glausch <scgl@...ux.ibm.com>
>>
> 
> Thanks!
> 
>> See nits/comments below.
>>
>>> ---
>>>   Documentation/virt/kvm/api.rst   | 25 ++++++++++++++
>>>   arch/s390/include/uapi/asm/kvm.h |  1 +
>>>   arch/s390/kvm/kvm-s390.c         | 56 ++++++++++++++++++++++++++++++++
>>>   include/uapi/linux/kvm.h         |  1 +
>>>   4 files changed, 83 insertions(+)
>>>
>>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>>> index 11e00a46c610..5e086125d8ad 100644
>>> --- a/Documentation/virt/kvm/api.rst
>>> +++ b/Documentation/virt/kvm/api.rst
>>> @@ -7956,6 +7956,31 @@ should adjust CPUID leaf 0xA to reflect that the PMU is disabled.
>>>   When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
>>>   type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
>>>   +8.37 KVM_CAP_S390_CPU_TOPOLOGY
>>> +------------------------------
>>> +
>>> +:Capability: KVM_CAP_S390_CPU_TOPOLOGY
>>> +:Architectures: s390
>>> +:Type: vm
>>> +
>>> +This capability indicates that KVM will provide the S390 CPU Topology
>>> +facility which consist of the interpretation of the PTF instruction for
>>> +the function code 2 along with interception and forwarding of both the
>>> +PTF instruction with function codes 0 or 1 and the STSI(15,1,x)
>>
>> Is the architecture allowed to extend STSI without a facility?
>> If so, if we say here that STSI 15.1.x is passed to user space, then
>> I think we should have a
>>
>> if (sel1 != 1)
>>     goto out_no_data;
>>
>> or maybe even
>>
>> if (sel1 != 1 || sel2 < 2 || sel2 > 6)
>>     goto out_no_data;
>>
>> in priv.c
> 
> I am not a big fan of doing everything in the kernel.
> Here we have no performance issue since it is an error of the guest if it sends a wrong selector.
> 
I agree, but I didn't suggest it for performance reasons.
I was thinking about future proofing, that is if the architecture is extended.
We don't know if future extensions are best handled in the kernel or user space,
so if we prevent it from going to user space, we can defer the decision to when we know more.
But that's only relevant if STSI can be extended without a capability, which is why I asked about that.

> Even testing the facility or PV in the kernel is for my opinion arguable in the case we do not do any treatment in the kernel.
> 
> I do not see what it brings to us, it increase the LOCs and makes the implementation less easy to evolve.
> 
> 
>>
>>> +instruction to the userland hypervisor.
>>> +
>>> +The stfle facility 11, CPU Topology facility, should not be indicated
>>> +to the guest without this capability.
>>> +
>>> +When this capability is present, KVM provides a new attribute group
>>> +on vm fd, KVM_S390_VM_CPU_TOPOLOGY.
>>> +This new attribute allows to get, set or clear the Modified Change
>>
>> get or set, now that there is no explicit clear anymore.
> 
> Yes now it is a set to 0 but the action of clearing remains.
> 
>>
>>> +Topology Report (MTCR) bit of the SCA through the kvm_device_attr
>>> +structure.> +
>>> +When getting the Modified Change Topology Report value, the attr->addr
>>
>> When getting/setting the...
>>
>>> +must point to a byte where the value will be stored.
>>
>> ... will be stored/retrieved from.
> 
> OK

Wait no, I didn't get how that works. You're passing the value via attr->attr, not reading it from addr.
> 
> 
>>> +
>>>   9. Known KVM API problems
>>>   =========================
>>>   diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
>>> index 7a6b14874d65..a73cf01a1606 100644
>>> --- a/arch/s390/include/uapi/asm/kvm.h
>>> +++ b/arch/s390/include/uapi/asm/kvm.h
>>> @@ -74,6 +74,7 @@ struct kvm_s390_io_adapter_req {
>>>   #define KVM_S390_VM_CRYPTO        2
>>>   #define KVM_S390_VM_CPU_MODEL        3
>>>   #define KVM_S390_VM_MIGRATION        4
>>> +#define KVM_S390_VM_CPU_TOPOLOGY    5
>>>     /* kvm attributes for mem_ctrl */
>>>   #define KVM_S390_VM_MEM_ENABLE_CMMA    0
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index 70436bfff53a..b18e0b940b26 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -606,6 +606,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>>>       case KVM_CAP_S390_PROTECTED:
>>>           r = is_prot_virt_host();
>>>           break;
>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>> +        r = test_facility(11);
>>> +        break;
>>>       default:
>>>           r = 0;
>>>       }
>>> @@ -817,6 +820,20 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
>>>           icpt_operexc_on_all_vcpus(kvm);
>>>           r = 0;
>>>           break;
>>> +    case KVM_CAP_S390_CPU_TOPOLOGY:
>>> +        r = -EINVAL;
>>> +        mutex_lock(&kvm->lock);
>>> +        if (kvm->created_vcpus) {
>>> +            r = -EBUSY;
>>> +        } else if (test_facility(11)) {
>>> +            set_kvm_facility(kvm->arch.model.fac_mask, 11);
>>> +            set_kvm_facility(kvm->arch.model.fac_list, 11);
>>> +            r = 0;
>>> +        }
>>> +        mutex_unlock(&kvm->lock);
>>> +        VM_EVENT(kvm, 3, "ENABLE: CAP_S390_CPU_TOPOLOGY %s",
>>> +             r ? "(not available)" : "(success)");
>>> +        break;
>>>       default:
>>>           r = -EINVAL;
>>>           break;
>>> @@ -1717,6 +1734,36 @@ static void kvm_s390_update_topology_change_report(struct kvm *kvm, bool val)
>>>       read_unlock(&kvm->arch.sca_lock);
>>>   }
>>>   +static int kvm_s390_set_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>>
>> kvm_s390_set_topology_changed maybe?
>> kvm_s390_get_topology_changed below then.
> 

I won't insist on it, but I do think it's more readable.

> No strong opinion, if you prefer I change this.
> 
>>
>>> +{
>>> +    if (!test_kvm_facility(kvm, 11))
>>> +        return -ENXIO;
>>> +
>>> +    kvm_s390_update_topology_change_report(kvm, !!attr->attr);
>>> +    return 0;
>>> +}
>>> +
>>> +static int kvm_s390_get_topology(struct kvm *kvm, struct kvm_device_attr *attr)
>>> +{
>>> +    union sca_utility utility;
>>> +    struct bsca_block *sca;
>>> +    __u8 topo;
>>> +
>>> +    if (!test_kvm_facility(kvm, 11))
>>> +        return -ENXIO;
>>> +
>>> +    read_lock(&kvm->arch.sca_lock);
>>> +    sca = kvm->arch.sca;
>>> +    utility.val = READ_ONCE(sca->utility.val);
>>
>> I don't think you need the READ_ONCE anymore, now that there is a lock it should act as a compile barrier.
> 
> I think you are right.
> 
>>> +    read_unlock(&kvm->arch.sca_lock);
>>> +    topo = utility.mtcr;
>>> +
>>> +    if (copy_to_user((void __user *)attr->addr, &topo, sizeof(topo)))
>>
>> Why void not u8?
> 
> I like to say we write on "topo" with the size of "topo".
> So we do not need to verify the effective size of topo.
> But I understand, it is a UAPI, setting u8 in the copy_to_user makes sense too.
> For my personal opinion, I would have prefer that userland tell us the size it awaits even here, for this special case, since we use a byte, we can not do really wrong.
You're right, it doesn't make a difference.
What about doing put_user(topo, (u8 *)attr->addr)), seems more straight forward.
> 
>>
>>> +        return -EFAULT;
>>> +
>>> +    return 0;
>>> +}
>>> +
>> [...]
>>
>