linux-kernel - Re: [PATCH v3 2/2] s390/mm: re-enable the shared zeropage for !PV and !skeys KVM guests

Message-ID: <a6a4b284-e21b-4a04-88d1-7402eb5a08ef@redhat.com>
Date: Tue, 16 Apr 2024 15:41:21 +0200
From: David Hildenbrand <david@...hat.com>
To: Christian Borntraeger <borntraeger@...ux.ibm.com>,
 Alexander Gordeev <agordeev@...ux.ibm.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 Janosch Frank <frankja@...ux.ibm.com>,
 Claudio Imbrenda <imbrenda@...ux.ibm.com>, Heiko Carstens
 <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
 Andrew Morton <akpm@...ux-foundation.org>, Peter Xu <peterx@...hat.com>,
 Sven Schnelle <svens@...ux.ibm.com>,
 Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
 Andrea Arcangeli <aarcange@...hat.com>, kvm@...r.kernel.org,
 linux-s390@...r.kernel.org
Subject: Re: [PATCH v3 2/2] s390/mm: re-enable the shared zeropage for !PV and
 !skeys KVM guests

On 16.04.24 14:02, Christian Borntraeger wrote:
> 
> 
> On 16.04.24 at 08:37, Alexander Gordeev wrote:
> 
>>> We could piggy-back on vm_fault_to_errno(). We could use
>>> vm_fault_to_errno(rc, FOLL_HWPOISON), and only continue (retry) if the rc is 0 or
>>> -EFAULT, otherwise fail with the returned error.
>>>
>>> But I'd do that as a follow up, and also use it in break_ksm() in the same fashion.
>>
>> @Christian, do you agree with this suggestion?
> 
> I would need to look into that more closely to give a proper answer. In general I am ok
> with this but I prefer to have more eyes on that.
> From what I can tell we should cover all the normal cases with our CI as soon as it hits
> next. But maybe we should try to create/change a selftest to trigger these error cases?

If we find a shared zeropage, we expect the next unsharing fault to
succeed, except in these cases:

(1) OOM, in which case we translate to -ENOMEM.

(2) Some obscure race with MADV_DONTNEED paired with concurrent 
truncate(), in which case we get an error, but if we look again, we will 
find the shared zeropage no longer mapped. (this is what break_ksm() 
describes)

(3) MCE while copying the page, which doesn't quite apply here.

For the time being, we only get shared zeropages in (a) anon mappings
and (b) MAP_PRIVATE shmem mappings via UFFDIO_ZEROPAGE. So (2) is hard
or even impossible to trigger. (1) is hard to test as well, and (3) ...

No easy way to extend selftests that I can see.

If we repeatedly find a shared zeropage in a COW mapping and get an 
error from the unsharing fault, something else would be deeply flawed. 
So I'm not really worried about that, but I agree that having a more 
centralized check will make sense.
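
To make the retry policy quoted above concrete (continue if the
translated rc is 0 or -EFAULT, fail otherwise), here is a rough
userspace model. It is only a sketch: the MODEL_* names and bit values
are made up for illustration and are not the kernel's VM_FAULT_* /
FOLL_* definitions, model_fault_to_errno() merely approximates how I
understand vm_fault_to_errno() to map fault bits to errnos, and
model_unshare_result() is a hypothetical helper, not existing code.

```c
#include <errno.h>

/* Model-only fault bits; NOT the kernel's VM_FAULT_* values. */
enum model_fault {
	MODEL_FAULT_OOM      = 1 << 0,
	MODEL_FAULT_SIGBUS   = 1 << 1,
	MODEL_FAULT_SIGSEGV  = 1 << 2,
	MODEL_FAULT_HWPOISON = 1 << 3,
};

/* Model-only stand-ins for FOLL_HWPOISON and EHWPOISON (assumed values). */
#define MODEL_FOLL_HWPOISON 1
#define MODEL_EHWPOISON     133

/* Approximation of the fault-to-errno translation (assumption, see above). */
int model_fault_to_errno(unsigned int fault, int foll_flags)
{
	if (fault & MODEL_FAULT_OOM)
		return -ENOMEM;
	if (fault & MODEL_FAULT_HWPOISON)
		return (foll_flags & MODEL_FOLL_HWPOISON) ? -MODEL_EHWPOISON : -EFAULT;
	if (fault & (MODEL_FAULT_SIGBUS | MODEL_FAULT_SIGSEGV))
		return -EFAULT;
	return 0;
}

/*
 * Hypothetical centralized check: returns 0 when the caller should look
 * at the page table again (success, or the race from case (2) where the
 * zeropage will be gone on the next look), and a negative errno when the
 * caller should give up.
 */
int model_unshare_result(unsigned int fault)
{
	int rc = model_fault_to_errno(fault, MODEL_FOLL_HWPOISON);

	if (rc == 0 || rc == -EFAULT)
		return 0;	/* retry: re-walk and re-check */
	return rc;		/* e.g. -ENOMEM for case (1) */
}
```

With this shape, case (1) surfaces as -ENOMEM, case (2) becomes a plain
retry, and case (3) would be reported as a distinct error instead of
being retried.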

-- 
Cheers,

David / dhildenb