linux-kernel - Re: [RFC v5 00/38] powerpc: Memory Protection Keys


Message-ID: <1499900032.2865.46.camel@kernel.crashing.org>
Date:   Thu, 13 Jul 2017 08:53:52 +1000
From:   Benjamin Herrenschmidt <benh@...nel.crashing.org>
To:     Michal Hocko <mhocko@...nel.org>, Ram Pai <linuxram@...ibm.com>
Cc:     linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org,
        linux-arch@...r.kernel.org, linux-mm@...ck.org, x86@...nel.org,
        linux-doc@...r.kernel.org, linux-kselftest@...r.kernel.org,
        paulus@...ba.org, mpe@...erman.id.au, khandual@...ux.vnet.ibm.com,
        aneesh.kumar@...ux.vnet.ibm.com, bsingharora@...il.com,
        dave.hansen@...el.com, hbabu@...ibm.com, arnd@...db.de,
        akpm@...ux-foundation.org, corbet@....net, mingo@...hat.com
Subject: Re: [RFC v5 00/38] powerpc: Memory Protection Keys

On Wed, 2017-07-12 at 09:23 +0200, Michal Hocko wrote:
> 
> > 
> > Ideally the MMU looks at the PTE for keys, in order to enforce
> > protection. This is the case with x86 and is the case with power9 Radix
> > page table. Hence the keys have to be programmed into the PTE.
> 
> But x86 doesn't update ptes for PKEYs, that would be just too expensive.
> You could use standard mprotect to do the same...

What do you mean? x86 ends up in mprotect_fixup() -> change_protection(),
which updates the PTEs just the same as we do.

Changing the key for a page is a form of mprotect. Changing the access
permissions for a key is different: for us it's a write to a special
register (AMR).
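
To make the distinction concrete, here is a toy userspace model. It is a
sketch only, not kernel code: toy_change_key(), toy_set_key_access() and
the one-deny-bit-per-key toy_amr layout are made up for illustration and
do not reflect the real AMR encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NPAGES 8
#define NKEYS  32

/* Toy PTE: only the key field is modelled. */
struct toy_pte { uint8_t pkey; };

static struct toy_pte page_table[NPAGES];
static uint32_t toy_amr;      /* hypothetical layout: one deny bit per key */
static unsigned pte_writes;   /* how many PTEs have been rewritten so far */

/* Changing the key of a range: a form of mprotect, rewrites each PTE. */
void toy_change_key(unsigned first, unsigned npages, uint8_t pkey)
{
    assert(pkey < NKEYS);
    for (unsigned i = first; i < first + npages && i < NPAGES; i++) {
        page_table[i].pkey = pkey;
        pte_writes++;
    }
}

/* Changing what a key permits: one register write, no PTE is touched. */
void toy_set_key_access(uint8_t pkey, bool deny)
{
    assert(pkey < NKEYS);
    if (deny)
        toy_amr |= (1u << pkey);
    else
        toy_amr &= ~(1u << pkey);
}

/* Access check: look up the page's key, then consult the register. */
bool toy_access_ok(unsigned page)
{
    assert(page < NPAGES);
    return !(toy_amr & (1u << page_table[page].pkey));
}
```

In this model, tagging three pages with a key costs three PTE writes,
while denying access to that key afterwards costs none.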

I don't understand why you think we are doing anything differently from
x86 here.

> > However with HPT on power, these keys do not necessarily have to be
> > programmed into the PTE. We could bypass the Linux Page Table Entry (PTE)
> > and instead just program them into the Hash Page Table Entry (HPTE), since
> > the MMU does not refer to the PTE but to the HPTE. The last version
> > of the page attempted to do that. It worked as follows:
> > 
> > a) When an address range is requested to be associated with a key by the
> >    application through the key_mprotect() system call, the kernel
> >    stores that key in the vmas corresponding to that address
> >    range.
> > 
> > b) Whenever there is a hash page fault for that address, the fault
> >    handler reads the key from the VMA and programs the key into the
> >    HPTE. __hash_page() is the function that does that.
> 
> What causes the fault here?

The hardware. With the hash MMU, the HW walks a hash table which is
effectively a large in-memory TLB extension. When a page isn't found
there, a "hash fault" is generated, allowing Linux to populate that
hash table with the content of the corresponding PTE.
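
A toy model of that flow, as a sketch only: the table sizes, the modulo
hash and every name below are invented for illustration, and real HPTEs
are grouped and far more involved than a single slot per hash value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NPAGES 16   /* pages covered by the toy Linux page table */
#define TOY_NSLOTS 4    /* slots in the toy hash table */

struct toy_pte  { bool present; uint32_t pfn; };
struct toy_hpte { bool valid; uint32_t vpn; uint32_t pfn; };

static struct toy_pte  linux_pt[TOY_NPAGES];
static struct toy_hpte hash_table[TOY_NSLOTS];
static unsigned hash_faults;   /* number of hash faults taken so far */

unsigned toy_hash(uint32_t vpn) { return vpn % TOY_NSLOTS; }

/* What the hardware does: it only ever looks in the hash table. */
bool hw_lookup(uint32_t vpn, uint32_t *pfn)
{
    struct toy_hpte *h = &hash_table[toy_hash(vpn)];
    if (h->valid && h->vpn == vpn) {
        *pfn = h->pfn;
        return true;
    }
    return false;
}

/* What Linux does on a hash fault: copy the PTE into the hash table. */
bool handle_hash_fault(uint32_t vpn)
{
    hash_faults++;
    if (vpn >= TOY_NPAGES || !linux_pt[vpn].present)
        return false;   /* no PTE either: a regular page fault in real life */
    hash_table[toy_hash(vpn)] =
        (struct toy_hpte){ true, vpn, linux_pt[vpn].pfn };
    return true;
}

/* One translation: try the hash table, fault once on a miss, retry. */
bool toy_translate(uint32_t vpn, uint32_t *pfn)
{
    if (hw_lookup(vpn, pfn))
        return true;
    if (!handle_hash_fault(vpn))
        return false;
    return hw_lookup(vpn, pfn);
}
```

The first access to a mapped page takes one hash fault; later accesses
to the same page hit the hash table directly until the slot is reused.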

> > c) Once the hpte is programmed, the MMU can sense key violations and
> >    generate key-faults.
> > 
> > The problem is with step (b). This step is a very critical,
> > performance-sensitive path. We don't want to add any delays.
> > However, if we want to access the key from the vma, we will have to
> > hold the vma semaphore, and that is a big NO-NO. As a result, this
> > design had to be dropped.
> > 
> > 
> > 
> > I reverted to the old design, i.e. the design in v4. In this
> > version we do the following:
> > 
> > a) When an address range is requested to be associated with a key by the
> >    application through the key_mprotect() system call, the kernel
> >    stores that key in the vmas corresponding to that address
> >    range. The kernel also programs the key into the Linux PTEs
> >    corresponding to all the pages associated with the address range.
> 
> OK, so how is this any different from the regular mprotect then?

It takes the key argument. This is nothing new. This was done for x86
already, we are just re-using the infrastructure. Look at
do_mprotect_pkey() in mm/mprotect.c today. It's all the same code,
pkey_mprotect() is just mprotect with an added key argument.
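
Roughly, the shape is as in the toy model below. This is a simplified
userspace sketch, not the code in mm/mprotect.c: the toy_* names, the
page array and the "no key means keep the current key" rule are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NPAGES 8
#define TOY_NO_KEY (-1)   /* "caller did not pass a key" */

struct toy_page { int prot; int pkey; };
static struct toy_page pages[TOY_NPAGES];

/* Single shared worker: updates protections, and the key if one is given. */
int toy_do_mprotect_pkey(size_t first, size_t n, int prot, int pkey)
{
    if (first > TOY_NPAGES || n > TOY_NPAGES - first)
        return -1;   /* range falls outside the toy address space */
    for (size_t i = first; i < first + n; i++) {
        pages[i].prot = prot;
        if (pkey != TOY_NO_KEY)
            pages[i].pkey = pkey;   /* simplification: else keep old key */
    }
    return 0;
}

/* Plain mprotect: the same worker, called without a key. */
int toy_mprotect(size_t first, size_t n, int prot)
{
    return toy_do_mprotect_pkey(first, n, prot, TOY_NO_KEY);
}

/* pkey_mprotect: the same worker, with the key passed through. */
int toy_pkey_mprotect(size_t first, size_t n, int prot, int pkey)
{
    return toy_do_mprotect_pkey(first, n, prot, pkey);
}
```

Both entry points walk the same range and rewrite the same per-page
state; the key is the only extra input.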

> > b) Whenever there is a hash page fault for that address, the fault
> >    handler reads the key from the Linux PTE and programs the key into 
> >    the HPTE.
> > 
> > c) Once the HPTE is programmed, the MMU can sense key violations and
> >    generate key-faults.
> > 
> > 
> > Since step (b) in this case has easy access to the Linux PTE, and hence
> > to the key, it is fast to access it and program the HPTE. Thus we avoid
> > taking any performance hit on this critical path.
> > 
> > Hope this explains the rationale,
> > 
> > 
> > As promised here is the high level design:
> 
> I will read through that later
> [...]