Message-ID: <21b1e325-a17d-c859-973d-de66c1401f19@intel.com>
Date:   Fri, 5 Feb 2021 10:58:33 -0800
From:   "Yu, Yu-cheng" <yu-cheng.yu@...el.com>
To:     Kees Cook <keescook@...omium.org>
Cc:     x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-mm@...ck.org,
        linux-arch@...r.kernel.org, linux-api@...r.kernel.org,
        Arnd Bergmann <arnd@...db.de>,
        Andy Lutomirski <luto@...nel.org>,
        Balbir Singh <bsingharora@...il.com>,
        Borislav Petkov <bp@...en8.de>,
        Cyrill Gorcunov <gorcunov@...il.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Eugene Syromiatnikov <esyr@...hat.com>,
        Florian Weimer <fweimer@...hat.com>,
        "H.J. Lu" <hjl.tools@...il.com>, Jann Horn <jannh@...gle.com>,
        Jonathan Corbet <corbet@....net>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Nadav Amit <nadav.amit@...il.com>,
        Oleg Nesterov <oleg@...hat.com>, Pavel Machek <pavel@....cz>,
        Peter Zijlstra <peterz@...radead.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        "Ravi V. Shankar" <ravi.v.shankar@...el.com>,
        Vedvyas Shanbhogue <vedvyas.shanbhogue@...el.com>,
        Dave Martin <Dave.Martin@....com>,
        Weijiang Yang <weijiang.yang@...el.com>,
        Pengfei Xu <pengfei.xu@...el.com>
Subject: Re: [PATCH v19 08/25] x86/mm: Introduce _PAGE_COW

On 2/4/2021 12:19 PM, Kees Cook wrote:
> On Wed, Feb 03, 2021 at 02:55:30PM -0800, Yu-cheng Yu wrote:
>> There is essentially no room left in the x86 hardware PTEs on some OSes
>> (not Linux).  That left the hardware architects looking for a way to
>> represent a new memory type (shadow stack) within the existing bits.
>> They chose to repurpose a lightly-used state: Write=0, Dirty=1.
>>
>> The reason it's lightly used is that Dirty=1 is normally set by hardware
>> and cannot normally be set by hardware on a Write=0 PTE.  Software must
>> normally be involved to create one of these PTEs, so software can simply
>> opt to not create them.
>>
>> In places where Linux normally creates Write=0, Dirty=1, it can use the
>> software-defined _PAGE_COW in place of the hardware _PAGE_DIRTY.  In other
>> words, whenever Linux needs to create Write=0, Dirty=1, it instead creates
>> Write=0, Cow=1, except for shadow stack, which is Write=0, Dirty=1.  This
>> clearly separates shadow stack from other data, and results in the
>> following:
>>
>> (a) A modified, copy-on-write (COW) page: (Write=0, Cow=1)
>> (b) A R/O page that has been COW'ed: (Write=0, Cow=1)
>>      The user page is in a R/O VMA, and get_user_pages() needs a writable
>>      copy.  The page fault handler creates a copy of the page and sets
>>      the new copy's PTE as Write=0 and Cow=1.
>> (c) A shadow stack PTE: (Write=0, Dirty=1)
>> (d) A shared shadow stack PTE: (Write=0, Cow=1)
>>      When a shadow stack page is being shared among processes (this happens
>>      at fork()), its PTE is made Dirty=0, so the next shadow stack access
>>      causes a fault, and the page is duplicated and Dirty=1 is set again.
>>      This is the COW equivalent for shadow stack pages, even though it's
>>      copy-on-access rather than copy-on-write.
>> (e) A page where the processor observed a Write=1 PTE, started a write, set
>>      Dirty=1, but then observed a Write=0 PTE.  That's possible today, but
>>      will not happen on processors that support shadow stack.
> 
> What happens for "e" with/without CET? It sounds like direct writes to
> such pages will be (correctly) rejected by the MMU?
> 
>>
>> Define _PAGE_COW and update pte_*() helpers and apply the same changes to
>> pmd and pud.
>>
>> After this, there are six free bits left in the 64-bit PTE, and no more
>> free bits in the 32-bit PTE (except for PAE) and Shadow Stack is not
>> implemented for the 32-bit kernel.
> 
> Are there selftests to validate this change?
> 

I have some tests that verify, for example:

- After clone(), shadow stack pages are indeed copy-on-write,
- Shadow stack pages (i.e. Write=0, Dirty=1) cannot be directly written to,
- Shadow stack guard pages exist.

These tests are now on GitHub, but they are somewhat messy.  I can
gradually clean them up and submit them separately as selftests.

If you are asking for the detection of the potential hardware issue 
(that Dave Hansen talked about), then maybe we need to detect it from 
the kernel.

> I think it might be useful to more clearly describe what is considered
> "dirty" and "writeable" in comments above the pte_helpers.
> 

Yes, I will update the comments.  Thanks!
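
To make the intended meaning of "dirty" and "writable" concrete, here is a
minimal user-space sketch of the encoding described in the quoted commit
message.  It is a model, not the patch's code: the model_* names are
hypothetical, and the position chosen for the Cow bit (58) is only a
placeholder for some software-available bit.  Write (bit 1) and Dirty
(bit 6) are the architectural x86 positions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t model_pte_t;

#define MODEL_PAGE_RW    (UINT64_C(1) << 1)   /* hardware Write bit */
#define MODEL_PAGE_DIRTY (UINT64_C(1) << 6)   /* hardware Dirty bit */
#define MODEL_PAGE_COW   (UINT64_C(1) << 58)  /* assumed software bit */

/* "Writable" means the hardware Write bit is set. */
static inline bool model_pte_write(model_pte_t pte)
{
	return (pte & MODEL_PAGE_RW) != 0;
}

/* "Dirty" means modified, recorded in either the Dirty or the Cow bit. */
static inline bool model_pte_dirty(model_pte_t pte)
{
	return (pte & (MODEL_PAGE_DIRTY | MODEL_PAGE_COW)) != 0;
}

/* Shadow stack is the repurposed state Write=0, Dirty=1. */
static inline bool model_pte_shstk(model_pte_t pte)
{
	return (pte & (MODEL_PAGE_RW | MODEL_PAGE_DIRTY)) == MODEL_PAGE_DIRTY;
}

/* Write-protect: never leave Write=0, Dirty=1; move Dirty into Cow. */
static inline model_pte_t model_pte_wrprotect(model_pte_t pte)
{
	pte &= ~MODEL_PAGE_RW;
	if (pte & MODEL_PAGE_DIRTY) {
		pte &= ~MODEL_PAGE_DIRTY;
		pte |= MODEL_PAGE_COW;
	}
	return pte;
}

/* Make writable: move Cow back into the hardware Dirty bit. */
static inline model_pte_t model_pte_mkwrite(model_pte_t pte)
{
	pte |= MODEL_PAGE_RW;
	if (pte & MODEL_PAGE_COW) {
		pte &= ~MODEL_PAGE_COW;
		pte |= MODEL_PAGE_DIRTY;
	}
	return pte;
}
```

In this model, write-protecting a dirty, writable PTE yields Write=0, Cow=1
(cases (a) and (b) above), which still reads as dirty but is not mistaken
for shadow stack.  Write-protecting a shadow stack PTE likewise yields
Write=0, Cow=1, matching case (d).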

[...]
