lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54cdad9f-b810-7966-5928-9320d970a43d@citrix.com>
Date:   Wed, 5 Oct 2022 02:17:32 +0000
From:   Andrew Cooper <Andrew.Cooper3@...rix.com>
To:     Rick Edgecombe <rick.p.edgecombe@...el.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "H . Peter Anvin" <hpa@...or.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        "linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
        Arnd Bergmann <arnd@...db.de>,
        Andy Lutomirski <luto@...nel.org>,
        Balbir Singh <bsingharora@...il.com>,
        Borislav Petkov <bp@...en8.de>,
        Cyrill Gorcunov <gorcunov@...il.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Eugene Syromiatnikov <esyr@...hat.com>,
        Florian Weimer <fweimer@...hat.com>,
        "H . J . Lu" <hjl.tools@...il.com>, Jann Horn <jannh@...gle.com>,
        Jonathan Corbet <corbet@....net>,
        Kees Cook <keescook@...omium.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Nadav Amit <nadav.amit@...il.com>,
        Oleg Nesterov <oleg@...hat.com>, Pavel Machek <pavel@....cz>,
        Peter Zijlstra <peterz@...radead.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        "Ravi V . Shankar" <ravi.v.shankar@...el.com>,
        Weijiang Yang <weijiang.yang@...el.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        "joao.moreira@...el.com" <joao.moreira@...el.com>,
        John Allen <john.allen@....com>,
        "kcc@...gle.com" <kcc@...gle.com>,
        "eranian@...gle.com" <eranian@...gle.com>,
        "rppt@...nel.org" <rppt@...nel.org>,
        "jamorris@...ux.microsoft.com" <jamorris@...ux.microsoft.com>,
        "dethoma@...rosoft.com" <dethoma@...rosoft.com>,
        Andrew Cooper <Andrew.Cooper3@...rix.com>
CC:     Yu-cheng Yu <yu-cheng.yu@...el.com>
Subject: Re: [PATCH v2 10/39] x86/mm: Introduce _PAGE_COW

On 29/09/2022 23:29, Rick Edgecombe wrote:
> From: Yu-cheng Yu <yu-cheng.yu@...el.com>
>
> There is essentially no room left in the x86 hardware PTEs on some OSes
> (not Linux). That left the hardware architects looking for a way to
> represent a new memory type (shadow stack) within the existing bits.
> They chose to repurpose a lightly-used state: Write=0,Dirty=1.

How does "Some OSes have a greater dependence on software available bits
in PTEs than Linux" sound?

> The reason it's lightly used is that Dirty=1 is normally set _before_ a
> write. A write with a Write=0 PTE would typically only generate a fault,
> not set Dirty=1. Hardware can (rarely) both set Write=1 *and* generate the
> fault, resulting in a Dirty=0,Write=1 PTE. Hardware which supports shadow
> stacks will no longer exhibit this oddity.

Again, an interesting anecdote but not salient information here.

> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 6496ec84b953..ad201dae7316 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -134,9 +142,17 @@ static inline int pte_young(pte_t pte)
>  	return pte_flags(pte) & _PAGE_ACCESSED;
>  }
>  
> -static inline int pmd_dirty(pmd_t pmd)
> +static inline bool pmd_dirty(pmd_t pmd)
>  {
> -	return pmd_flags(pmd) & _PAGE_DIRTY;
> +	return pmd_flags(pmd) & _PAGE_DIRTY_BITS;
> +}
> +
> +static inline bool pmd_shstk(pmd_t pmd)
> +{
> +	if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
> +		return false;
> +
> +	return (pmd_flags(pmd) & (_PAGE_RW | _PAGE_DIRTY)) == _PAGE_DIRTY;

(flags & PSE|RW|D) == PSE|D;

R/O+D can exist higher in the paging structures and does not convey
type=shstk-ness to later stages of the walk.


However, there is a further complication which is bound rear its head
sooner or later, and warrants discussing.

type=shstk isn't actually only R/O+D on the leaf PTE; its also R/W on
the accumulated access rights on non-leaf PTEs.

Specifically, if you clear the RW bit on any higher level in the
pagetable, then everything mapped by that PTE ceases to be of type
shstk, even if the leaf has the R/O+D bit combination.

This is allegedly a feature for the database folks, where they can
create R/O and R/W aliases of the same memory, sharing intermediate
pagetables, where the R/W alias will set D bits per usual and the R/O
alias needs not to transmogrify itself into a shadow stack.

~Andrew

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ