Date: Mon, 11 Mar 2024 16:41:28 -0700
From: "Andy Lutomirski" <luto@...nel.org>
To: "Dave Hansen" <dave.hansen@...el.com>,
 "Pasha Tatashin" <pasha.tatashin@...een.com>,
 "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>,
 linux-mm@...ck.org, "Andrew Morton" <akpm@...ux-foundation.org>,
 "the arch/x86 maintainers" <x86@...nel.org>,
 "Borislav Petkov" <bp@...en8.de>,
 "Christian Brauner" <brauner@...nel.org>, bristot@...hat.com,
 "Ben Segall" <bsegall@...gle.com>,
 "Dave Hansen" <dave.hansen@...ux.intel.com>, dianders@...omium.org,
 dietmar.eggemann@....com, eric.devolder@...cle.com, hca@...ux.ibm.com,
 "hch@...radead.org" <hch@...radead.org>,
 "H. Peter Anvin" <hpa@...or.com>,
 "Jacob Pan" <jacob.jun.pan@...ux.intel.com>,
 "Jason Gunthorpe" <jgg@...pe.ca>, jpoimboe@...nel.org,
 "Joerg Roedel" <jroedel@...e.de>, juri.lelli@...hat.com,
 "Kent Overstreet" <kent.overstreet@...ux.dev>, kinseyho@...gle.com,
 "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
 lstoakes@...il.com, mgorman@...e.de, mic@...ikod.net,
 michael.christie@...cle.com, "Ingo Molnar" <mingo@...hat.com>,
 mjguzik@...il.com, "Michael S. Tsirkin" <mst@...hat.com>,
 "Nicholas Piggin" <npiggin@...il.com>,
 "Peter Zijlstra (Intel)" <peterz@...radead.org>,
 "Petr Mladek" <pmladek@...e.com>,
 "Rick P Edgecombe" <rick.p.edgecombe@...el.com>,
 "Steven Rostedt" <rostedt@...dmis.org>,
 "Suren Baghdasaryan" <surenb@...gle.com>,
 "Thomas Gleixner" <tglx@...utronix.de>,
 "Uladzislau Rezki" <urezki@...il.com>, vincent.guittot@...aro.org,
 vschneid@...hat.com
Subject: Re: [RFC 11/14] x86: add support for Dynamic Kernel Stacks

On Mon, Mar 11, 2024, at 4:34 PM, Dave Hansen wrote:
> On 3/11/24 15:17, Andy Lutomirski wrote:
>> I *think* that all x86 implementations won't fill the TLB for a
>> non-accessed page without also setting the accessed bit,
>
> That's my understanding as well.  The SDM is a little more obtuse about it:
>
>> Whenever the processor uses a paging-structure entry as part of
>> linear-address translation, it sets the accessed flag in that entry
>> (if it is not already set).
>
> but it's there.
>
> But if we start needing Accessed=1 to be accurate, clearing those PTEs
> gets more expensive because it needs to be atomic to lock out the page
> walker.  It basically needs to start getting treated similarly to what
> is done for Dirty=1 on userspace PTEs.  Not the end of the world, of
> course, but one more source of overhead.

In my fantasy land where I understand the x86 paging machinery, suppose we're in finish_task_switch(), and suppose prev is Not Horribly Buggy (TM).  In particular, suppose that no other CPU is concurrently (non-speculatively!) accessing prev's stack.  Prev can't be running, because whatever magic lock prevents it from being migrated hasn't been released yet.  (I have no idea what lock this is, but it had darned well better exist so prev isn't migrated before switch_to() even returns.)

So the current CPU is not accessing the memory, and no other CPU is accessing the memory, and BPF doesn't exist, so no one is being utterly daft and doing a kernel read probe, and perf isn't up to any funny business, etc.  And a CPU will never *speculatively* set the accessed bit (I told you it's fantasy land), so we just do it unlocked:

if (!pte->accessed) {
  *pte = 0;
  reuse the memory;
}

What could possibly go wrong?

I admit this is not the best idea I've ever had, and I will not waste anyone's time by trying very hard to defend it :)
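
To make the difference between the unlocked check-then-clear above and the atomic clearing Dave mentions concrete, here is a rough user-space sketch. It is not kernel code: sim_pte_t, the SIM_PTE_* masks and the sim_* helpers are all hypothetical names made up for illustration, and a plain C11 atomic stands in for a real PTE. Only the bit positions (Present = bit 0, Accessed = bit 5) follow the x86 PTE layout. Using compare-and-swap for the atomic variant is an assumption about one possible way to close the window, not a statement of what the kernel does.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated PTE; only the bit positions mirror x86. */
#define SIM_PTE_PRESENT  ((uint64_t)1 << 0)
#define SIM_PTE_ACCESSED ((uint64_t)1 << 5)

typedef _Atomic(uint64_t) sim_pte_t;

/*
 * Unlocked variant, as in the pseudocode: a separate load and store.
 * If something sets Accessed between the two, that update is lost and
 * the memory is reused anyway.
 */
static bool sim_reclaim_unlocked(sim_pte_t *pte)
{
    uint64_t old = atomic_load_explicit(pte, memory_order_relaxed);

    if (old & SIM_PTE_ACCESSED)
        return false;                  /* stack page was used, keep it */

    /* Race window: a page walker could set Accessed right here. */
    atomic_store_explicit(pte, 0, memory_order_relaxed);
    return true;                       /* caller may reuse the memory */
}

/*
 * Atomic variant (assumed approach): clear the entry only if it is
 * still exactly the value that was inspected.
 */
static bool sim_reclaim_atomic(sim_pte_t *pte)
{
    uint64_t old = atomic_load(pte);

    if (old & SIM_PTE_ACCESSED)
        return false;

    /* Fails, leaving the entry intact, if Accessed was set meanwhile. */
    return atomic_compare_exchange_strong(pte, &old, 0);
}

/* Stand-in for the hardware page walker setting the Accessed flag. */
static void sim_walker_touch(sim_pte_t *pte)
{
    atomic_fetch_or(pte, SIM_PTE_ACCESSED);
}
```

If sim_walker_touch() runs inside the window marked in sim_reclaim_unlocked(), the entry is zeroed even though the page was just used. In the atomic variant the compare-and-swap fails in that case and the page is kept, at the cost of a locked instruction on every reclaim attempt.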
