Message-ID: <94dabe57-232b-4a21-b2cf-bcfbda75c881@lucifer.local>
Date: Fri, 29 Nov 2024 13:19:14 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH] perf: map pages in advance

On Fri, Nov 29, 2024 at 02:12:23PM +0100, David Hildenbrand wrote:
> On 29.11.24 14:02, Lorenzo Stoakes wrote:
> > On Fri, Nov 29, 2024 at 01:59:01PM +0100, David Hildenbrand wrote:
> > > On 29.11.24 13:55, Lorenzo Stoakes wrote:
> > > > On Fri, Nov 29, 2024 at 01:45:42PM +0100, David Hildenbrand wrote:
> > > > > On 29.11.24 13:26, Peter Zijlstra wrote:
> > > > > > On Fri, Nov 29, 2024 at 01:12:57PM +0100, David Hildenbrand wrote:
> > > > > >
> > > > > > > Well, I think we simply will want vm_insert_pages_prot() that stops treating
> > > > > > > these things like folios :) . *likely*  we'd want a distinct memdesc/type.
> > > > > > >
> > > > > > > We could start that work right now by making some user (iouring,
> > > > > > > ring_buffer) set a new page->_type, and checking that in
> > > > > > > vm_insert_pages_prot() + vm_normal_page(). If set, don't touch the refcount
> > > > > > > and the mapcount.
> > > > > > >
> > > > > > > Because then, we can just make all the relevant drivers set the type, refuse
> > > > > > > in vm_insert_pages_prot() anything that doesn't have the type set, and
> > > > > > > refuse in vm_normal_page() any pages with this memdesc.
> > > > > > >
> > > > > > > Maybe we'd have to teach CoW to copy from such pages, maybe not. GUP of
> > > > > > > these things will stop working, I hope that is not a problem.
> > > > > >
> > > > > > Well... perf-tool likes to call write() upon these pages in order to
> > > > > > write out the data from the mmap() into a file.
> > > >
> > > > I'm confused about what you mean, write() using the fd should work fine, how
> > > > would they interact with the mmap? I may be making a silly mistake here
> > >
> > > write() to file from the mmap()'ed address range to *some* file.
> > >
> >
> > Yeah sorry my brain melted down briefly, for some reason was thinking of read()
> > writing into the buffer...
> >
> > > This will GUP the pages you inserted.
> > >
> > > GUP does not work on PFNMAP.
> >
> > Well it _does_ if struct page **pages is set to NULL :)
>
> Hm? :)
>
> check_vma_flags() unconditionally refuses VM_PFNMAP.

Ha, funny with my name all over git blame there... OK, yup, I missed this. The
vm_normal_page() == NULL stuff must be for mixed map (and those other weird
cases I think you can get)...
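
To make the point above concrete, here is a toy userspace model of the kind of
check being described. It is only a sketch: the SKETCH_* flag bits and the
function name are made up for illustration and are not the kernel's
definitions (the real flags live in include/linux/mm.h and the real check is
check_vma_flags() in mm/gup.c, which does considerably more than this).

```c
#include <errno.h>

/* Hypothetical flag bits for illustration only, NOT the kernel's values. */
#define SKETCH_VM_IO       0x1UL
#define SKETCH_VM_PFNMAP   0x2UL
#define SKETCH_VM_MIXEDMAP 0x4UL

/*
 * Toy model of an unconditional refusal: any VMA marked as I/O or PFN-mapped
 * is rejected before any per-page logic gets a chance to run.
 */
int sketch_check_vma_flags(unsigned long vm_flags)
{
	if (vm_flags & (SKETCH_VM_IO | SKETCH_VM_PFNMAP))
		return -EFAULT;

	/* Mixed-map VMAs are not rejected by this particular test. */
	return 0;
}
```

In this model a VM_PFNMAP VMA never reaches the point where a NULL
vm_normal_page() result would matter, which is consistent with that handling
being there for the mixed map case instead.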

Well good. Where is write() invoking GUP? I'm kind of surprised it's not just
using uaccess?

One thing to note is I did run all the perf tests with no issues whatsoever. You
would _think_ this would have come up...

I'm editing some test code to explicitly write() from the buffer anyway to see.
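
For reference, the userspace pattern under discussion looks roughly like the
sketch below: the source buffer passed to write() is itself an mmap()'ed
range. This is a minimal illustration only. It uses an anonymous mapping as a
stand-in for the perf ring buffer, so it does not exercise a VM_PFNMAP
mapping, and the function name write_from_mapping() is made up.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map len bytes, fill them, then write() them out to path straight from the
 * mapping. Returns the number of bytes written, or -1 on any failure.
 */
ssize_t write_from_mapping(const char *path, size_t len)
{
	/* Stand-in for the perf mmap; an assumption made for this sketch. */
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0xab, len);

	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0) {
		munmap(buf, len);
		return -1;
	}

	/* The kernel reads the data directly out of the mapped range here. */
	ssize_t total = 0;
	while ((size_t)total < len) {
		ssize_t n = write(fd, (char *)buf + total, len - (size_t)total);
		if (n <= 0) {
			total = -1;
			break;
		}
		total += n;
	}

	close(fd);
	munmap(buf, len);
	return total;
}
```

Swapping the anonymous mapping for the real perf buffer mapping is what would
show whether write() still succeeds once the pages are inserted differently.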

If we can't do pfnmap, and we definitely can't do mixedmap (because it's
basically entirely equivalent in every way to just faulting in the pages as
before and requires the same hacks) then I will have to go back to the drawing
board or somehow change the faulting code.

This really sucks.

I'm not quite sure I even understand why we don't allow GUP used _just for
pinning_ on VM_PFNMAP when it is, in effect, already pinned on the assumption
that whatever mapped it will maintain the lifetime.

What a mess...

>
> --
> Cheers,
>
> David / dhildenb
>
