[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <aadc7b03-9121-6800-341b-6f2849088a09@linux.ibm.com>
Date: Wed, 24 Apr 2019 20:01:20 +0200
From: Laurent Dufour <ldufour@...ux.ibm.com>
To: Michel Lespinasse <walken@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Andi Kleen <ak@...ux.intel.com>, dave@...olabs.net,
Jan Kara <jack@...e.cz>, Matthew Wilcox <willy@...radead.org>,
aneesh.kumar@...ux.ibm.com,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
mpe@...erman.id.au, Paul Mackerras <paulus@...ba.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Will Deacon <will.deacon@....com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
sergey.senozhatsky.work@...il.com,
Andrea Arcangeli <aarcange@...hat.com>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
kemi.wang@...el.com, Daniel Jordan <daniel.m.jordan@...cle.com>,
David Rientjes <rientjes@...gle.com>,
Jerome Glisse <jglisse@...hat.com>,
Ganesh Mahendran <opensource.ganesh@...il.com>,
Minchan Kim <minchan@...nel.org>,
Punit Agrawal <punitagrawal@...il.com>,
vinayak menon <vinayakm.list@...il.com>,
Yang Shi <yang.shi@...ux.alibaba.com>,
zhong jiang <zhongjiang@...wei.com>,
Haiyan Song <haiyanx.song@...el.com>,
Balbir Singh <bsingharora@...il.com>, sj38.park@...il.com,
Mike Rapoport <rppt@...ux.ibm.com>,
LKML <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>, haren@...ux.vnet.ibm.com,
Nick Piggin <npiggin@...il.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
linuxppc-dev@...ts.ozlabs.org, x86@...nel.org
Subject: Re: [PATCH v12 00/31] Speculative page faults
Le 22/04/2019 à 23:29, Michel Lespinasse a écrit :
> Hi Laurent,
>
> Thanks a lot for copying me on this patchset. It took me a few days to
> go through it - I had not been following the previous iterations of
> this series so I had to catch up. I will be sending comments for
> individual commits, but before tat I would like to discuss the series
> as a whole.
Hi Michel,
Thanks for reviewing this series.
> I think these changes are a big step in the right direction. My main
> reservation about them is that they are additive - adding some complexity
> for speculative page faults - and I wonder if it'd be possible, over the
> long term, to replace the existing complexity we have in mmap_sem retry
> mechanisms instead of adding to it. This is not something that should
> block your progress, but I think it would be good, as we introduce spf,
> to evaluate whether we could eventually get all the way to removing the
> mmap_sem retry mechanism, or if we will actually have to keep both.
Until we get rid of the mmap_sem which seems to be a very long story, I
can't see how we could get rid of the retry mechanism.
> The proposed spf mechanism only handles anon vmas. Is there a
> fundamental reason why it couldn't handle mapped files too ?
> My understanding is that the mechanism of verifying the vma after
> taking back the ptl at the end of the fault would work there too ?
> The file has to stay referenced during the fault, but holding the vma's
> refcount could be made to cover that ? the vm_file refcount would have
> to be released in __free_vma() instead of remove_vma; I'm not quite sure
> if that has more implications than I realize ?
The only concern is the flow of operation done in the vm_ops->fault()
processing. Most of the file system relie on the generic filemap_fault()
which should be safe to use. But we need a clever way to identify fault
processing which are compatible with the SPF handler. This could be done
using a tag/flag in the vm_ops structure or in the vma's flags.
This would be the next step.
> The proposed spf mechanism only works at the pte level after the page
> tables have already been created. The non-spf page fault path takes the
> mm->page_table_lock to protect against concurrent page table allocation
> by multiple page faults; I think unmapping/freeing page tables could
> be done under mm->page_table_lock too so that spf could implement
> allocating new page tables by verifying the vma after taking the
> mm->page_table_lock ?
I've to admit that I didn't dig further here.
Do you have a patch? ;)
>
> The proposed spf mechanism depends on ARCH_HAS_PTE_SPECIAL.
> I am not sure what is the issue there - is this due to the vma->vm_start
> and vma->vm_pgoff reads in *__vm_normal_page() ?
Yes that's the reason, no way to guarantee the value of these fields in
the SPF path.
>
> My last potential concern is about performance. The numbers you have
> look great, but I worry about potential regressions in PF performance
> for threaded processes that don't currently encounter contention
> (i.e. there may be just one thread actually doing all the work while
> the others are blocked). I think one good proxy for measuring that
> would be to measure a single threaded workload - kernbench would be
> fine - without the special-case optimization in patch 22 where
> handle_speculative_fault() immediately aborts in the single-threaded case.
I'll have to give it a try.
> Reviewed-by: Michel Lespinasse <walken@...gle.com>
> This is for the series as a whole; I expect to do another review pass on
> individual commits in the series when we have agreement on the toplevel
> stuff (I noticed a few things like out-of-date commit messages but that's
> really minor stuff).
Thanks a lot for reviewing this long series.
>
> I want to add a note about mmap_sem. In the past there has been
> discussions about replacing it with an interval lock, but these never
> went anywhere because, mostly, of the fact that such mechanisms were
> too expensive to use in the page fault path. I think adding the spf
> mechanism would invite us to revisit this issue - interval locks may
> be a great way to avoid blocking between unrelated mmap_sem writers
> (for example, do not delay stack creation for new threads while a
> large mmap or munmap may be going on), and probably also to handle
> mmap_sem readers that can't easily use the spf mechanism (for example,
> gup callers which make use of the returned vmas). But again that is a
> separate topic to explore which doesn't have to get resolved before
> spf goes in.
>
Powered by blists - more mailing lists