linux-kernel - Re: [PATCH v2 1/1] mm: do not increment pgfault stats when page fault handler retries

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZD6/805XpvfZde0Y@x1n>
Date:   Tue, 18 Apr 2023 12:06:11 -0400
From:   Peter Xu <peterx@...hat.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Suren Baghdasaryan <surenb@...gle.com>, akpm@...ux-foundation.org,
        hannes@...xchg.org, mhocko@...e.com, josef@...icpanda.com,
        jack@...e.cz, ldufour@...ux.ibm.com, laurent.dufour@...ibm.com,
        michel@...pinasse.org, liam.howlett@...cle.com, jglisse@...gle.com,
        vbabka@...e.cz, minchan@...gle.com, dave@...olabs.net,
        punit.agrawal@...edance.com, lstoakes@...il.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        kernel-team@...roid.com
Subject: Re: [PATCH v2 1/1] mm: do not increment pgfault stats when page
 fault handler retries

On Tue, Apr 18, 2023 at 04:08:58PM +0100, Matthew Wilcox wrote:
> On Tue, Apr 18, 2023 at 07:54:01AM -0700, Suren Baghdasaryan wrote:
> > On Tue, Apr 18, 2023 at 7:25 AM Matthew Wilcox <willy@...radead.org> wrote:
> > >
> > > On Mon, Apr 17, 2023 at 04:17:45PM -0700, Suren Baghdasaryan wrote:
> > > > On Mon, Apr 17, 2023 at 3:52 PM Matthew Wilcox <willy@...radead.org> wrote:
> > > > >
> > > > > On Mon, Apr 17, 2023 at 03:40:33PM -0400, Peter Xu wrote:
> > > > > > >     /*
> > > > > > > -    * We don't do accounting for some specific faults:
> > > > > > > -    *
> > > > > > > -    * - Unsuccessful faults (e.g. when the address wasn't valid).  That
> > > > > > > -    *   includes arch_vma_access_permitted() failing before reaching here.
> > > > > > > -    *   So this is not a "this many hardware page faults" counter.  We
> > > > > > > -    *   should use the hw profiling for that.
> > > > > > > -    *
> > > > > > > -    * - Incomplete faults (VM_FAULT_RETRY).  They will only be counted
> > > > > > > -    *   once they're completed.
> > > > > > > +    * Do not account for incomplete faults (VM_FAULT_RETRY). They will be
> > > > > > > +    * counted upon completion.
> > > > > > >      */
> > > > > > > -   if (ret & (VM_FAULT_ERROR | VM_FAULT_RETRY))
> > > > > > > +   if (ret & VM_FAULT_RETRY)
> > > > > > > +           return;
> > > > > > > +
> > > > > > > +   /* Register both successful and failed faults in PGFAULT counters. */
> > > > > > > +   count_vm_event(PGFAULT);
> > > > > > > +   count_memcg_event_mm(mm, PGFAULT);
> > > > > >
> > > > > > Is there reason on why vm events accountings need to be explicitly
> > > > > > different from perf events right below on handling ERROR?
> > > > >
> > > > > I think so.  ERROR is quite different from RETRY.  If we are, for
> > > > > example, handling a SIGSEGV (perhaps a GC language?) that should be
> > > > > accounted.  If we can't handle a page fault right now, and need to
> > > > > retry within the kernel, that should not be accounted.
> > > >
> > > > IIUC, the question was about the differences in vm vs perf accounting
> > > > for errors, not the difference between ERROR and RETRY cases. Matthew,
> > > > are you answering the right question or did I misunderstand your
> > > > answer?
> > >
> > > Maybe I'm misunderstanding what you're proposing.  I thought the
> > > proposal was to make neither ERROR nor RETRY increment the counters,
> > > but if the proposal is to make ERROR increment the perf counters
> > > instead, then that's cool with me.
> > 
> > Oh, I think now I understand your answer. You were not highlighting
> > the difference between the who but objecting to the proposal of not
> > counting both ERROR and RETRY. Am I on the same page now?
> 
> I think so.  Let's see your patch and then we can be sure we're talking
> about the same thing ;-)

IMHO if there's no explicit reason to differenciate the events, we should
always account them the same way for vm,perf,... either with ERROR
accounted or not.

I am not sure whether accounting ERROR faults would matter for a mprotect()
use case, because they shouldn't rely on the counter to work but the SIGBUS
itself to be generated on page accesses with the sighandler doing work.

One trivial benefit of keep accounting ERROR is we only need to modify vm
account ABI so both RETRY & ERROR will be adjusted to match perf,task
counters.  OTOH we can also change all to take ERROR into account, but then
we're modifying ABI for all counters.

-- 
Peter Xu