[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0704161139020.12097@chino.kir.corp.google.com>
Date: Mon, 16 Apr 2007 11:51:53 -0700 (PDT)
From: David Rientjes <rientjes@...gle.com>
To: Hugh Dickins <hugh@...itas.com>
cc: Zachary Amsden <zach@...are.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org
Subject: Re: [patch -mm] i386: use pte_update_defer in ptep_test_and_clear_{dirty,young}
On Mon, 16 Apr 2007, Hugh Dickins wrote:
> When, as now, core mm people make some change, failing to take your
> case into account.
>
That's exactly what happened here: include/asm-i386/pgtable.h is
advertising both __HAVE_ARCH_PTEP_TEST_AND_CLEAR_{DIRTY,YOUNG} without
actually having them. No consideration is going to be given to such a
hack when clearing A/D bits in generic code. If the optimization for a
specific architecture is not possible for an atomic test and clear of A/D
bits generically, then it's not an optimization at all. It's a bug.
> You're right to want to defer your pte_updates, David is right to want
> to batch his TLB flushes. It bothers me that you have a surprising case,
> and that unless you abandon your optimization, it imposes a new constraint
> on how to proceed in common code (without #ifdef'ing around).
>
Batching the TLB flushes was really a no brainer: there was a significant
speed-up in /proc/pid/clear_refs by using flush_tlb_mm() over
flush_tlb_page(). If there's an "optimization" for paravirt that gets in
the way by defering the updated A/D bit, then it needs to be eliminated.
It was merged because there were no generic uses of
ptep_test_and_clear_{dirty,young}, but now there are.
If it was really considered to be an "optimization," which is was
advertised to be in the changelog associated with a very trivial patch,
then removing it by definition would not introduce new constraints that
would need to be patched. I'd like to see a patch without the defer and
see how invasive it truly is on generic code because it was not originally
merged that invasively.
> Compromise patch below: would that be satisfactory to you, David?
>
I really like the patch, but for perhaps a slightly different reason:
we're only flushing ranges that have been shown to need it. We aren't
completely flushing the entire mm which is likely to be excessive in
situations where we're actually using /proc/pid/clear_refs in combination
with /proc/pid/smaps for memory footprint approximation (i.e. it's on a
fine granularity).
David
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists