Message-Id: <20250408202335.63434-1-sj@kernel.org>
Date: Tue,  8 Apr 2025 13:23:35 -0700
From: SeongJae Park <sj@...nel.org>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: SeongJae Park <sj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Liam R.Howlett" <howlett@...il.com>,
	David Hildenbrand <david@...hat.com>,
	Rik van Riel <riel@...riel.com>,
	Shakeel Butt <shakeel.butt@...ux.dev>,
	Vlastimil Babka <vbabka@...e.cz>,
	kernel-team@...a.com,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org
Subject: Re: [PATCH v2 0/4] mm/madvise: batch tlb flushes for MADV_DONTNEED and MADV_FREE

On Tue, 8 Apr 2025 14:44:40 +0100 Lorenzo Stoakes <lorenzo.stoakes@...cle.com> wrote:

> On Fri, Apr 04, 2025 at 02:06:56PM -0700, SeongJae Park wrote:
> > When process_madvise() is called to do MADV_DONTNEED[_LOCKED] or
> > MADV_FREE with multiple address ranges, tlb flushes happen for each of
> > the given address ranges.  Because such tlb flushes are for same
> 
> Nit: for _the_ same.

Thank you for kindly finding these mistakes and suggesting fixes.  I will
update the text following your suggestions, here and below.
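
For context, process_madvise() takes a pidfd and an array of address
ranges, so a single call can cover many ranges; this series batches the
TLB flushes that result from such a call.  A minimal illustrative sketch
of the interface usage (the pidfd, ranges, and names below are
placeholders, not part of the series or the benchmark):

#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Apply MADV_DONTNEED to 'vlen' ranges of the target process with a
 * single process_madvise() call.  With this series, the TLB flushes
 * resulting from the call are batched rather than issued per range.
 */
static long batched_dontneed(int pidfd, const struct iovec *ranges,
			     size_t vlen)
{
	return syscall(SYS_process_madvise, pidfd, ranges, vlen,
		       MADV_DONTNEED, 0);
}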

[...]
> > Similar optimizations might be applicable to other madvise behaviros
> 
> Typo: behaviros -> behavior (or 'behaviors', but since behavior is already plural
> probably unnecessary).
> 
> > such as MADV_COLD and MADV_PAGEOUT.  Those are simply out of the scope
> > of this patch series, though.
> 
> Well well, for now :)

Yes.  Hopefully we will have another chance to further improve those cases.

[...]
> > Test Results
> > ============
> >
> > I measured the latency to apply MADV_DONTNEED advice to 256 MiB memory
> > using multiple process_madvise() calls.  I apply the advice in 4 KiB
> > sized regions granularity, but with varying batch size per
> > process_madvise() call (vlen) from 1 to 1024.  The source code for the
> > measurement is available at GitHub[1].  To reduce measurement errors, I
> > did the measurement five times.
> 
> Be interesting to see how this behaves with mTHP sizing too! But probably a bit
> out of scope perhaps.

Obviously we have a lot more room to explore and have fun with :)
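
Just as a pointer for anyone reproducing the numbers, the measurement
described above amounts to something like the sketch below (the real code
is at [1]; buffer setup, pidfd acquisition, and error handling are
omitted, and all names here are illustrative):

#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <time.h>
#include <unistd.h>

/*
 * Apply MADV_DONTNEED to 'len' bytes of 'buf' in 4 KiB regions, batching
 * 'batch' regions per process_madvise() call, and return the elapsed
 * time in nanoseconds.  'batch' is assumed to be at most 1024, the
 * largest vlen used in the measurement above.
 */
static long long dontneed_batched_ns(int pidfd, char *buf, size_t len,
				     size_t batch)
{
	struct iovec iov[1024];
	struct timespec t0, t1;
	size_t off = 0;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	while (off < len) {
		size_t i;

		for (i = 0; i < batch && off < len; i++, off += 4096) {
			iov[i].iov_base = buf + off;
			iov[i].iov_len = 4096;
		}
		syscall(SYS_process_madvise, pidfd, iov, i,
			MADV_DONTNEED, 0);
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return (t1.tv_sec - t0.tv_sec) * 1000000000LL +
	       (t1.tv_nsec - t0.tv_nsec);
}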

> 
> >
> > The measurement results are as below.  'sz_batch' column shows the batch
> > size of process_madvise() calls.  'Before' and 'After' columns show the
> > average of latencies in nanoseconds that measured five times on kernels
> > that built without and with the tlb flushes batching of this series
> > (patches 3 and 4), respectively.  For the baseline, mm-new tree of
> > 2025-04-04[2] has been used.  'B-stdev' and 'A-stdev' columns show
> > ratios of latency measurements standard deviation to average in percent
> > for 'Before' and 'After', respectively.  'Latency_reduction' shows the
> > reduction of the latency that the 'After' has achieved compared to
> > 'Before', in percent.  Higher 'Latency_reduction' values mean more
> > efficiency improvements.
> >
> >     sz_batch   Before        B-stdev   After         A-stdev   Latency_reduction
> >     1          110948138.2   5.55      109476402.8   4.28      1.33
> >     2          75678535.6    1.67      70470722.2    3.55      6.88
> >     4          59530647.6    4.77      51735606.6    3.44      13.09
> >     8          50013051.6    4.39      44377029.8    5.20      11.27
> >     16         48657878.2    9.32      37291600.4    3.39      23.36
> >     32         43614180.2    6.06      34127428      3.75      21.75
> >     64         42466694.2    5.70      26737935.2    2.54      37.04
> >     128        42977970      6.99      25050444.2    4.06      41.71
> >     256        41549546      1.88      24644375.8    3.77      40.69
> >     512        42162698.6    6.17      24068224.8    2.87      42.92
> >     1024       40978574      5.44      23872024.2    3.65      41.75
> 
> Very nice! Great work.
> 
> >
> > As expected, tlb flushes batching provides latency reduction that
> > proportional to the batch size.  The efficiency gain ranges from about
> > 6.88 percent with batch size 2, to about 40 percent with batch size 128.
> >
> > Please note that this is a very simple microbenchmark, so real
> > efficiency gain on real workload could be very different.
> 
> Indeed, accepted, but it makes a great deal of sense to batch these operations,
> especially when we get to the point of actually increasing the process_madvise()
> iov size.

Cannot agree more.
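
As a small note for readers, the 'Latency_reduction' column above is
simply (Before - After) / Before in percent.  Taking the sz_batch=2 row
as an example:

    (75678535.6 - 70470722.2) / 75678535.6 * 100 ~= 6.88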

Thank you for your kind review and great suggestions for this patchset.  I
will post the next spin with the suggested changes soon.


Thanks,
SJ

[...]
