Message-ID: <ZTZuui+0Ppe6cjgC@casper.infradead.org>
Date: Mon, 23 Oct 2023 14:01:46 +0100
From: Matthew Wilcox <willy@...radead.org>
To: zhiguojiang <justinjiang@...o.com>
Cc: David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, opensource.kernel@...o.com
Subject: Re: [PATCH v2 1/2] mm:vmscan: the dirty folio in folio_list skip unmap
On Mon, Oct 23, 2023 at 08:44:55PM +0800, zhiguojiang wrote:
> On 2023/10/23 20:21, Matthew Wilcox wrote:
> > On Mon, Oct 23, 2023 at 04:07:28PM +0800, zhiguojiang wrote:
> > > > Are you seeing measurable changes for any workloads? It certainly seems
> > > > like you should, but it would help if you chose a test from mmtests and
> > > > showed how performance changed on your system.
> > > In one mmtests run, the maximum times for a wasted reclaim attempt on a
> > > dirty folio in folio_list that does not support pageout and was activated
> > > in shrink_folio_list() are: cost=51us, exe=2365us.
> > >
> > > Computed as dirty_cost / total_cost * 100%, the reclaim efficiency for
> > > dirty folios can be improved by 53.13% and 82.95% in the two traces below.
> > >
> > > So this patch can optimize shrink efficiency and reduce the workload of
> > > kswapd to a certain extent.
> > >
> > > kswapd0-96 ( 96) [005] ..... 387.218548:
> > > mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32
> > > nr_reclaimed 31 nr_dirty 1 nr_unqueued_dirty 1 nr_writeback 0
> > > nr_activate[1] 1 nr_ref_keep 0 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
> > > total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365
> > >
> > > kswapd0-96 ( 96) [006] ..... 412.822532:
> > > mm_vmscan_lru_shrink_inactive: [Justin] nid 0 nr_scanned 32 nr_taken 32
> > > nr_reclaimed 0 nr_dirty 32 nr_unqueued_dirty 32 nr_writeback 0
> > > nr_activate[1] 19 nr_ref_keep 13 f RECLAIM_WB_FILE|RECLAIM_WB_ASYNC
> > > total_cost 88 total_exe 605 dirty_cost 73 total_exe 605
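The percentages quoted above can be rechecked from the trace fields themselves. A minimal sketch, assuming the total_cost and dirty_cost fields are microsecond counts printed as space-separated name/value pairs, as in the two trace lines; the helper name dirty_share is hypothetical and not part of the patch:

```python
import re


def dirty_share(trace_line: str) -> float:
    """Return dirty_cost / total_cost * 100 for one trace line.

    Assumes the line carries 'total_cost N' and 'dirty_cost N' pairs,
    as in the custom trace output quoted in this thread.
    """
    fields = {
        name: int(value)
        for name, value in re.findall(r"\b(total_cost|dirty_cost) (\d+)", trace_line)
    }
    return fields["dirty_cost"] / fields["total_cost"] * 100


# Only the cost fields of the two quoted traces are reproduced here.
traces = [
    "total_cost 96 total_exe 2365 dirty_cost 51 total_exe 2365",
    "total_cost 88 total_exe 605 dirty_cost 73 total_exe 605",
]
shares = [dirty_share(line) for line in traces]

for line, share in zip(traces, shares):
    print(f"{share:.1f}%  <-  {line}")
```

This only reproduces the ratio of probe-measured time spent on dirty folios; it says nothing about whether a real workload gets faster.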
> > I appreciate that you can put probes in and determine the cost, but do
> > you see improvements for a real workload? Like doing a kernel compile
> > -- does it speed up at all?
> Can you share a method for testing the workload of a thread like kswapd?
Something dirt simple like 'time make -j8'.
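A minimal sketch of repeating such a timing so that run-to-run noise is visible. The helper names time_command and summarize are hypothetical, and the demo command is a harmless placeholder rather than a kernel build:

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is not timed silently.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, population standard deviation) of the durations."""
    return statistics.mean(durations), statistics.pstdev(durations)


if __name__ == "__main__":
    # Placeholder workload; substitute the real one, e.g. ["make", "-j8"].
    demo = time_command([sys.executable, "-c", "pass"], runs=3)
    mean, spread = summarize(demo)
    print(f"mean {mean:.3f}s, stdev {spread:.3f}s over {len(demo)} runs")
```

For a kernel compile, one would presumably pass ["make", "-j8"] from inside a configured tree, clean the tree between runs, and compare the means on kernels with and without the patch; those steps are assumptions about the setup, not something specified in this thread.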