Message-ID: <b1d05ce2-1625-490f-ac5a-c88d3468385f@vivo.com>
Date: Fri, 20 Oct 2023 12:09:47 +0800
From: zhiguojiang <justinjiang@...o.com>
To: David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Cc: opensource.kernel@...o.com
Subject: Re: [PATCH v2 1/2] mm:vmscan: the dirty folio in folio_list skip unmap
On 2023/10/20 11:59, zhiguojiang wrote:
>
>
> On 2023/10/19 22:15, David Hildenbrand wrote:
>>
>> On 19.10.23 15:14, Zhiguo Jiang wrote:
>>> In shrink_folio_list(), a dirty file folio can come from two sources:
>>> 1. The folio arrives dirty via the incoming folio_list parameter,
>>> i.e. from the inactive file lru.
>>> 2. The folio becomes dirty via the PTE dirty bit transferred by
>>> try_to_unmap().
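
For context on source 2, here is a simplified excerpt in the spirit of
the unmap path in mm/rmap.c (the exact code in a given tree may differ).
Clearing the PTE returns its old value, and a hardware dirty bit is
propagated to the folio, so a folio that looked clean on folio_list can
become dirty during unmap:

    /* In try_to_unmap_one(): clear the PTE and keep its old value. */
    pteval = ptep_clear_flush(vma, address, pvmw.pte);

    /* Set the dirty flag on the folio now that the PTE is gone. */
    if (pte_dirty(pteval))
            folio_mark_dirty(folio);
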
>>>
>>> For the first source, if the dirty folio does not support pageout,
>>> it can skip unmap in advance to reduce recycling time.
>>>
>>> Signed-off-by: Zhiguo Jiang <justinjiang@...o.com>
>>> ---
>>>
>>> Changelog:
>>> v1->v2:
>>> 1. Keep the original judgment flow.
>>> 2. Add the interface of folio_check_pageout().
>>> 3. The dirty folio which does not support pageout in the inactive
>>> file lru skips unmap in advance.
>>>
>>> mm/vmscan.c | 103 +++++++++++++++++++++++++++++++++-------------------
>>> 1 file changed, 66 insertions(+), 37 deletions(-)
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index a68d01fcc307..e067269275a5 100755
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -925,6 +925,44 @@ static void folio_check_dirty_writeback(struct folio *folio,
>>>                  mapping->a_ops->is_dirty_writeback(folio, dirty, writeback);
>>>  }
>>>
>>> +/* Check if a dirty folio can support pageout in the recycling process */
>>> +static bool folio_check_pageout(struct folio *folio,
>>> +                                struct pglist_data *pgdat)
>>> +{
>>> +        bool ret = true;
>>> +
>>> +        /*
>>> +         * Anonymous folios are not handled by flushers and must be written
>>> +         * from reclaim context. Do not stall reclaim based on them.
>>> +         * MADV_FREE anonymous folios are put into inactive file list too.
>>> +         * They could be mistakenly treated as file lru. So further anon
>>> +         * test is needed.
>>> +         */
>>> +        if (!folio_is_file_lru(folio) ||
>>> +            (folio_test_anon(folio) && !folio_test_swapbacked(folio)))
>>> +                goto out;
>>> +
>>> +        if (folio_test_dirty(folio) &&
>>> +            (!current_is_kswapd() ||
>>> +             !folio_test_reclaim(folio) ||
>>> +             !test_bit(PGDAT_DIRTY, &pgdat->flags))) {
>>> +                /*
>>> +                 * Immediately reclaim when written back.
>>> +                 * Similar in principle to folio_deactivate()
>>> +                 * except we already have the folio isolated
>>> +                 * and know it's dirty
>>> +                 */
>>> +                node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE,
>>> +                                folio_nr_pages(folio));
>>> +                folio_set_reclaim(folio);
>>> +
>>> +                ret = false;
>>> +        }
>>> +
>>> +out:
>>> +        return ret;
>>> +}
>>> +
>>> +
>>> static struct folio *alloc_demote_folio(struct folio *src,
>>> unsigned long private)
>>> {
>>> @@ -1078,6 +1116,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>> if (dirty && !writeback)
>>> stat->nr_unqueued_dirty += nr_pages;
>>>
>>> + /* If the dirty folio does not support pageout,
>>> + * the dirty folio can skip this recycling.
>>> + */
>>> + if (!folio_check_pageout(folio, pgdat))
>>> + goto activate_locked;
>>> +
>>> /*
>>> * Treat this folio as congested if folios are cycling
>>> * through the LRU so quickly that the folios marked
>>> @@ -1261,43 +1305,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>>                          enum ttu_flags flags = TTU_BATCH_FLUSH;
>>>                          bool was_swapbacked = folio_test_swapbacked(folio);
>>>
>>> -                        if (folio_test_dirty(folio)) {
>>> -                                /*
>>> -                                 * Only kswapd can writeback filesystem folios
>>> -                                 * to avoid risk of stack overflow. But avoid
>>> -                                 * injecting inefficient single-folio I/O into
>>> -                                 * flusher writeback as much as possible: only
>>> -                                 * write folios when we've encountered many
>>> -                                 * dirty folios, and when we've already scanned
>>> -                                 * the rest of the LRU for clean folios and see
>>> -                                 * the same dirty folios again (with the reclaim
>>> -                                 * flag set).
>>> -                                 */
>>> -                                if (folio_is_file_lru(folio) &&
>>> -                                    (!current_is_kswapd() ||
>>> -                                     !folio_test_reclaim(folio) ||
>>> -                                     !test_bit(PGDAT_DIRTY, &pgdat->flags))) {
>>> -                                        /*
>>> -                                         * Immediately reclaim when written back.
>>> -                                         * Similar in principle to folio_deactivate()
>>> -                                         * except we already have the folio isolated
>>> -                                         * and know it's dirty
>>> -                                         */
>>> -                                        node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE,
>>> -                                                        nr_pages);
>>> -                                        folio_set_reclaim(folio);
>>> -
>>> -                                        goto activate_locked;
>>> -                                }
>>> -
>>> -                                if (references == FOLIOREF_RECLAIM_CLEAN)
>>> -                                        goto keep_locked;
>>> -                                if (!may_enter_fs(folio, sc->gfp_mask))
>>> -                                        goto keep_locked;
>>> -                                if (!sc->may_writepage)
>>> -                                        goto keep_locked;
>>> -                        }
>>> -
>>> if (folio_test_pmd_mappable(folio))
>>> flags |= TTU_SPLIT_HUGE_PMD;
>>>
>>> @@ -1323,6 +1330,28 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>>
>>> mapping = folio_mapping(folio);
>>> if (folio_test_dirty(folio)) {
>>> + /*
>>> + * Only kswapd can writeback filesystem folios
>>> + * to avoid risk of stack overflow. But avoid
>>> + * injecting inefficient single-folio I/O into
>>> + * flusher writeback as much as possible: only
>>> + * write folios when we've encountered many
>>> + * dirty folios, and when we've already scanned
>>> + * the rest of the LRU for clean folios and see
>>> + * the same dirty folios again (with the reclaim
>>> + * flag set).
>>> + */
>>> + if (folio_is_file_lru(folio) &&
>>> + !folio_check_pageout(folio, pgdat))
>>> + goto activate_locked;
>>> +
>>> + if (references == FOLIOREF_RECLAIM_CLEAN)
>>> + goto keep_locked;
>>> + if (!may_enter_fs(folio, sc->gfp_mask))
>>> + goto keep_locked;
>>> + if (!sc->may_writepage)
>>> + goto keep_locked;
>>> +
>>>                          /*
>>>                           * Folio is dirty. Flush the TLB if a writable entry
>>>                           * potentially exists to avoid CPU writes after I/O
>>
>> I'm confused. Did you apply this on top of v1 by accident?
> Hi,
> According to the trace log from my modified mm_vmscan_lru_shrink_inactive
> tracepoint, of the 32 scanned inactive file pages, 20 were dirty, and
> those 20 dirty pages were not reclaimed, yet they still took 20us to
> perform try_to_unmap().
>
> I think an unreclaimable dirty folio on the inactive file lru can skip
> try_to_unmap(). Please help to continue the review. Thanks.
>
> kswapd0-99 ( 99) [005] ..... 687.793724:
> mm_vmscan_lru_shrink_inactive: [Justin] nid 0 scan=32 isolate=32
> reclamed=12 nr_dirty=20 nr_unqueued_dirty=20 nr_writeback=0
> nr_congested=0 nr_immediate=0 nr_activate[0]=0 nr_activate[1]=20
> nr_ref_keep=0 nr_unmap_fail=0 priority=2
> file=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total=39 exe=0 reference_cost=5
> reference_exe=0 unmap_cost=21 unmap_exe=0 dirty_unmap_cost=20
> dirty_unmap_exe=0 pageout_cost=0 pageout_exe=0
>
To supplement: I think an unreclaimable dirty folio on the inactive file
lru can exit the recycling flow in shrink_folio_list() early, avoiding
time-consuming interfaces such as folio_check_references() and
try_to_unmap().
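
A minimal sketch of the intended ordering in shrink_folio_list()
(simplified pseudocode of this patch's flow, not the literal kernel
source):

    /* Per isolated folio in shrink_folio_list(): */
    folio_check_dirty_writeback(folio, &dirty, &writeback);
    if (dirty && !writeback)
            stat->nr_unqueued_dirty += nr_pages;

    /* Early exit added by this patch: a dirty file folio that cannot
     * be paged out is re-activated here, before paying for the
     * expensive rmap walks below. */
    if (!folio_check_pageout(folio, pgdat))
            goto activate_locked;

    references = folio_check_references(folio, sc);  /* rmap walk */
    ...
    if (folio_mapped(folio))
            try_to_unmap(folio, flags);              /* rmap walk */
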
>> --
>> Cheers,
>>
>> David / dhildenb
>>
>