linux-kernel - Re: [PATCH v2 1/2] mm:vmscan: the dirty folio in folio

Message-ID: <02e73251-33ff-4632-9d2c-bc268f397202@vivo.com>
Date:   Fri, 20 Oct 2023 11:59:33 +0800
From:   zhiguojiang <justinjiang@...o.com>
To:     David Hildenbrand <david@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Cc:     opensource.kernel@...o.com
Subject: Re: [PATCH v2 1/2] mm:vmscan: the dirty folio in folio_list skip
 unmap



On 2023/10/19 22:15, David Hildenbrand wrote:
>
> On 19.10.23 15:14, Zhiguo Jiang wrote:
>> In shrink_folio_list(), a dirty file folio can come from two sources:
>> 1. The folio is already dirty when it arrives in the folio_list
>>     parameter, which is taken from the inactive file LRU.
>> 2. The folio becomes dirty when try_to_unmap() transfers the PTE
>>     dirty bit to it.
>>
>> For the first source, if the dirty folio does not support pageout,
>> it can skip the unmap step early to reduce reclaim time.
>>
>> Signed-off-by: Zhiguo Jiang <justinjiang@...o.com>
>> ---
>>
>> Changelog:
>> v1->v2:
>> 1. Keep the original decision flow.
>> 2. Add the folio_check_pageout() helper.
>> 3. Dirty folios in the inactive file LRU that do not support pageout
>>     skip unmap early.
>>
>>   mm/vmscan.c | 103 +++++++++++++++++++++++++++++++++-------------------
>>   1 file changed, 66 insertions(+), 37 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index a68d01fcc307..e067269275a5 100755
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -925,6 +925,44 @@ static void folio_check_dirty_writeback(struct folio *folio,
>>               mapping->a_ops->is_dirty_writeback(folio, dirty, writeback);
>>   }
>>
>> +/* Check if a dirty folio can support pageout in the reclaim process */
>> +static bool folio_check_pageout(struct folio *folio,
>> +                                             struct pglist_data *pgdat)
>> +{
>> +     int ret = true;
>> +
>> +     /*
>> +      * Anonymous folios are not handled by flushers and must be written
>> +      * from reclaim context. Do not stall reclaim based on them.
>> +      * MADV_FREE anonymous folios are put into inactive file list too.
>> +      * They could be mistakenly treated as file lru. So further anon
>> +      * test is needed.
>> +      */
>> +     if (!folio_is_file_lru(folio) ||
>> +             (folio_test_anon(folio) && !folio_test_swapbacked(folio)))
>> +             goto out;
>> +
>> +     if (folio_test_dirty(folio) &&
>> +             (!current_is_kswapd() ||
>> +              !folio_test_reclaim(folio) ||
>> +              !test_bit(PGDAT_DIRTY, &pgdat->flags))) {
>> +             /*
>> +              * Immediately reclaim when written back.
>> +              * Similar in principle to folio_deactivate()
>> +              * except we already have the folio isolated
>> +              * and know it's dirty
>> +              */
>> +             node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE,
>> +                     folio_nr_pages(folio));
>> +             folio_set_reclaim(folio);
>> +
>> +             ret = false;
>> +     }
>> +
>> +out:
>> +     return ret;
>> +}
>> +
>>   static struct folio *alloc_demote_folio(struct folio *src,
>>               unsigned long private)
>>   {
>> @@ -1078,6 +1116,12 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>               if (dirty && !writeback)
>>                       stat->nr_unqueued_dirty += nr_pages;
>>
>> +             /* If the dirty folio does not support pageout,
>> +              * the dirty folio can skip this reclaim pass.
>> +              */
>> +             if (!folio_check_pageout(folio, pgdat))
>> +                     goto activate_locked;
>> +
>>               /*
>>                * Treat this folio as congested if folios are cycling
>>                * through the LRU so quickly that the folios marked
>> @@ -1261,43 +1305,6 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>                       enum ttu_flags flags = TTU_BATCH_FLUSH;
>>                       bool was_swapbacked = folio_test_swapbacked(folio);
>>
>> -                     if (folio_test_dirty(folio)) {
>> -                             /*
>> -                              * Only kswapd can writeback filesystem folios
>> -                              * to avoid risk of stack overflow. But avoid
>> -                              * injecting inefficient single-folio I/O into
>> -                              * flusher writeback as much as possible: only
>> -                              * write folios when we've encountered many
>> -                              * dirty folios, and when we've already scanned
>> -                              * the rest of the LRU for clean folios and see
>> -                              * the same dirty folios again (with the reclaim
>> -                              * flag set).
>> -                              */
>> -                             if (folio_is_file_lru(folio) &&
>> -                                     (!current_is_kswapd() ||
>> -                                      !folio_test_reclaim(folio) ||
>> -                                      !test_bit(PGDAT_DIRTY, &pgdat->flags))) {
>> -                                     /*
>> -                                      * Immediately reclaim when written back.
>> -                                      * Similar in principle to folio_deactivate()
>> -                                      * except we already have the folio isolated
>> -                                      * and know it's dirty
>> -                                      */
>> -                                     node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE,
>> -                                                     nr_pages);
>> -                                     folio_set_reclaim(folio);
>> -
>> -                                     goto activate_locked;
>> -                             }
>> -
>> -                             if (references == FOLIOREF_RECLAIM_CLEAN)
>> -                                     goto keep_locked;
>> -                             if (!may_enter_fs(folio, sc->gfp_mask))
>> -                                     goto keep_locked;
>> -                             if (!sc->may_writepage)
>> -                                     goto keep_locked;
>> -                     }
>> -
>>                       if (folio_test_pmd_mappable(folio))
>>                               flags |= TTU_SPLIT_HUGE_PMD;
>>
>> @@ -1323,6 +1330,28 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>>
>>               mapping = folio_mapping(folio);
>>               if (folio_test_dirty(folio)) {
>> +                     /*
>> +                      * Only kswapd can writeback filesystem folios
>> +                      * to avoid risk of stack overflow. But avoid
>> +                      * injecting inefficient single-folio I/O into
>> +                      * flusher writeback as much as possible: only
>> +                      * write folios when we've encountered many
>> +                      * dirty folios, and when we've already scanned
>> +                      * the rest of the LRU for clean folios and see
>> +                      * the same dirty folios again (with the reclaim
>> +                      * flag set).
>> +                      */
>> +                     if (folio_is_file_lru(folio) &&
>> +                             !folio_check_pageout(folio, pgdat))
>> +                             goto activate_locked;
>> +
>> +                     if (references == FOLIOREF_RECLAIM_CLEAN)
>> +                             goto keep_locked;
>> +                     if (!may_enter_fs(folio, sc->gfp_mask))
>> +                             goto keep_locked;
>> +                     if (!sc->may_writepage)
>> +                             goto keep_locked;
>> +
>>                       /*
>>                        * Folio is dirty. Flush the TLB if a writable entry
>>                        * potentially exists to avoid CPU writes after I/O
>
> I'm confused. Did you apply this on top of v1 by accident?
Hi,
According to the trace log from my modified mm_vmscan_lru_shrink_inactive
tracepoint, of the 32 scanned inactive file pages, 20 were dirty. None of
those 20 dirty pages was reclaimed, yet try_to_unmap() still spent 20us on
them.

I think dirty folios in the inactive file LRU that will not be reclaimed
can skip try_to_unmap(). Please continue the review. Thanks.
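
To make the proposed check easier to follow, here is a rough userspace
sketch of the decision that folio_check_pageout() makes in the patch. The
names folio_model, reclaim_ctx and model_check_pageout are hypothetical
stand-ins for illustration, not kernel APIs. The sketch also leaves out the
side effects in the patch (the NR_VMSCAN_IMMEDIATE update and
folio_set_reclaim()).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the folio state the patch inspects. */
struct folio_model {
	bool file_lru;    /* models folio_is_file_lru() */
	bool anon;        /* models folio_test_anon() */
	bool swapbacked;  /* models folio_test_swapbacked() */
	bool dirty;       /* models folio_test_dirty() */
	bool reclaim;     /* models folio_test_reclaim() */
};

/* Hypothetical stand-in for the reclaim context. */
struct reclaim_ctx {
	bool is_kswapd;   /* models current_is_kswapd() */
	bool pgdat_dirty; /* models test_bit(PGDAT_DIRTY, &pgdat->flags) */
};

/*
 * Returns false when a dirty file folio would not be paged out in this
 * pass, so unmapping it first is wasted work. Returns true otherwise.
 */
static bool model_check_pageout(const struct folio_model *f,
				const struct reclaim_ctx *c)
{
	assert(f && c);

	/* Not on the file LRU, or a MADV_FREE anonymous folio: not covered. */
	if (!f->file_lru || (f->anon && !f->swapbacked))
		return true;

	/* Dirty, and at least one condition for writing from reclaim fails. */
	if (f->dirty &&
	    (!c->is_kswapd || !f->reclaim || !c->pgdat_dirty))
		return false;

	return true;
}
```

In this model, a dirty file folio without the reclaim flag returns false
even when the caller is kswapd, which is the case where the unmap step
would be skipped.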

kswapd0-99      (     99) [005] .....   687.793724: 
mm_vmscan_lru_shrink_inactive: [Justin] nid 0 scan=32 isolate=32 
reclamed=12 nr_dirty=20 nr_unqueued_dirty=20 nr_writeback=0 
nr_congested=0 nr_immediate=0 nr_activate[0]=0 nr_activate[1]=20 
nr_ref_keep=0 nr_unmap_fail=0 priority=2 
file=RECLAIM_WB_FILE|RECLAIM_WB_ASYNC total=39 exe=0 reference_cost=5 
reference_exe=0 unmap_cost=21 unmap_exe=0 dirty_unmap_cost=20 
dirty_unmap_exe=0 pageout_cost=0 pageout_exe=0
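
The cost fields in this trace line can be put in proportion with a small
calculation. The sketch below assumes that total, unmap_cost and
dirty_unmap_cost (fields added by the modified tracepoint) are all in
microseconds and that total covers the whole pass. percent_of is a
hypothetical helper name.

```c
#include <assert.h>

/*
 * Values copied from the trace line above. The unit (microseconds) and the
 * meaning of "total" are assumptions, not stated by the trace itself.
 */
enum {
	TRACE_TOTAL_COST       = 39, /* total= */
	TRACE_UNMAP_COST       = 21, /* unmap_cost= */
	TRACE_DIRTY_UNMAP_COST = 20, /* dirty_unmap_cost= */
};

/* Integer percentage of part within whole, rounded down. */
static unsigned int percent_of(unsigned int part, unsigned int whole)
{
	assert(whole != 0);
	return part * 100u / whole;
}
```

With these inputs, percent_of(TRACE_DIRTY_UNMAP_COST, TRACE_UNMAP_COST) is
95 and percent_of(TRACE_DIRTY_UNMAP_COST, TRACE_TOTAL_COST) is 51. Under
this reading, about half of the measured time went to unmapping folios that
were then not reclaimed.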
>
> -- 
> Cheers,
>
> David / dhildenb
>