linux-kernel - Re: [PATCH 2/2] mm: zswap: remove unnecessary tree cleanups in zswap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <68b7f87b-7f15-410c-9094-9fe9865b4bc4@bytedance.com>
Date: Thu, 25 Jan 2024 17:38:21 +0800
From: Chengming Zhou <zhouchengming@...edance.com>
To: Yosry Ahmed <yosryahmed@...gle.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
 Andrew Morton <akpm@...ux-foundation.org>, Nhat Pham <nphamcs@...il.com>,
 Chris Li <chrisl@...nel.org>, Huang Ying <ying.huang@...el.com>,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] mm: zswap: remove unnecessary tree cleanups in
 zswap_swapoff()

On 2024/1/25 17:26, Yosry Ahmed wrote:
> On Thu, Jan 25, 2024 at 1:22 AM Chengming Zhou
> <zhouchengming@...edance.com> wrote:
>>
>> On 2024/1/25 17:03, Yosry Ahmed wrote:
>>>>>>>> The second difference is the handling of lru entry, which is easy that we
>>>>>>>> just zswap_lru_del() in tree lock.
>>>>>>>
>>>>>>> Why do we need zswap_lru_del() at all? We should have already isolated
>>>>>>> the entry at that point IIUC.
>>>>>>
>>>>>> I was thinking how to handle the "zswap_lru_putback()" if not writeback,
>>>>>> in which case we can't use the entry actually since we haven't got reference
>>>>>> of it. So we can don't isolate at the entry, and only zswap_lru_del() when
>>>>>> we are going to writeback actually.
>>>>>
>>>>> Why not just call zswap_lru_putback() before we unlock the folio?
>>>>
>>>> When early return because __read_swap_cache_async() return NULL or !folio_was_allocated,
>>>> we don't have a locked folio yet. The entry maybe invalidated and freed concurrently.
>>>
>>> Oh, that path, right.
>>>
>>> If we don't isolate the entry straightaway, concurrent reclaimers will
>>> see the same entry, call __read_swap_cache_async(), find the folio
>>> already in the swapcache and stop shrinking. This is because usually
>>> this means we are racing with swapin and hitting the warmer part of
>>> the zswap LRU.
>>>
>>> I am not sure if this would matter in practice, maybe Nhat knows
>>> better. Perhaps we can rotate the entry in the LRU before calling
>>> __read_swap_cache_async() to minimize the chances of such a race? Or
>>> we can serialize the calls to __read_swap_cache_async() but this may
>>> be an overkill.
>>
>> Also, not sure, rotate the entry maybe good IMHO since we will zswap_lru_del()
>> once we checked the invalidate race.
> 
> Not sure what you mean. If we rotate first, we won't have anything to
> do in the failure case (if the folio is not locked). We will have to
> do zswap_lru_del() if actually writeback, yes, but in this case no
> synchronization is needed because the folio is locked, right?

Right, sorry for my confusing expression. We rotate first in lru lock,
and only zswap_lru_del() later if actually writeback.

What I want to mean is that the possibility of seeing the entry on lru list
by another reclaimer is very low, since we rotate and the timing between
__read_swap_cache_async() and zswap_lru_del() should be short.