[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3f943f72-59d6-4124-96b2-e0bb8d7a5ebd@redhat.com>
Date: Mon, 4 Nov 2024 13:32:55 +0100
From: David Hildenbrand <david@...hat.com>
To: Barry Song <21cnbao@...il.com>, Usama Arif <usamaarif642@...il.com>
Cc: akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Barry Song <v-songbaohua@...o.com>,
Chengming Zhou <chengming.zhou@...ux.dev>,
Yosry Ahmed <yosryahmed@...gle.com>, Nhat Pham <nphamcs@...il.com>,
Johannes Weiner <hannes@...xchg.org>, Hugh Dickins <hughd@...gle.com>,
Matthew Wilcox <willy@...radead.org>, Shakeel Butt <shakeel.butt@...ux.dev>,
Andi Kleen <ak@...ux.intel.com>, Baolin Wang
<baolin.wang@...ux.alibaba.com>, Chris Li <chrisl@...nel.org>,
"Huang, Ying" <ying.huang@...el.com>, Kairui Song <kasong@...cent.com>,
Ryan Roberts <ryan.roberts@....com>
Subject: Re: [PATCH v2] mm: count zeromap read and set for swapout and swapin
On 02.11.24 13:59, Barry Song wrote:
> On Sat, Nov 2, 2024 at 8:32 PM Usama Arif <usamaarif642@...il.com> wrote:
>>
>>
>>
>> On 02/11/2024 10:12, Barry Song wrote:
>>> From: Barry Song <v-songbaohua@...o.com>
>>>
>>> When the proportion of folios from the zero map is small, missing their
>>> accounting may not significantly impact profiling. However, it’s easy
>>> to construct a scenario where this becomes an issue—for example,
>>> allocating 1 GB of memory, writing zeros from userspace, followed by
>>> MADV_PAGEOUT, and then swapping it back in. In this case, the swap-out
>>> and swap-in counts seem to vanish into a black hole, potentially
>>> causing semantic ambiguity.
>>>
>>> We have two ways to address this:
>>>
>>> 1. Add a separate counter specifically for the zero map.
>>> 2. Continue using the current accounting, treating the zero map like
>>> a normal backend. (This aligns with the current behavior of zRAM
>>> when supporting same-page fills at the device level.)
>>>
>>> This patch adopts option 1 as pswpin/pswpout counters are that they
>>> only apply to IO done directly to the backend device (as noted by
>>> Nhat Pham).
>>>
>>> We can find these counters from /proc/vmstat (counters for the whole
>>> system) and memcg's memory.stat (counters for the interested memcg).
>>>
>>> For example:
>>>
>>> $ grep -E 'swpin_zero|swpout_zero' /proc/vmstat
>>> swpin_zero 1648
>>> swpout_zero 33536
>>>
>>> $ grep -E 'swpin_zero|swpout_zero' /sys/fs/cgroup/system.slice/memory.stat
>>> swpin_zero 3905
>>> swpout_zero 3985
>>>
>>> Fixes: 0ca0c24e3211 ("mm: store zero pages to be swapped out in a bitmap")
>> I don't think its a hotfix (or even a fix). It was discussed in the initial
>> series to add these as a follow up and Joshua was going to do this soon.
>> Its not fixing any bug in the initial series.
>
> I would prefer that all kernel versions with zeromap include this
> counter; otherwise,
> it could be confusing to determine where swap-in and swap-out have occurred,
> as shown by the small program below:
>
> p =malloc(1g);
> write p to zero
> madvise_pageout
> read p;
>
> Previously, there was 1GB of swap-in and swap-out activity reported, but
> now nothing is shown.
>
> I don't mean to suggest that there's a bug in the zeromap code; rather,
> having this counter would help clear up any confusion.
>
> I didn't realize Joshua was handling it. Is he still planning to? If
> so, I can leave it
> with Joshua if that was the plan :-)
>
>>
>>> Cc: Usama Arif <usamaarif642@...il.com>
>>> Cc: Chengming Zhou <chengming.zhou@...ux.dev>
>>> Cc: Yosry Ahmed <yosryahmed@...gle.com>
>>> Cc: Nhat Pham <nphamcs@...il.com>
>>> Cc: Johannes Weiner <hannes@...xchg.org>
>>> Cc: David Hildenbrand <david@...hat.com>
>>> Cc: Hugh Dickins <hughd@...gle.com>
>>> Cc: Matthew Wilcox (Oracle) <willy@...radead.org>
>>> Cc: Shakeel Butt <shakeel.butt@...ux.dev>
>>> Cc: Andi Kleen <ak@...ux.intel.com>
>>> Cc: Baolin Wang <baolin.wang@...ux.alibaba.com>
>>> Cc: Chris Li <chrisl@...nel.org>
>>> Cc: "Huang, Ying" <ying.huang@...el.com>
>>> Cc: Kairui Song <kasong@...cent.com>
>>> Cc: Ryan Roberts <ryan.roberts@....com>
>>> Signed-off-by: Barry Song <v-songbaohua@...o.com>
>>> ---
>>> -v2:
>>> * add separate counters rather than using pswpin/out; thanks
>>> for the comments from Usama, David, Yosry and Nhat;
>>> * Usama also suggested a new counter like swapped_zero, I
>>> prefer that one be separated as an enhancement patch not
>>> a hotfix. will probably handle it later on.
>>>
>> I dont think either of them would be a hotfix.
>
> As mentioned above, this isn't about fixing a bug; it's simply to ensure
> that swap-related metrics don't disappear.
Documentation/process/submitting-patches.rst:
"A Fixes: tag indicates that the patch fixes an issue in a previous
commit. It is used to make it easy to determine where a bug originated,
which can help review a bug fix."
If there is no BUG, I'm afraid you are abusing that tag.
Also, I don't really understand the problem? We added an optimization,
great. Who will be complaining about that?
Above you write "it could be confusing to determine where swap-in and
swap-out have occurred" -- when is that confusion supposed to happen in
practice? How will the confused individuals know that they must take a
look at that new metric, even if it is in place?
I think we should just add the new stats and call it a day.
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists