lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <edab8d46-7c03-4bf5-fe1c-0150cdf4d96a@google.com>
Date: Fri, 6 Dec 2024 00:24:54 -0800 (PST)
From: Hugh Dickins <hughd@...gle.com>
To: Chen Ridong <chenridong@...weicloud.com>
cc: Yu Zhao <yuzhao@...gle.com>, Hugh Dickins <hughd@...gle.com>, 
    akpm@...ux-foundation.org, mhocko@...nel.org, hannes@...xchg.org, 
    yosryahmed@...gle.com, roman.gushchin@...ux.dev, shakeel.butt@...ux.dev, 
    muchun.song@...ux.dev, davidf@...eo.com, vbabka@...e.cz, 
    linux-mm@...ck.org, linux-kernel@...r.kernel.org, cgroups@...r.kernel.org, 
    chenridong@...wei.com, wangweiyang2@...wei.com
Subject: Re: [next -v1 3/5] memcg: simplify the mem_cgroup_update_lru_size
 function

On Fri, 6 Dec 2024, Chen Ridong wrote:
> On 2024/12/6 13:33, Yu Zhao wrote:
> > On Thu, Dec 5, 2024 at 6:45 PM Chen Ridong <chenridong@...weicloud.com> wrote:
> >> From: Chen Ridong <chenridong@...wei.com>
> >>
> >> In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
> >> updated by adding `nr_pages` regardless of whether `nr_pages` is greater
> >> than 0 or less than 0. To simplify this function, add a check for
> >> `nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
> >> actions.
> >>
> >> Signed-off-by: Chen Ridong <chenridong@...wei.com>
> > 
> > NAK.
> > 
> > The commit that added that clearly explains why it was done that way.

Thanks Yu: I did spot this going by, but was indeed hoping that someone
else would NAK it, with more politeness than I could muster at the time!

> 
> Thank you for your reply.
> 
> I have read the commit message for ca707239e8a7 ("mm: update_lru_size
> warn and reset bad lru_size") before sending my patch. However, I did
> not quite understand why we need to deal with the difference between
> 'nr_pages > 0' and 'nr_pages < 0'.
> 
> 
> The 'lru_zone_size' can only be modified in the
> 'mem_cgroup_update_lru_size' function. Only subtracting 'nr_pages' or
> adding 'nr_pages' in a way that causes an overflow can make the size < 0.
> 
> For case 1, subtracting 'nr_pages', we should issue a warning if the
> size goes below 0. For case 2, when adding 'nr_pages' results in an
> overflow, there will be no warning and the size won't be reset to 0 the
> first time it occurs . It might be that a warning will be issued the
> next time 'mem_cgroup_update_lru_size' is called to modify the
> 'lru_zone_size'. However, as the commit message said, "the first
> occurrence is the most informative," and it seems we have missed that
> first occurrence.
> 
> As the commit message said: "and then the vast unsigned long size draws
> page reclaim into a loop of repeatedly", I think that a warning should
> be issued and 'lru_zone_size' should be reset whenever 'size < 0' occurs
> for the first time, whether from subtracting or adding 'nr_pages'.

One thing, not obvious, but important to understand, is that WARN_ONCE()
only issues a warning the first time its condition is found true, but
it returns the true or false of the condition every time.  So lru_size
is forced to 0 each time an inconsistency is detected.

(But IIRC VM_WARN_ON_ONCE() does not behave in that useful way; or maybe
I've got that macro name wrong - we have so many slightly differing.)

Perhaps understanding that will help you to make better sense of the
order of events in this function.

Another thing to understand: it's called before adding folio to list,
but after removing folio from list: when it can usefully compare whether
the emptiness of the list correctly matches lru_size 0.  It cannot do so
when adding if you "simplify" it in the way that you did.

You might be focusing too much on the "size < 0" aspect of it, or you
might be worrying more than I did about size growing larger and larger
until it wraps to negative (not likely on 64-bit, unless corrupted).

I hope these remarks help, but you need to think through it again
for yourself.

Hugh

> 
> I am be grateful if you can explain more details, it has confused me for
> a while. Thank you very much.
> 
> Best regards,
> Ridong

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ