Date:   Tue, 6 Jun 2023 20:23:59 +0530
From:   Charan Teja Kalla <quic_charante@...cinc.com>
To:     Johannes Weiner <hannes@...xchg.org>
CC:     <akpm@...ux-foundation.org>, <minchan@...nel.org>,
        <quic_pkondeti@...cinc.com>, <linux-mm@...ck.org>,
        <linux-kernel@...r.kernel.org>,
        Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: [PATCH] mm: madvise: fix uneven accounting of psi

Thanks Johannes for the detailed review comments...

On 6/5/2023 11:30 PM, Johannes Weiner wrote:
>> Agreed that we shouldn't really be silencing the thrashing. My point is
>> that we shouldn't consider the folios as thrashing if they were
>> reclaimed by the user himself through MADV_PAGEOUT, under the assumption
>> that __user knows they are not real working set__.  Please let me know
>> if I am not making sense here.
> I'm not sure I agree with this. I think it misses the point of what
> the madvise is actually for.
> 
> The workingset is defined based on access frequency and available
> memory. Thrashing is defined as having to read pages back shortly
> after their eviction.
> 
> MADV_PAGEOUT is for the application to inform the kernel that it's
> done accessing the pages, so that the kernel can accelerate their
> eviction over other pages that may still be in use. This is ultimately
> meant to REDUCE reclaim and paging.
> 
> However, in this case, the MADVISE_PAGEOUT evicts pages that are
> reused after and then refault. It INCREASED reclaim and paging.
> 
I agree here...
> Surely that's a problem? And the system would have behaved better
> without the madvise() in the first place?
> 
Yes, the system would have behaved much better without this PAGEOUT
operation...

> In fact, I would argue that the pressure spike is a great signal for
> detecting overzealous madvising. If you're redefining the workingset
> from access frequency to "whatever the user is saying", that will take
> away an important mechanism to detect advise bugs and unnecessary IO.

The client currently wants to do the PAGEOUT operation, but what it lacks
is the information on whether it is really operating on workingset pages.
Had the client known that it was operating on workingset pages, it could
have backed off from madvising.

I now see that I shouldn't be defining the workingset as "whatever the
user is saying". But then, IMO, there should be a way for the kernel to
tell the user that his madvise operation is being performed on workingset
pages.

One way the user can do this is by monitoring the PSI events while PAGEOUT
is being performed, so that he can exclude those VMAs the next time.
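
To make the monitoring idea concrete, here is a rough userspace sketch, not a tested implementation. It reads the "total=" stall time (in microseconds) from /proc/pressure/memory, which the caller could sample before madvise(MADV_PAGEOUT) and again after the app has run for a while. The helper names (psi_parse_total, psi_read_total, psi_should_skip_vma) and the threshold policy are made up for illustration. Note also that PSI is system-wide (or per-cgroup), not per-VMA, so the delta is only a hint that the paged-out range was workingset.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse the "total=" stall time (microseconds) from one line of
 * /proc/pressure/memory, for example:
 *   some avg10=0.00 avg60=0.00 avg300=0.00 total=12345
 * kind is "some" or "full". Returns 0 on success, -1 otherwise.
 */
static int psi_parse_total(const char *line, const char *kind,
                           unsigned long long *total)
{
    size_t n = strlen(kind);
    const char *p;

    /* The line must start with the requested kind followed by a space. */
    if (strncmp(line, kind, n) != 0 || line[n] != ' ')
        return -1;
    p = strstr(line + n, "total=");
    if (!p)
        return -1;
    return sscanf(p, "total=%llu", total) == 1 ? 0 : -1;
}

/* Read the current stall total for kind from a PSI file at path. */
static int psi_read_total(const char *path, const char *kind,
                          unsigned long long *total)
{
    char line[256];
    FILE *f = fopen(path, "r");
    int ret = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (psi_parse_total(line, kind, total) == 0) {
            ret = 0;
            break;
        }
    }
    fclose(f);
    return ret;
}

/*
 * Hypothetical policy: if the stall time grew by more than threshold_us
 * between the two samples, assume the paged-out range was workingset and
 * skip it on the next madvise pass.
 */
static int psi_should_skip_vma(unsigned long long before_us,
                               unsigned long long after_us,
                               unsigned long long threshold_us)
{
    return after_us > before_us && after_us - before_us > threshold_us;
}
```

The caller would sample psi_read_total("/proc/pressure/memory", "some", &before), issue the madvise, sample again later, and feed both values to psi_should_skip_vma() with a threshold of its choosing.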

Alternatively, the kernel itself could support this, maybe through
something like MADV_PAGEOUT_INACTIVE, which doesn't page out the
workingset pages.

Please let me know your opinion about this interface.

This has a use case on Android, which simply assumes that the 2nd
background app is unlikely to be used in the near future and therefore
reclaims that app's pages. It works well most of the time, but the
assumption goes wrong in the use case I had mentioned.

--Thanks.
