lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y3vI58VtjiAkorUX@cmpxchg.org>
Date:   Mon, 21 Nov 2022 13:52:23 -0500
From:   Johannes Weiner <hannes@...xchg.org>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Matthew Wilcox <willy@...radead.org>,
        David Hildenbrand <david@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>, Peter Xu <peterx@...hat.com>,
        Yang Shi <shy828301@...il.com>,
        John Hubbard <jhubbard@...dia.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Sidhartha Kumar <sidhartha.kumar@...cle.com>,
        Muchun Song <songmuchun@...edance.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
        Mina Almasry <almasrymina@...gle.com>,
        James Houghton <jthoughton@...gle.com>,
        Zach O'Keefe <zokeefe@...gle.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 0/3] mm,thp,rmap: rework the use of subpages_mapcount

On Mon, Nov 21, 2022 at 04:59:38PM +0000, Shakeel Butt wrote:
> On Fri, Nov 18, 2022 at 01:08:13AM -0800, Hugh Dickins wrote:
> > Linus was underwhelmed by the earlier compound mapcounts series:
> > this series builds on top of it (as in next-20221117) to follow
> > up on his suggestions - except rmap.c still using lock_page_memcg(),
> > since I hesitate to steal the pleasure of deletion from Johannes.
> 
> Is there a plan to remove lock_page_memcg() altogether which I missed? I
> am planning to make lock_page_memcg() a nop for cgroup-v2 (as it shows
> up in the perf profile on exit path) but if we are removing it then I
> should just wait.

We can remove it for rmap at least, but we might be able to do more.

Besides rmap, we're left with the dirty and writeback page transitions
that wrt cgroups need to be atomic with NR_FILE_DIRTY and NR_WRITEBACK.

Looking through the various callsites, I think we can delete it from
setting and clearing dirty state, as we always hold the page lock (or
the pte lock in some instances of folio_mark_dirty). Both of these are
taken from the cgroup side, so we're good there.

I think we can also remove it when setting writeback, because those
sites have the page locked as well.

That leaves clearing writeback. This can't hold the page lock due to
the atomic context, so currently we need to take lock_page_memcg() as
the lock of last resort.

I wonder if we can have cgroup take the xalock instead: writeback
ending on file pages always acquires the xarray lock. Swap writeback
currently doesn't, but we could make it so (swap_address_space).

The only thing that gives me pause is the !mapping check in
__folio_end_writeback. File and swapcache pages usually have mappings,
and truncation waits for writeback to finish before axing
page->mapping. So AFAICS this can only happen if we call end_writeback
on something that isn't under writeback - in which case the test_clear
will fail and we don't update the stats anyway. But I want to be sure.

Does anybody know from the top of their heads if a page under
writeback could be without a mapping in some weird cornercase?

If we could ensure that the NR_WRITEBACK decs are always protected by
the xalock, we could grab it from mem_cgroup_move_account(), and then
kill lock_page_memcg() altogether.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ