linux-kernel - Re: [PATCH mmotm] mm: filemap_unaccount

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <3251e132-35e0-8185-b528-faaa6f3c612@google.com>
Date:   Thu, 3 Mar 2022 21:26:15 -0800 (PST)
From:   Hugh Dickins <hughd@...gle.com>
To:     Matthew Wilcox <willy@...radead.org>
cc:     Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH mmotm] mm: filemap_unaccount_folio() large skip mapcount
 fixup

On Fri, 4 Mar 2022, Matthew Wilcox wrote:
> On Thu, Mar 03, 2022 at 08:21:19PM -0800, Hugh Dickins wrote:
> > The page_mapcount_reset() when folio_mapped() while mapping_exiting()
> > was devised long before there were huge or compound pages in the cache.
> > It is still valid for small pages, but not at all clear what's right to
> > check and reset on large pages.  Just don't try when folio_test_large().
> 
> Thanks for bringing this up!  I was really unsure about this chunk of code
> when converting unaccount_page_cache_page() to filemap_unaccount_folio().
> 
> Part of me wants to just delete the whole thing.

A part of me feels the same way.  Accepting to waste 2MB while
footling around over 4kB doesn't help my case for that code -
whose beauty is, umm, questionable.

> I'm unconvinced by
> the argument; surely it's better to leak memory than perhaps reuse a
> page which should not have been freed yet?

I know people who would agree with you.  And in most cases we do prefer
to waste it, because it has become suspicious.

But by far the most common reason for getting here, is that a page table
has got corrupted in some way, so the pte or pmd was never identified to
unmap the page.  Nothing suspect about the page itself, and (barring a
bug in vma unlinking) all the mappings which might hold the page have
been unmapped: we've just got a leftover in the mapcount.

> 
> Also, the code doesn't take into account that folio_mapped() is freaking
> expensive for THP (512 cache lines, blowing away 32kB of your L1 cache!),
> and we may as well calculate folio_mapcount() while we're doing it.

Well, we don't need to optimize the rare case.

As you know, I've long thought that there should be one quick total
mapcount field; but it's never been a priority to work out the details
on that (and Linus seems to be driving mapcount out of fashion; and
you have your own ideas there).

> 
> Do you see this report often on machines that don't have
> VM_BUG_ON_FOLIO() enabled?

No, not often at all.  Almost never.  I think it did come up occasionally
when that code was written, but seems to have faded away in recent years.
Cosmic rays getting weaker, or memory getting better, or fewer bugs perhaps
- I haven't looked back to check, may be wrong, but I think when I added
unmap_mapping_page() last year (now s/page/folio/), that was to fix one
real (but benign) cause of such warnings.

Hugh