lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 10 Apr 2024 10:10:37 +0200
From: David Hildenbrand <david@...hat.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 linux-doc@...r.kernel.org, cgroups@...r.kernel.org,
 linux-sh@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 linux-fsdevel@...r.kernel.org, Andrew Morton <akpm@...ux-foundation.org>,
 Peter Xu <peterx@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
 Yin Fengwei <fengwei.yin@...el.com>, Yang Shi <shy828301@...il.com>,
 Zi Yan <ziy@...dia.com>, Jonathan Corbet <corbet@....net>,
 Hugh Dickins <hughd@...gle.com>, Yoshinori Sato
 <ysato@...rs.sourceforge.jp>, Rich Felker <dalias@...c.org>,
 John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>,
 Chris Zankel <chris@...kel.net>, Max Filippov <jcmvbkbc@...il.com>,
 Muchun Song <muchun.song@...ux.dev>, Miaohe Lin <linmiaohe@...wei.com>,
 Naoya Horiguchi <naoya.horiguchi@....com>,
 Richard Chang <richardycc@...gle.com>
Subject: Re: [PATCH v1 01/18] mm: allow for detecting underflows with
 page_mapcount() again

On 09.04.24 23:42, Matthew Wilcox wrote:
> On Tue, Apr 09, 2024 at 09:22:44PM +0200, David Hildenbrand wrote:
>> Commit 53277bcf126d ("mm: support page_mapcount() on page_has_type()
>> pages") made it impossible to detect mapcount underflows by treating
>> any negative raw mapcount value as a mapcount of 0.
> 
> Yes, but I don't think this is the right place to check for underflow.
> We should be checking for that on modification, not on read.

While I don't disagree (and we'd check more instances that way, for example
deferred rmap removal), that requires a bit more churn and figuring out of
if losing some information we would have printed in print_bad_pte() is worth
that change.

> I think
> it's more important for page_mapcount() to be fast than a debugging aid.

I really don't think page_mapcount() is a good use of time for
micro-optimizations, but let's investigate:

A big hunk of code in page_mapcount() seems to be the compound handling.
The code before that (reading mapcount, checking for the condition,
conditionally setting it to 0), would generate right now:

  177:	8b 42 30             	mov    0x30(%rdx),%eax
  17a:   b9 00 00 00 00          mov    $0x0,%ecx
  17f:	83 c0 01             	add    $0x1,%eax
  182:	0f 48 c1             	cmovs  %ecx,%eax

My variant is longer:

  17b:	8b 4a 30             	mov    0x30(%rdx),%ecx
  17e:	81 f9 7f ff ff ff    	cmp    $0xffffff7f,%ecx
  184:	8d 41 01             	lea    0x1(%rcx),%eax
  187:	b9 00 00 00 00       	mov    $0x0,%ecx
  18c:	0f 4e c1             	cmovle %ecx,%eax
  18f:	48 8b 0a             	mov    (%rdx),%rcx

The compiler does not seem to do the smart thing, which would
be rearranging the code to effectively be:

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ef34cf54c14f..7392596882ae 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1232,7 +1232,7 @@ static inline int page_mapcount(struct page *page)
         int mapcount = atomic_read(&page->_mapcount) + 1;
  
         /* Handle page_has_type() pages */
-       if (mapcount < 0)
+       if (mapcount < PAGE_MAPCOUNT_RESERVE + 1)
                 mapcount = 0;
         if (unlikely(PageCompound(page)))
                 mapcount += folio_entire_mapcount(page_folio(page));


Which would result in:

  177:   8b 42 30                mov    0x30(%rdx),%eax
  17a:   31 c9                   xor    %ecx,%ecx
  17c:   83 c0 01                add    $0x1,%eax
  17f:   83 f8 80                cmp    $0xffffff80,%eax
  182:   0f 4e c1                cmovle %ecx,%eax


Same code length, one more instruction. No jumps.


I can switch to the above (essentially inlining
page_type_has_type()) for now and look into different sanity checks --
and extending the documentation around page_mapcount() behavior for
underflows -- separately.

.. unless you insist that we really have to change that immediately.

Thanks!

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ