Date:	Wed, 5 Aug 2015 21:15:57 -0700 (PDT)
From:	Hugh Dickins <hughd@...gle.com>
To:	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	David Rientjes <rientjes@...gle.com>,
	Hugh Dickins <hughd@...gle.com>,
	Dave Hansen <dave.hansen@...el.com>,
	Mel Gorman <mgorman@...e.de>, Rik van Riel <riel@...hat.com>,
	Vlastimil Babka <vbabka@...e.cz>,
	Christoph Lameter <cl@...two.org>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Steve Capper <steve.capper@...aro.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Michal Hocko <mhocko@...e.cz>,
	Jerome Marchand <jmarchan@...hat.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: page-flags behavior on compound pages: a worry

Hi Kirill,

I had a nasty thought this morning.

Andrew had prodded me gently to re-examine my concerns with your
page-flags rework in mmotm.  I still dislike the bloat (my mm/built-in.o
text goes up from 478513 to 490183 bytes on a non-DEBUG_VM build); but I
was hoping to set that aside, to let us move forward.

But looking into the bloat led me to what seems a more serious issue
with it.  I'd tacked a little function on to the end of mm/filemap.c:

bool page_is_locked(struct page *page)
{
	return !!PageLocked(page);
}

which came out as:

0000000000003a60 <page_is_locked>:
    3a60:	48 8b 07             	mov    (%rdi),%rax
    3a63:	55                   	push   %rbp
    3a64:	48 89 e5             	mov    %rsp,%rbp

[instructions above same as without your patches; those below added by them]

    3a67:	f6 c4 80             	test   $0x80,%ah
    3a6a:	74 10                	je     3a7c <page_is_locked+0x1c>
    3a6c:	48 8b 47 30          	mov    0x30(%rdi),%rax
    3a70:	48 8b 17             	mov    (%rdi),%rdx
    3a73:	80 e6 80             	and    $0x80,%dh
    3a76:	48 0f 44 c7          	cmove  %rdi,%rax
    3a7a:	eb 03                	jmp    3a7f <page_is_locked+0x1f>
    3a7c:	48 89 f8             	mov    %rdi,%rax
    3a7f:	48 8b 00             	mov    (%rax),%rax

[instructions above added by your patches; those below same as before]

    3a82:	5d                   	pop    %rbp
    3a83:	83 e0 01             	and    $0x1,%eax
    3a86:	c3                   	retq   

The "and $0x80,%dh" looked superfluous at first, but of course it isn't:
it's from the smp_rmb() in David's 668f9abbd433 "mm: close PageTail race"
(a later commit refactors compound_head() but doesn't change the story).

And it's that race, or a worse race of that kind, that now worries me.
Relying on smp_wmb() and smp_rmb() may be all that was needed in the
case that David was fixing; and (I dare not look at them to audit!)
all uses of compound_head() in our current v4.2-rc tree may well be
safe, for this or that contingent reason in each place that it's used.

But there is no locking within compound_head(page) to make it safe
everywhere, yet your page-flags rework is changing a large number
of PageWhatever()s and SetPageWhatever()s and ClearPageWhatever()s
now to do a hidden compound_head(page) beneath the covers.

To be more specific: if preemption, or an interrupt, or entry to SMM,
or whatever, delays this thread somewhere in that compound_head()
sequence of instructions, how can we be sure that the "head" returned
by compound_head() is good?  We know the page was PageTail just before
looking up page->first_page, and we know it was PageTail just after,
but we don't know that it was PageTail throughout, and we don't know
whether page->first_page is even a good page pointer, or something
else from the private/ptl/slab_cache union.
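The window described above can be modelled in userspace. This is only a sketch under stated assumptions, not the kernel source: struct page is cut down to the two fields that matter, the bit positions are read off the disassembly (0x80 in %ah is bit 15, and the final "and $0x1" is bit 0), an acquire fence stands in for smp_rmb(), and the delay hook, reuse_page() and race_returns_stale_head() are hypothetical names that play the part of "whatever else runs while this thread is delayed".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PG_LOCKED (1UL << 0)   /* "and $0x1,%eax" in the listing */
#define PG_TAIL   (1UL << 15)  /* "test $0x80,%ah" in the listing */

/* Cut-down model of struct page: flags plus the overlapping union. */
struct page {
	unsigned long flags;
	union {
		struct page *first_page;  /* meaningful only while PG_TAIL is set */
		uintptr_t private;        /* stands in for private/ptl/slab_cache */
	};
};

/* Hypothetical hook: runs at the points where the thread might be delayed. */
typedef void (*delay_hook)(struct page *page, int step);

/* Model of the check / load / barrier / recheck sequence. */
static struct page *compound_head_model(struct page *page, delay_hook delay)
{
	struct page *head;

	if (!(page->flags & PG_TAIL))
		return page;
	if (delay)
		delay(page, 1);            /* delayed after the first PageTail test */
	head = page->first_page;
	atomic_thread_fence(memory_order_acquire);  /* stands in for smp_rmb() */
	if (delay)
		delay(page, 2);            /* delayed before the second PageTail test */
	return (page->flags & PG_TAIL) ? head : page;
}

static struct page unrelated;  /* whatever the reused union happens to point at */
static struct page new_head;   /* head of the compound page the tail joins later */

/* Step 1: page stops being a tail and its union is reused.
 * Step 2: page becomes a tail again, of a different compound page. */
static void reuse_page(struct page *page, int step)
{
	if (step == 1) {
		page->flags &= ~PG_TAIL;
		page->private = (uintptr_t)&unrelated;
	} else {
		page->first_page = &new_head;
		page->flags |= PG_TAIL;
	}
}

/* True if the delayed lookup hands back a pointer that was never a valid head. */
static bool race_returns_stale_head(void)
{
	struct page old_head = { .flags = 0 };
	struct page tail = { .flags = PG_TAIL };
	struct page *head;

	tail.first_page = &old_head;
	assert(compound_head_model(&tail, NULL) == &old_head);  /* undisturbed case */

	head = compound_head_model(&tail, reuse_page);
	return head != &old_head && head != &new_head && head != &tail;
}
```

In this model both PageTail tests pass, yet the pointer loaded between them is neither the old head nor the new one; a flag test applied to it would read some unrelated object's flags word.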

Of course it would be very rare for it to go wrong; and most callsites
will obviously be safe for this or that reason; though, sadly, none of
them can be made safe by holding a reference to the tail page in
question, since its count is frozen at 0 and cannot be grabbed by
get_page_unless_zero().
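Why a frozen count of 0 rules out pinning the tail can be shown with a small userspace model of an "increment unless zero" primitive. This is a sketch using C11 atomics, not the kernel's implementation; get_page_unless_zero_model() is a hypothetical name, and a plain atomic_int stands in for the page's reference count.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the count is currently non-zero. */
static bool get_page_unless_zero_model(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return true;
	}
	return false;  /* count was 0: no reference taken, count unchanged */
}
```

With the count held at 0, as it is for a tail page, the call always fails and leaves the count untouched, so a caller has no way to hold the tail in place while it looks up the head.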

But I don't see how it can be safe to rely on compound_head() inside
a general purpose page-flag function, that we're all accustomed to
think of as a simple bitop, that can be applied without great care.

Hugh
