Message-ID: <CA+CK2bDHkOQbTrK=GbsGbojAj_6gaAX_8w3cBCd_LWqXt--yZA@mail.gmail.com>
Date: Wed, 26 Jan 2022 14:22:26 -0500
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: LKML <linux-kernel@...r.kernel.org>, linux-mm <linux-mm@...ck.org>,
linux-m68k@...ts.linux-m68k.org,
Anshuman Khandual <anshuman.khandual@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
william.kucharski@...cle.com,
Mike Kravetz <mike.kravetz@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
schmitzmic@...il.com, Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...hat.com>,
Johannes Weiner <hannes@...xchg.org>,
Roman Gushchin <guro@...com>,
Muchun Song <songmuchun@...edance.com>,
Wei Xu <weixugc@...gle.com>, Greg Thelen <gthelen@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Paul Turner <pjt@...gle.com>, Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH v3 1/9] mm: add overflow and underflow checks for page->_refcount

On Wed, Jan 26, 2022 at 1:59 PM Matthew Wilcox <willy@...radead.org> wrote:
>
> On Wed, Jan 26, 2022 at 06:34:21PM +0000, Pasha Tatashin wrote:
> > The problems with page->_refcount are hard to debug, because usually
> > when they are detected, the damage has occurred a long time ago. Yet,
> > the problems with invalid page refcount may be catastrophic and lead to
> > memory corruptions.
> >
> > Reduce the scope of when the _refcount problems manifest themselves by
> > adding checks for underflows and overflows into functions that modify
> > _refcount.
>
> If you're chasing a bug like this, presumably you turn on page
> tracepoints. So could we reduce the cost of this by putting the
> VM_BUG_ON_PAGE parts into __page_ref_mod() et al? Yes, we'd need to
> change the arguments to those functions to pass in old & new, but that
> should be a cheap change compared to embedding the VM_BUG_ON_PAGE.

This is not only about chasing a bug. It is also about preventing the
memory corruption and information leaks that ref_count bugs cause.

Several months ago a memory corruption bug was discovered by accident:
an engineer was studying a process core from a production system and
noticed that some memory did not look like it belonged to the original
process. We tried to reproduce the bug manually but failed. However,
later analysis by our team showed that the problem occurred due to a
ref_count bug in Linux, and the bug itself was root-caused and fixed
(mentioned in the cover letter). This work would have prevented
similar ref_count bugs from leading to memory corruption.
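
As a rough illustration of the kind of check being discussed, here is a
minimal userspace sketch. It is not the code from the patch: struct
page_sketch and both helper names are hypothetical, it uses C11 atomics
in place of the kernel's atomic_t, and it returns false where the kernel
would presumably trigger VM_BUG_ON_PAGE. The idea is to fetch the old
value as part of the atomic update and compare it with the new value, so
that a wrap is caught at the moment it happens instead of long after.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the counter matters here. */
struct page_sketch {
	atomic_uint refcount;
};

/*
 * Add nr references. Returns false if the counter wrapped around
 * (overflow), i.e. the new value is smaller than the old one.
 * Unsigned arithmetic is used so that the wrap is well defined in C.
 */
static bool ref_add_checked(struct page_sketch *p, unsigned int nr)
{
	unsigned int old_val = atomic_fetch_add(&p->refcount, nr);
	unsigned int new_val = old_val + nr;

	return new_val >= old_val;
}

/*
 * Drop nr references. Returns false if the counter went below zero
 * (underflow), i.e. the new value is larger than the old one.
 */
static bool ref_sub_checked(struct page_sketch *p, unsigned int nr)
{
	unsigned int old_val = atomic_fetch_sub(&p->refcount, nr);
	unsigned int new_val = old_val - nr;

	return new_val <= old_val;
}
```

With a check like this in every function that modifies the counter, an
extra put on a page whose count is already zero is reported right away,
before the page can be freed and reused while something still points at
it.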

Pasha