Message-ID: <20210506152048.2baefb05@alex-virtual-machine>
Date: Thu, 6 May 2021 15:20:48 +0800
From: Aili Yao <yaoaili@...gsoft.com>
To: Michal Hocko <mhocko@...e.com>
CC: Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>, <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>, <yaoaili126@...il.com>
Subject: Re: [PATCH] Revert "mm/gup: check page posion status for coredump."
On Thu, 6 May 2021 09:02:50 +0200
Michal Hocko <mhocko@...e.com> wrote:
> On Thu 06-05-21 13:47:50, Aili Yao wrote:
> > On Wed, 5 May 2021 15:54:07 +0200
> > Michal Hocko <mhocko@...nel.org> wrote:
> >
> > > From: Michal Hocko <mhocko@...e.com>
> > >
> > > While reviewing http://lkml.kernel.org/r/20210429122519.15183-4-david@redhat.com
> > > I came across d3378e86d182 ("mm/gup: check page posion status for
> > > coredump.") and noticed that this patch is broken in two ways. First, it
> > > doesn't really prevent hwpoison pages from being dumped, because hwpoison
> > > pages can be marked asynchronously at any time after the check.
> >
> > I have rethought this:
> > There are two cases for this coredump panic issue.
> > One is the scenario where the hwpoison flag is set correctly; the previous patch
> > makes it recoverable and avoids the panic.
> >
> > The other is where the hwpoison flag is not yet valid at the time of the check, perhaps due to a
> > race condition. I don't think this case is worth covering, or can realistically be covered, since
> > an SRAR can occur freshly during the dump process and thus can't be detected.
> >
> > The previous patch doesn't make this second case any worse or unacceptable; it simply can't be
> > covered.
> >
> > So, regarding the patch:
> > In most cases in this topic, the patch will work. In the case where the hwpoison flag is not valid,
> > it falls back to the original behavior before this patch: just panic.
>
> Please propose a new fix which a) doesn't leak a page reference b)
> evaluates how realistic the scenario is

Got it, thanks. I will dig into it and try to fix the leak. I will also add more comments on the
scenario in which the issue is triggered.

> c) explains why any other gup
> user doesn't really need to care - or in other words is the gup layer
> really suitable for this issue?

For the SIGBUS coredump case, we call the gup module to dump pages. For the normal hwpoison case, the gup
module checks the pte entry for hwpoison, and this issue is a different hwpoison case. So it may be easy to
fix this issue in the gup module.

Thanks!
Aili Yao