Message-ID: <20210506134750.17d2f6eb@alex-virtual-machine>
Date: Thu, 6 May 2021 13:47:50 +0800
From: Aili Yao <yaoaili@...gsoft.com>
To: Michal Hocko <mhocko@...nel.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>, <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
Michal Hocko <mhocko@...e.com>, <yaoaili126@...il.com>
Subject: Re: [PATCH] Revert "mm/gup: check page posion status for coredump."
On Wed, 5 May 2021 15:54:07 +0200
Michal Hocko <mhocko@...nel.org> wrote:
> From: Michal Hocko <mhocko@...e.com>
>
> While reviewing http://lkml.kernel.org/r/20210429122519.15183-4-david@redhat.com
> I have crossed d3378e86d182 ("mm/gup: check page posion status for
> coredump.") and noticed that this patch is broken in two ways. First it
> doesn't really prevent hwpoison pages from being dumped because hwpoison
> pages can be marked asynchornously at any time after the check.
I have thought about this again:
There are two cases for this coredump panic issue.
One is the scenario where the hwpoison flag is set correctly; the previous patch
makes this case recoverable and avoids the panic.
The other is where the hwpoison flag is not yet valid at the time of the check, perhaps
because of a race condition. I don't think this case is worth covering, or that covering it
is realizable, as the SRAR can happen freshly during the dump process and thus can't be detected.
And the previous patch doesn't make the second case any worse or unacceptable; it simply
can't cover it.
So here is my view of the patch:
For most cases in this topic, the patch will work. For the case where the hwpoison flag is not
valid, it will fall back to the original behavior before this patch: just panic.
And I don't think we need to consider the minor case since, as you have said, the poison can
happen at any time.
Thanks!
Aili Yao