Message-ID: <YJOVZlFGcSG+mmIk@dhcp22.suse.cz>
Date: Thu, 6 May 2021 09:06:14 +0200
From: Michal Hocko <mhocko@...e.com>
To: Aili Yao <yaoaili@...gsoft.com>
Cc: David Hildenbrand <david@...hat.com>, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Alexey Dobriyan <adobriyan@...il.com>,
Mike Rapoport <rppt@...nel.org>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Oscar Salvador <osalvador@...e.de>,
Roman Gushchin <guro@...com>,
Alex Shi <alex.shi@...ux.alibaba.com>,
Steven Price <steven.price@....com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Jiri Bohac <jbohac@...e.cz>,
"K. Y. Srinivasan" <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Wei Liu <wei.liu@...nel.org>,
Naoya Horiguchi <naoya.horiguchi@....com>,
linux-hyperv@...r.kernel.org,
virtualization@...ts.linux-foundation.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
yaoaili126@...il.com
Subject: Re: [PATCH v1 3/7] mm: rename and move page_is_poisoned()
On Thu 06-05-21 08:56:11, Aili Yao wrote:
> On Wed, 5 May 2021 15:27:39 +0200
> Michal Hocko <mhocko@...e.com> wrote:
>
> > On Wed 05-05-21 15:17:53, David Hildenbrand wrote:
> > > On 05.05.21 15:13, Michal Hocko wrote:
> > > > On Thu 29-04-21 14:25:15, David Hildenbrand wrote:
> > > > > Commit d3378e86d182 ("mm/gup: check page posion status for coredump.")
> > > > > introduced page_is_poisoned(), however, v5 [1] of the patch used
> > > > > "page_is_hwpoison()" and something went wrong while upstreaming. Rename the
> > > > > function and move it to page-flags.h, from where it can be used in other
> > > > > -- kcore -- context.
> > > > >
> > > > > Move the comment to the place where it belongs and simplify.
> > > > >
> > > > > [1] https://lkml.kernel.org/r/20210322193318.377c9ce9@alex-virtual-machine
> > > > >
> > > > > Signed-off-by: David Hildenbrand <david@...hat.com>
> > > >
> > > > I do agree that being explicit about hwpoison is much better. A poisoned
> > > > page can also be an uninitialized one and I believe this is the reason why
> > > > you are bringing that up.
> > >
> > > I'm bringing it up because I want to reuse that function as stated above :)
> > >
> > > >
> > > > But you've made me look at d3378e86d182 and I am wondering whether this
> > > > is really a valid patch. First of all it can leak a reference count
> > > > AFAICS. Moreover it doesn't really fix anything because the page can be
> > > > marked hwpoison right after the check is done. I do not think the race
> > > > can feasibly be closed. So shouldn't we rather revert it?
> > >
> > > I am not sure if we really care about races here that much? I mean,
> > > essentially we are racing with HW breaking asynchronously. Just because we
> > > would be synchronizing with SetPageHWPoison() wouldn't mean we can stop HW
> > > from breaking.
> >
> > Right
> >
> > > Long story short, this should be good enough for the cases we actually can
> > > handle? What am I missing?
> >
> > I am not sure I follow. My point is that I fail to see any added value
> > of the check as it doesn't prevent the race (it fundamentally cannot as
> > the page can be poisoned at any time) but the failure path doesn't
> > put_page which is incorrect even for hwpoison pages.
>
> Sorry, I have something to say:
>
> I noticed the ref count leak in the previous discussion, but I don't think
> it really matters. In the memory recovery case for user pages, we keep one
> reference to the poisoned page so that the error page will not be freed to the
> buddy allocator, which can be checked in the memory_failure() function.
So what would happen if those pages are hwpoisoned from userspace rather
than by HW, and repeatedly so?
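
To make the concern concrete, here is a minimal userspace sketch of a
failure path that takes a reference and bails out without dropping it.
It is a toy model only, not the kernel code: struct fake_page,
lookup_leaky(), lookup_fixed() and refcount_after() are hypothetical
names made up for this illustration, and the plain int counter merely
stands in for the real page reference count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: a refcount and a hwpoison flag. */
struct fake_page {
	int refcount;
	bool hwpoison;
};

/* Takes a reference, then fails on hwpoison WITHOUT dropping it. */
static struct fake_page *lookup_leaky(struct fake_page *p)
{
	p->refcount++;			/* reference taken by the lookup */
	if (p->hwpoison)
		return NULL;		/* leak: no matching put on this path */
	return p;
}

/* Same lookup, but the failure path drops the reference it took. */
static struct fake_page *lookup_fixed(struct fake_page *p)
{
	p->refcount++;
	if (p->hwpoison) {
		p->refcount--;		/* models the missing put_page() */
		return NULL;
	}
	return p;
}

/*
 * Run n lookups against an already-poisoned page that starts with one
 * reference (assumed here to model the one kept for memory recovery),
 * and return the final refcount.
 */
static int refcount_after(struct fake_page *(*lookup)(struct fake_page *),
			  int n)
{
	struct fake_page page = { .refcount = 1, .hwpoison = true };

	assert(n >= 0);
	for (int i = 0; i < n; i++) {
		struct fake_page *got = lookup(&page);

		if (got)
			got->refcount--;	/* caller drops its reference */
	}
	return page.refcount;
}
```

In this model refcount_after(lookup_leaky, 1000) returns 1001 while
refcount_after(lookup_fixed, 1000) returns 1, i.e. every failed lookup
in the leaky variant adds one reference that nothing ever drops.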
--
Michal Hocko
SUSE Labs