Message-ID: <Y9e/8jtcDzSp9Ix2@dhcp22.suse.cz>
Date: Mon, 30 Jan 2023 14:02:42 +0100
From: Michal Hocko <mhocko@...e.com>
To: Kefeng Wang <wangkefeng.wang@...wei.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Tejun Heo <tj@...nel.org>, Jens Axboe <axboe@...nel.dk>,
Jan Kara <jack@...e.cz>, Shakeel Butt <shakeelb@...gle.com>,
Naoya Horiguchi <naoya.horiguchi@....com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Ma Wupeng <mawupeng1@...wei.com>, shy828301@...il.com
Subject: Re: [PATCH] mm: memcg: fix NULL pointer in
mem_cgroup_track_foreign_dirty()
On Mon 30-01-23 20:20:16, Kefeng Wang wrote:
>
>
> On 2023/1/30 16:48, Michal Hocko wrote:
> > On Mon 30-01-23 09:16:13, Kefeng Wang wrote:
> > >
> > >
> > > On 2023/1/30 5:48, Andrew Morton wrote:
> > > > On Sun, 29 Jan 2023 10:44:51 +0800 Kefeng Wang <wangkefeng.wang@...wei.com> wrote:
> > > >
> > > > > Since commit 18365225f044 ("hwpoison, memcg: forcibly uncharge LRU pages"),
> > > >
> > > > Merged in 2017.
> > > >
> > > > > hwpoison will forcibly uncharge an LRU hwpoisoned page, so folio_memcg
> > > > > could be NULL, and mem_cgroup_track_foreign_dirty_slowpath() could
> > > > > then hit a NULL pointer dereference. Do not record foreign
> > > > > writebacks for a folio whose memcg is NULL in
> > > > > mem_cgroup_track_foreign_dirty() to fix it.
> > > > >
> > > > > Reported-by: Ma Wupeng <mawupeng1@...wei.com>
> > > > > Fixes: 97b27821b485 ("writeback, memcg: Implement foreign dirty flushing")
> > > >
> > > > Merged in 2019.
> > > >
> ...
> >
> > Just to make sure I understand. The page has been hwpoisoned and uncharged
> > but stayed in the page cache, so the next page fault on that address has
> > blown up?
> >
> > Say we address the NULL memcg case. What is the resulting behavior?
> > Doesn't userspace access a poisoned page and get silent memory
> > corruption?
>
> + Yang Shi
>
> See the earlier link [1]; it seems to be a known issue, and Yang has a TODO
> list for storage-backed filesystems.
OK, so IIUC this patch will just keep the test from blowing up, but it
will not make the test behave consistently. From my past experience,
hwpoisoning is not really something that any production environment
should rely on to work properly.
But this patch is straightforward so no objection from me.
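For anyone following along, the guard under discussion can be modelled
in plain userspace C. This is only a sketch under stated assumptions: the
structs are simplified stand-ins rather than the kernel definitions, and
track_foreign_dirty(), track_foreign_dirty_slowpath() and slowpath_calls
are hypothetical names used for illustration, not the patch itself.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption, not the real layout). */
struct cgroup_subsys_state { int id; };
struct mem_cgroup { struct cgroup_subsys_state css; };
struct bdi_writeback { struct cgroup_subsys_state *memcg_css; };
struct folio { struct mem_cgroup *memcg; }; /* NULL once forcibly uncharged */

/* Hypothetical counter so the effect of the guard can be observed. */
int slowpath_calls;

struct mem_cgroup *folio_memcg(const struct folio *folio)
{
	return folio->memcg;
}

/* Stand-in for the slow path, which would dereference the folio's memcg. */
void track_foreign_dirty_slowpath(struct folio *folio, struct bdi_writeback *wb)
{
	(void)folio;
	(void)wb;
	slowpath_calls++;
}

/*
 * Record a foreign writeback only when the folio still has a memcg and
 * that memcg differs from the one owning the writeback structure.
 * An uncharged (memcg == NULL) folio is skipped entirely.
 */
void track_foreign_dirty(struct folio *folio, struct bdi_writeback *wb)
{
	struct mem_cgroup *memcg = folio_memcg(folio);

	if (memcg && &memcg->css != wb->memcg_css)
		track_foreign_dirty_slowpath(folio, wb);
}
```

With the NULL check in place, an uncharged folio never reaches the slow
path, which matches the "do not record" behaviour described in the
changelog. It says nothing about what happens to the poisoned page
itself.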
Acked-by: Michal Hocko <mhocko@...e.com>
Thanks!
> [1] https://lore.kernel.org/all/20211020210755.23964-6-shy828301@gmail.com/T/#m1d40559ca2dcf94396df5369214288f69dec379b
--
Michal Hocko
SUSE Labs