linux-kernel - Re: kernel panic on null pointer on page->mem

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20170821132345.GK25956@dhcp22.suse.cz>
Date:   Mon, 21 Aug 2017 15:23:45 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Brad Bolen <bradleybolen@...il.com>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>, linux-mm@...ck.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: kernel panic on null pointer on page->mem_cgroup

On Mon 21-08-17 09:02:18, Johannes Weiner wrote:
> On Thu, Aug 10, 2017 at 01:56:05PM +0200, Michal Hocko wrote:
> > On Wed 09-08-17 14:38:25, Johannes Weiner wrote:
> > > The issue is that writeback doesn't hold a page reference and the page
> > > might get freed after PG_writeback is cleared (and the mapping is
> > > unlocked) in test_clear_page_writeback(). The stat functions looking
> > > up the page's node or zone are safe, as those attributes are static
> > > across allocation and free cycles. But page->mem_cgroup is not, and it
> > > will get cleared if we race with truncation or migration.
> > 
> > Is there anything that prevents us from holding a reference on a page
> > under writeback?
> 
> Hm, I'm hesitant to add redundant life-time management to the page
> there just for memcg, which is not always configured in.
> 
> Pinning the memcg instead is slightly more complex, but IMO has the
> complexity in a preferrable place.

If that is the single place that needs such a special handling and it is
very likely to stay that way then the additional complexity is probably
justified. I am just worried that this is really subtle and history
tells us that such a code usually kicks us back later.
 
> Would you agree?

Well, I was not objecting to the patch. It seems correct I am just
worried a robust fix would be preferable. And a clear object life time
sounds like a more robust thing to do.
-- 
Michal Hocko
SUSE Labs