Message-ID: <20150827190138.GG4029@linux.vnet.ibm.com>
Date: Thu, 27 Aug 2015 12:01:38 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Michal Hocko <mhocko@...nel.org>
Cc: Hugh Dickins <hughd@...gle.com>, Vlastimil Babka <vbabka@...e.cz>,
"Kirill A. Shutemov" <kirill@...temov.name>,
Andrew Morton <akpm@...ux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Dave Hansen <dave.hansen@...el.com>,
Johannes Weiner <hannes@...xchg.org>,
David Rientjes <rientjes@...gle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Christoph Lameter <cl@...ux.com>
Subject: Re: [PATCHv3 4/5] mm: make compound_head() robust

On Thu, Aug 27, 2015 at 08:14:35PM +0200, Michal Hocko wrote:
> On Thu 27-08-15 09:36:34, Paul E. McKenney wrote:
> > On Thu, Aug 27, 2015 at 05:09:17PM +0200, Michal Hocko wrote:
> > > On Wed 26-08-15 14:29:16, Paul E. McKenney wrote:
> > > > On Wed, Aug 26, 2015 at 11:18:45AM -0700, Hugh Dickins wrote:
> > > [...]
> > > > > But if you do one day implement that, wouldn't sl?b.c have to use
> > > > > call_rcu_with_added_meaning() instead of call_rcu(), to be in danger
> > > > > of getting that bit set? (No rcu_head is placed in a PageTail page.)
> > > >
> > > > Good point, call_rcu_lazy(), but yes.
> > > >
> > > > > So although it might be a little strange not to use a variant intended
> > > > > for freeing memory when indeed that's what it's doing, it would not be
> > > > > the end of the world for SLAB_DESTROY_BY_RCU to carry on using straight
> > > > > call_rcu(), in defence of the struct page safety Kirill is proposing.
> > > >
> > > > As long as you are OK with the bottom bit being zero throughout the RCU
> > > > processing, yes.
> > >
> > > I am really not sure I understand. What will prevent
> > > call_rcu(&page->rcu_head, free_page_rcu) done in a random driver?
> >
> > As long as it uses call_rcu(), call_rcu_bh(), call_rcu_sched(),
> > or call_srcu() and not some future call_rcu_lazy(), no problem.
> >
> > But yes, if you are going to assume that RCU leaves the bottom
> > bit of the rcu_head structure's ->next field zero, then everything
> > everywhere in the kernel might in the future need to be careful of
> > exactly what variant of call_rcu() is used.
>
> OK, so it would be call_rcu_$special to use the bit. This wasn't entirely
> clear to me. I thought it would be the opposite.

Yes. And I cannot resist adding that the need to avoid
call_rcu_$special() would be with respect to a given rcu_head structure,
not global. Though I believe that you already figured that out. ;-)

> > > Cannot the RCU simply claim bit1? I can see 1146edcbef37 ("rcu: Loosen
> > > __call_rcu()'s rcu_head alignment constraint") but AFAIU all it would
> > > take to fix this would be to require struct rcu_head to be aligned to
> > > 32b no?
> >
> > There are some architectures that guarantee only 16-bit alignment.
> > If those architectures are fixed to do 32-bit alignment, or if support
> > for them is dropped, then the future restrictions mentioned above could
> > be dropped.
>
> My understanding of the discussion which led to the above patch is that
> m68k allows for 32b alignment; you just have to be explicit about that
> (http://thread.gmane.org/gmane.linux.ports.m68k/5932/focus=5960). Which
> other archs would be affected?
>
> I mean, this patch allows for quite some simplification in the mm code.
> And I think that RCU can live with mm of the low bits without any
> issues. You've said that one bit should be sufficient for the RCU use
> case. So having 2 bits sounds like a good thing.

As long as MM doesn't use call_rcu_$special() for the rcu_head structure
in question, as long as MM is OK with the bottom bit of ->next always
being zero during a grace period, and as long as MM avoids writing
to ->next during a grace period, we should be good as is, even if a
call_rcu_$special() becomes necessary.
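
To make the three conditions above concrete, here is a rough userspace
sketch of the overlay being discussed. It is not kernel code: the
*_sketch types and the enqueue_plain()/enqueue_special() helpers are
hypothetical stand-ins invented for this illustration, and
enqueue_special() only models what a future call_rcu_$special() might
do if it claimed the bottom bit of ->next.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct rcu_head (illustrative only). */
struct rcu_head_sketch {
	struct rcu_head_sketch *next;
	void (*func)(struct rcu_head_sketch *head);
};

/* Hypothetical page-like object whose first word has two users. */
struct page_sketch {
	union {
		uintptr_t compound_head;	/* bit 0 set => tail page */
		struct rcu_head_sketch rcu;	/* ->next overlays the word */
	};
};

/* Even 16-bit alignment is enough to keep bit 0 of a pointer clear. */
_Static_assert(_Alignof(struct rcu_head_sketch) >= 2,
	       "bit 0 of an rcu_head pointer must be zero");

/* Mark @tail as a tail page by storing the head pointer with bit 0 set. */
static void set_tail(struct page_sketch *tail, struct page_sketch *head)
{
	assert(((uintptr_t)head & 1) == 0);
	tail->compound_head = (uintptr_t)head | 1;
}

/* A page is a tail page if and only if bit 0 of the shared word is set. */
static int page_is_tail(const struct page_sketch *page)
{
	return (page->compound_head & 1) != 0;
}

/* Return the head page for a tail page, or the page itself otherwise. */
static struct page_sketch *compound_head_sketch(struct page_sketch *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct page_sketch *)(head - 1);
	return page;
}

/* Models plain call_rcu(): ->next holds an aligned pointer, bit 0 clear. */
static void enqueue_plain(struct rcu_head_sketch *head,
			  struct rcu_head_sketch *next)
{
	head->next = next;
}

/* Models a hypothetical call_rcu_$special() that claims bit 0 of ->next. */
static void enqueue_special(struct rcu_head_sketch *head,
			    struct rcu_head_sketch *next)
{
	head->next = (struct rcu_head_sketch *)((uintptr_t)next | 1);
}
```

Under these assumptions, a page queued with enqueue_plain() never looks
like a tail page, while the same page queued with enqueue_special()
would be misread as one. That is why the restriction applies only to
the rcu_head structures that MM overlays in this way.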

Thanx, Paul