Message-ID: <20191004163732.GA253167@google.com>
Date: Fri, 4 Oct 2019 12:37:32 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: Oleg Nesterov <oleg@...hat.com>
Cc: linux-kernel@...r.kernel.org, bristot@...hat.com,
peterz@...radead.org, paulmck@...nel.org, rcu@...r.kernel.org,
Josh Triplett <josh@...htriplett.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
"Paul E. McKenney" <paulmck@...ux.ibm.com>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH] Remove GP_REPLAY state from rcu_sync
Hi Oleg,
On Fri, Oct 04, 2019 at 05:41:03PM +0200, Oleg Nesterov wrote:
> On 10/04, Joel Fernandes (Google) wrote:
> >
> > But this is not always true if you consider the following events:
>
> I'm afraid I missed your point, but...
>
> > ---------------------->
> > GP num 111111 22222222222222222222222222222222233333333
> > GP state i e p x r rx i
> > CPU0 : rse rsx
> > CPU1 : rse rsx
> > CPU2 : rse rsx
> >
> > Here, we had 3 grace periods that elapsed, 1 for the rcu_sync_enter(),
> > and 2 for the rcu_sync_exit(s).
>
> But this is fine?
>
> We only need to ensure that we have a full GP pass between the "last"
> rcu_sync_exit() and GP_XXX -> GP_IDLE transition.
>
> > However, we had 3 rcu_sync_exit()s, not 2. In other words, the
> > rcu_sync_exit() got batched.
> >
> > So my point here is, rcu_sync_exit() does not really always cause a new
> > GP to happen
>
> See above, it should not.
Ok, I understand now. The point is to wait for a full GP, not necessarily
start a new one on each exit.
> > Then what is the point of the GP_REPLAY state at all if it does not
> > always wait for a new GP?
>
> Again, I don't understand... GP_REPLAY ensures that we will have a full GP
> before rcu_sync_func() sets GP_IDLE, note that it does another "recursive"
> call_rcu() if it sees GP_REPLAY.
Ok, got it.
> > Taking a step back, why did we intend to wait for a new GP if another
> > rcu_sync_exit() comes while one is still in progress?
>
> To ensure that if another CPU sees rcu_sync_is_idle() (GP_IDLE) after you
> do rcu_sync_exit(), then it must also see all memory changes you did before
> rcu_sync_exit().
Would this not be better implemented using memory barriers than by starting
new grace periods just for memory ordering? A memory barrier is lighter than
having to go through a grace period. So something like: if the state is
already GP_EXIT, then rcu_sync_exit() issues a memory barrier instead of
replaying; but if the state is GP_PASSED, then wait for a grace period. Or do
you see a situation where this will not work?
thanks,
- Joel