linux-kernel - Re: [PATCH 7/7] RCU, workqueue: Implement rcu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20180307145408.GC3918@linux.vnet.ibm.com>
Date:   Wed, 7 Mar 2018 06:54:08 -0800
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Lai Jiangshan <jiangshanlai+lkml@...il.com>
Cc:     Tejun Heo <tj@...nel.org>, torvalds@...ux-foundation.org,
        jannh@...gle.com, bcrl@...ck.org, viro@...iv.linux.org.uk,
        kent.overstreet@...il.com, security@...nel.org,
        LKML <linux-kernel@...r.kernel.org>, kernel-team@...com
Subject: Re: [PATCH 7/7] RCU, workqueue: Implement rcu_work

On Wed, Mar 07, 2018 at 10:49:49AM +0800, Lai Jiangshan wrote:
> On Wed, Mar 7, 2018 at 1:33 AM, Tejun Heo <tj@...nel.org> wrote:
> 
> > +/**
> > + * queue_rcu_work_on - queue work on specific CPU after a RCU grace period
> > + * @cpu: CPU number to execute work on
> > + * @wq: workqueue to use
> > + * @rwork: work to queue
> 
> For many people, "RCU grace period" is clear enough, but not ALL.
> 
> So please make it a little more clear that it just queues work after
> a *Normal* RCU grace period. it supports only one RCU variant.
> 
> 
> > + *
> > + * Return: %false if @work was already on a queue, %true otherwise.
> > + */
> 
> I'm afraid this will be a hard-using API.
> 
> The user can't find a plan B when it returns false, especially when
> the user expects the work must be called at least once again
> after an RCU grace period.
> 
> And the error-prone part of it is that, like other queue_work() functions,
> the return value of it is often ignored and makes the problem worse.
> 
> So, since workqueue.c provides this API, it should handle this
> problem. For example, by calling call_rcu() again in this case, but
> everything will be much more complex: a synchronization is needed
> for "calling call_rcu() again" and allowing the work item called
> twice after the last queue_rcu_work() is not workqueue style.

I confess that I had not thought of this aspect, but doesn't the
rcu_barrier() in v2 of this patchset guarantee that it has passed
the RCU portion of the overall wait time?  Given that, what I am
missing is now this differs from flush_work() on a normal work item.

So could you please show me the sequence of events that leads to this
problem?  (Again, against v2 of this patch, which replaces the
synchronize_rcu() with rcu_barrier().)

> Some would argue that the delayed_work has the same problem
> when the user expects the work must be called at least once again
> after a period of time. But time interval is easy to detect, the user
> can check the time and call the queue_delayed_work() again
> when needed which is also a frequent design pattern. And
> for rcu, it is hard to use this design pattern since it is hard
> to detect (new) rcu grace period without using call_rcu().
> 
> I would not provide this API. it is not a NACK. I'm just trying
> expressing my thinking about the API. I'd rather RCU be changed
> and RCU callbacks are changed to be sleepable. But a complete
> overhaul cleanup on the whole source tree for compatibility
> is needed at first, an even more complex job.

One downside of allowing RCU callback functions to sleep is that
one poorly written callback can block a bunch of other ones.
One advantage of Tejun's approach is that such delays only affect
the workqueues, which are already designed to handle such delays.

						Thanx, Paul

> > +bool queue_rcu_work_on(int cpu, struct workqueue_struct *wq,
> > +                      struct rcu_work *rwork)
> > +{
> > +       struct work_struct *work = &rwork->work;
> > +
> > +       if (!test_and_set_bit(WORK_STRUCT_PENDING_BIT, work_data_bits(work))) {
> > +               rwork->wq = wq;
> > +               rwork->cpu = cpu;
> > +               call_rcu(&rwork->rcu, rcu_work_rcufn);
> > +               return true;
> > +       }
> > +
> > +       return false;
> > +}
> > +EXPORT_SYMBOL(queue_rcu_work_on);
> > +
>