[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAM_iQpUY_-itn2oRQ-CEtOxZ-u1sKxCfbYWgPS1v6NTkT-NbSA@mail.gmail.com>
Date: Wed, 5 Sep 2018 13:32:54 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Vlad Buslov <vladbu@...lanox.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
Jamal Hadi Salim <jhs@...atatu.com>,
Jiri Pirko <jiri@...nulli.us>,
David Miller <davem@...emloft.net>
Subject: Re: [PATCH net-next] net: sched: change tcf_del_walker() to use
concurrent-safe delete
On Wed, Sep 5, 2018 at 12:05 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>
>
> On Tue 04 Sep 2018 at 22:41, Cong Wang <xiyou.wangcong@...il.com> wrote:
> > On Mon, Sep 3, 2018 at 1:33 PM Vlad Buslov <vladbu@...lanox.com> wrote:
> >>
> >>
> >> On Mon 03 Sep 2018 at 18:50, Cong Wang <xiyou.wangcong@...il.com> wrote:
> >> > On Mon, Sep 3, 2018 at 12:06 AM Vlad Buslov <vladbu@...lanox.com> wrote:
> >> >>
> >> >> Action API was changed to work with actions and action_idr in concurrency
> >> >> safe manner, however tcf_del_walker() still uses actions without taking
> >> >> reference to them first and deletes them directly, disregarding possible
> >> >> concurrent delete.
> >> >>
> >> >> Change tcf_del_walker() to use tcf_idr_delete_index() that doesn't require
> >> >> caller to hold reference to action and accepts action id as argument,
> >> >> instead of direct action pointer.
> >> >
> >> > Hmm, why doesn't tcf_del_walker() just take idrinfo->lock? At least
> >> > tcf_dump_walker() already does.
> >>
> >> Because tcf_del_walker() calls __tcf_idr_release(), which take
> >> idrinfo->lock itself (deadlock). It also calls sleeping functions like
> >
> > Deadlock can be easily resolved by moving the lock out.
> >
> >
> >> tcf_action_goto_chain_fini(), so just implementing function that
> >> releases action without taking idrinfo->lock is not enough.
> >
> > Sleeping can be resolved either by making it atomic or
> > deferring it to a work queue.
> >
> > None of your arguments here is a blocker to locking
> > idrinfo->lock. You really should focus on if it is really
> > necessary to lock idrinfo->lock in tcf_del_walker(), rather
> > than these details.
> >
> > For me, if you need idrinfo->lock for dump walker, you must
> > need it for delete walker too, because deletion is a writer
> > which should require stronger protection than the dumper,
> > which merely a reader.
>
> I don't get how it is necessary. Dump walker uses pointers to actions
> directly, and in order to be concurrency-safe it must either hold the
It uses the pointer in a read-only way, what you said doesn't change
the fact that it is a reader. And, like other readers, it may not need
to lock at all, which is a different topic.
> lock or obtain reference to action. Note that del walker doesn't use the
> action pointer, it only passed action id to tcf_idr_delete_index()
> function, which does all the necessary locking and can deal with
> potential concurrency issues (concurrent delete, etc.). This approach
> also benefits from code reuse from other code paths that delete actions,
> instead of implementing its own.
Look at the difference below.
With your change:
idr_for_each_entry_ul{
spin_lock(&idrinfo->lock);
idr_remove();
spin_unlock(&idrinfo->lock);
}
With what I suggest:
spin_lock(&idrinfo->lock);
idr_for_each_entry_ul{
idr_remove();
}
spin_unlock(&idrinfo->lock);
Isn't a concurrent tcf_idr_check_alloc() able to livelock here with
your change?
idr_for_each_entry_ul{
spin_lock(&idrinfo->lock);
idr_remove();
spin_unlock(&idrinfo->lock);
// tcf_idr_check_alloc() jumps in,
// allocates next ID which can be found
// by idr_get_next_ul()
} // the whole loop goes _literately_ infinite...
Also, idr_for_each_entry_ul() is supposed to be protected either
by RCU or idrinfo->lock, no? With your change or without any change,
it doesn't even have any lock after removing RTNL?
Powered by blists - more mailing lists