Message-ID: <20180928181114.GA28797@splinter>
Date: Fri, 28 Sep 2018 21:11:14 +0300
From: Ido Schimmel <idosch@...sch.org>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
Jiri Pirko <jiri@...nulli.us>,
Jamal Hadi Salim <jhs@...atatu.com>,
Vlad Buslov <vladbu@...lanox.com>
Subject: Re: [Patch net-next v3] net_sched: change tcf_del_walker() to take
idrinfo->lock
On Fri, Sep 28, 2018 at 10:56:47AM -0700, Cong Wang wrote:
> On Fri, Sep 28, 2018 at 7:59 AM Ido Schimmel <idosch@...sch.org> wrote:
> >
> > On Wed, Sep 19, 2018 at 04:37:29PM -0700, Cong Wang wrote:
> > > From: Vlad Buslov <vladbu@...lanox.com>
> > >
> > > Action API was changed to work with actions and action_idr in concurrency
> > > safe manner, however tcf_del_walker() still uses actions without taking a
> > > reference or idrinfo->lock first, and deletes them directly, disregarding
> > > possible concurrent delete.
> > >
> > > Change tcf_del_walker() to take idrinfo->lock while iterating over actions
> > > and use new tcf_idr_release_unsafe() to release them while holding the
> > > lock.
> > >
> > > And the blocking function fl_hw_destroy_tmplt() could be called when we
> > > put a filter chain, so defer it to a work queue.
> >
> > I'm getting a use-after-free when running tc_chains.sh selftest and I
> > believe it's caused by this patch.
> >
> > To reproduce:
> > # cd tools/testing/selftests/net/forwarding
> > # export TESTS="template_filter_fits"; ./tc_chains.sh veth0 veth1
> >
> > __tcf_chain_put()
> >   tc_chain_tmplt_del()
> >     fl_tmplt_destroy()
> >       tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work)
> >   tcf_chain_destroy()
> >     kfree(chain)
> >
> > Some time later fl_tmplt_destroy_work() starts executing and
> > dereferencing 'chain'.
>
> Oops, forgot to hold the chain... I will test this:
>
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 92dd5071a708..cbb68d5515d6 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -1444,6 +1444,7 @@ static void fl_tmplt_destroy_work(struct work_struct *work)
> struct fl_flow_tmplt, rwork);
>
> fl_hw_destroy_tmplt(tmplt->chain, tmplt);
> + tcf_chain_put(tmplt->chain);
> kfree(tmplt);
> }
>
> @@ -1451,6 +1452,7 @@ static void fl_tmplt_destroy(void *tmplt_priv)
> {
> struct fl_flow_tmplt *tmplt = tmplt_priv;
>
> + tcf_chain_hold(tmplt->chain);
> tcf_queue_work(&tmplt->rwork, fl_tmplt_destroy_work);
> }
I don't think this will work, because the reference count has already
dropped to 0 by the time fl_tmplt_destroy() runs, which is why the
template deletion function was invoked in the first place. I haven't
tested the patch, but I don't see what would prevent the chain from
being freed.
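To illustrate the concern, here is a toy user-space model, not kernel code.
All toy_* names are made up for this sketch, and it assumes the put path
destroys the chain unconditionally once the count hits 0, as in the call
trace quoted above:

```c
#include <stdbool.h>

/* Toy model only: the toy_* names are hypothetical, not kernel APIs. */
struct toy_chain {
	unsigned int refcnt;
	bool freed;		/* stands in for kfree(chain) */
	bool work_pending;	/* stands in for the queued rwork */
};

static void toy_chain_hold(struct toy_chain *c)
{
	c->refcnt++;
}

/* Models fl_tmplt_destroy() with the proposed hold added. */
static void toy_tmplt_destroy(struct toy_chain *c)
{
	toy_chain_hold(c);	/* refcnt goes 0 -> 1, but too late */
	c->work_pending = true;
}

/* Models the put path: on the last put, delete the template and then
 * destroy the chain. Assumption: nothing re-checks refcnt in between. */
static void toy_chain_put(struct toy_chain *c)
{
	if (--c->refcnt == 0) {
		toy_tmplt_destroy(c);
		c->freed = true;
	}
}

/* True if the deferred work would dereference a freed chain. */
static bool toy_work_sees_freed_chain(const struct toy_chain *c)
{
	return c->work_pending && c->freed;
}
```

Starting from refcnt == 1, a single toy_chain_put() leaves the model with
refcnt == 1 and freed == true while the work is still pending, which is
the use-after-free scenario.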
Thanks for looking into this.