Message-ID: <bc4616002932b25973533c39c07f48ea57afa3dc.camel@redhat.com>
Date: Tue, 15 Nov 2022 19:57:10 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Jakub Kicinski <kuba@...nel.org>,
Hawkins Jiawei <yin31149@...il.com>
Cc: Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, 18801353760@....com,
syzbot+232ebdbd36706c965ebf@...kaller.appspotmail.com,
syzkaller-bugs@...glegroups.com,
Cong Wang <cong.wang@...edance.com>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] net: sched: fix memory leak in tcindex_set_parms
On Tue, 2022-11-15 at 09:02 -0800, Jakub Kicinski wrote:
> On Mon, 14 Nov 2022 01:05:08 +0800 Hawkins Jiawei wrote:
>
> > @@ -479,6 +480,7 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
> > }
> >
> > if (old_r && old_r != r) {
> > + old_e = old_r->exts;
> > err = tcindex_filter_result_init(old_r, cp, net);
> > if (err < 0) {
> > kfree(f);
> > @@ -510,6 +512,12 @@ tcindex_set_parms(struct net *net, struct tcf_proto *tp, unsigned long base,
> > tcf_exts_destroy(&new_filter_result.exts);
> > }
> >
> > + /* Note: old_e should be destroyed after the RCU grace period,
> > + * to avoid possible use-after-free by concurrent readers.
> > + */
> > + synchronize_rcu();
> > + tcf_exts_destroy(&old_e);
>
> I don't think this dance is required, @cp is a copy of the original
> data, and the original (@p) is destroyed in a safe manner below.
This code confuses me more than a bit, and I don't follow. It looks
like, at this point:
* the data path could access 'old_r->exts' contents via 'p' just before
the previous 'tcindex_filter_result_init(old_r, cp, net);' but still
potentially within the same RCU grace period
* 'tcindex_filter_result_init(old_r, cp, net);' has 'unlinked' the old
exts from 'p', so they will not be freed by the later
tcindex_partial_destroy_work()
Overall it looks to me that we need to somehow wait for the RCU
grace period.
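As an illustration only, the two points above can be modelled in plain
user-space C. This is a sketch, not kernel code: every model_* name is
hypothetical, model_result_init() assumes that
tcindex_filter_result_init() wipes the result and then sets up fresh
exts, and model_wait_for_readers() is an empty placeholder for
synchronize_rcu(), since the model is single-threaded and has no
readers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct model_exts {
	int *actions;		/* heap data owned by the exts */
};

struct model_result {
	struct model_exts exts;
};

static int live_allocs;		/* outstanding 'actions' buffers */

static void model_exts_init(struct model_exts *e)
{
	e->actions = malloc(sizeof(*e->actions));
	assert(e->actions != NULL);
	live_allocs++;
}

static void model_exts_destroy(struct model_exts *e)
{
	if (e->actions) {
		free(e->actions);
		e->actions = NULL;
		live_allocs--;
	}
}

/* Assumed behaviour of tcindex_filter_result_init(): wipe the result,
 * then give it fresh exts. The old pointer is overwritten, not freed.
 */
static void model_result_init(struct model_result *r)
{
	memset(r, 0, sizeof(*r));
	model_exts_init(&r->exts);
}

/* Placeholder for synchronize_rcu(); this model has no readers. */
static void model_wait_for_readers(void)
{
}

/* Re-initialise a result and tear it down; returns the number of
 * buffers that were never freed.
 */
static int model_reinit(int save_old_exts)
{
	struct model_result r;
	struct model_exts old_e = { NULL };

	live_allocs = 0;
	model_exts_init(&r.exts);	/* exts a reader may still see */

	if (save_old_exts)
		old_e = r.exts;		/* keep the last reference */

	model_result_init(&r);		/* unlinks the old exts from r */

	model_wait_for_readers();	/* grace period would go here */
	model_exts_destroy(&old_e);	/* no-op if nothing was saved */

	model_exts_destroy(&r.exts);	/* later teardown sees only new exts */
	return live_allocs;
}
```

In this model, model_reinit(0) leaves one buffer unreachable, because
the later teardown only sees the new exts, while model_reinit(1) frees
everything. The model says nothing about when the free is safe; that is
exactly what the grace period wait is for.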
Somewhat side question: it looks like the 'perfect hashing' usage
is the root cause of the issue addressed here, and very likely is
afflicted by other problems, e.g. the data corruption in 'err =
tcindex_filter_result_init(old_r, cp, net);'.
AFAICS 'perfect hashing' usage is a sort of optimization that user
space may trigger with some combination of the tcindex arguments. I'm
wondering if we could drop all perfect hashing related code?
Paolo