linux-kernel - Re: [PATCH v2] net: sched: fix memory leak in tcindex_set

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20221116132139.2940-1-yin31149@gmail.com>
Date:   Wed, 16 Nov 2022 21:21:38 +0800
From:   Hawkins Jiawei <yin31149@...il.com>
To:     yin31149@...il.com
Cc:     18801353760@....com, cong.wang@...edance.com, davem@...emloft.net,
        edumazet@...gle.com, jhs@...atatu.com, jiri@...nulli.us,
        kuba@...nel.org, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org, pabeni@...hat.com,
        syzbot+232ebdbd36706c965ebf@...kaller.appspotmail.com,
        syzkaller-bugs@...glegroups.com, xiyou.wangcong@...il.com
Subject: Re: [PATCH v2] net: sched: fix memory leak in tcindex_set_parms

On Wed, 16 Nov 2022 at 20:10, Hawkins Jiawei <yin31149@...il.com> wrote:
>
> On Wed, 16 Nov 2022 at 10:44, Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > On Tue, 15 Nov 2022 19:57:10 +0100 Paolo Abeni wrote:
> > > This code confuses me more than a bit, and I don't follow ?!?
> >
> > It's very confusing :S
> >
> > For starters I don't know when r != old_r. I mean now it triggers
> > randomly after the RCU-ification, but in the original code when
> > it was just a memset(). When would old_r ever not be null and yet
> > point to a different entry?
>
> I am also confused about the code when I tried to fix this bug.
>
> As for when `old_r != r`, according to the simplified
> code below, this should be probably true if `p->perfect` is true
> or `!p->perfect && !pc->h` is true(please correct me if I am wrong)
>
>         struct tcindex_filter_result new_filter_result, *old_r = r;
>         struct tcindex_data *cp = NULL, *oldp;
>         struct tcf_result cr = {};
>
>         /* tcindex_data attributes must look atomic to classifier/lookup so
>          * allocate new tcindex data and RCU assign it onto root. Keeping
>          * perfect hash and hash pointers from old data.
>          */
>         cp = kzalloc(sizeof(*cp), GFP_KERNEL);
>
>         if (p->perfect) {
>                 if (tcindex_alloc_perfect_hash(net, cp) < 0)
>                         goto errout;
>                 ...
>         }
>         cp->h = p->h;
>
>         if (!cp->perfect && !cp->h) {
>                 if (valid_perfect_hash(cp)) {
>                         if (tcindex_alloc_perfect_hash(net, cp) < 0)
>                                 goto errout_alloc;
>
>                 } else {
>                         struct tcindex_filter __rcu **hash;
>
>                         hash = kcalloc(cp->hash,
>                                        sizeof(struct tcindex_filter *),
>                                        GFP_KERNEL);
>
>                         if (!hash)
>                                 goto errout_alloc;
>
>                         cp->h = hash;
>                 }
>         }
>         ...
>
>         if (cp->perfect)
>                 r = cp->perfect + handle;
>         else
>                 r = tcindex_lookup(cp, handle) ? : &new_filter_result;
>
>         if (old_r && old_r != r) {
>                 err = tcindex_filter_result_init(old_r, cp, net);
>                 if (err < 0) {
>                         kfree(f);
>                         goto errout_alloc;
>                 }
>         }
>
> * If `p->perfect` is true, tcindex_alloc_perfect_hash() newly
> alloctes cp->perfect.
>
> * If `!p->perfect && !p->h` is true, cp->perfect or cp->h is
> newly allocated.
>
> In either case, r probably points to the newly allocated memory,
> which should not equals to the old_r.

Sorry for the error. In the second case, `r` is possibly
pointing to the `&new_filter_result`, which is a stack variable
address, and should still not equal to the `old_r`.

>
> >
> > > it looks like that at this point:
> > >
> > > * the data path could access 'old_r->exts' contents via 'p' just before
> > > the previous 'tcindex_filter_result_init(old_r, cp, net);' but still
> > > potentially within the same RCU grace period
> > >
> > > * 'tcindex_filter_result_init(old_r, cp, net);' has 'unlinked' the old
> > > exts from 'p'  so that will not be freed by later
> > > tcindex_partial_destroy_work()
> > >
> > > Overall it looks to me that we need some somewhat wait for the RCU
> > > grace period,
> >
> > Isn't it better to make @cp a deeper copy of @p ?
> > I thought it already is but we don't seem to be cloning p->h.
> > Also the cloning of p->perfect looks quite lossy.
>
> Yes, I also think @cp should be a deeper copy of @p.
>
> But it seems that in tcindex_alloc_perfect_hash(),
> each @cp ->exts will be initialized by tcf_exts_init()
> as below, and tcindex_set_parms() forgets to free the
> old ->exts content, triggering this memory leak.(Please
> correct me if I am wrong)
>
>         static int tcindex_alloc_perfect_hash(struct net *net,
>                                               struct tcindex_data *cp)
>         {
>                 int i, err = 0;
>
>                 cp->perfect = kcalloc(cp->hash, sizeof(struct tcindex_filter_result),
>                                       GFP_KERNEL | __GFP_NOWARN);
>
>                 for (i = 0; i < cp->hash; i++) {
>                         err = tcf_exts_init(&cp->perfect[i].exts, net,
>                                             TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE);
>                         if (err < 0)
>                                 goto errout;
>                         cp->perfect[i].p = cp;
>                 }
>         }
>
>         static inline int tcf_exts_init(struct tcf_exts *exts, struct net *net,
>                                         int action, int police)
>         {
>         #ifdef CONFIG_NET_CLS_ACT
>                 exts->type = 0;
>                 exts->nr_actions = 0;
>                 /* Note: we do not own yet a reference on net.
>                  * This reference might be taken later from tcf_exts_get_net().
>                  */
>                 exts->net = net;
>                 exts->actions = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *),
>                                         GFP_KERNEL);
>                 if (!exts->actions)
>                         return -ENOMEM;
>         #endif
>                 exts->action = action;
>                 exts->police = police;
>                 return 0;
>         }
>
> >
> > > Somewhat side question: it looks like that the 'perfect hashing' usage
> > > is the root cause of the issue addressed here, and very likely is
> > > afflicted by other problems, e.g. the data curruption in 'err =
> > > tcindex_filter_result_init(old_r, cp, net);'.
> > >
> > > AFAICS 'perfect hashing' usage is a sort of optimization that the user-
> > > space may trigger with some combination of the tcindex arguments. I'm
> > > wondering if we could drop all perfect hashing related code?
> >
> > The thought of "how much of this can we delete" did cross my mind :)