linux-kernel - Re: [PATCH v2] net: sched: fix memory leak in tcindex_set

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20221116121010.101577-1-yin31149@gmail.com>
Date:   Wed, 16 Nov 2022 20:10:10 +0800
From:   Hawkins Jiawei <yin31149@...il.com>
To:     kuba@...nel.org
Cc:     18801353760@....com, cong.wang@...edance.com, davem@...emloft.net,
        edumazet@...gle.com, jhs@...atatu.com, jiri@...nulli.us,
        linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
        pabeni@...hat.com,
        syzbot+232ebdbd36706c965ebf@...kaller.appspotmail.com,
        syzkaller-bugs@...glegroups.com, xiyou.wangcong@...il.com,
        yin31149@...il.com
Subject: Re: [PATCH v2] net: sched: fix memory leak in tcindex_set_parms

On Wed, 16 Nov 2022 at 10:44, Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Tue, 15 Nov 2022 19:57:10 +0100 Paolo Abeni wrote:
> > This code confuses me more than a bit, and I don't follow ?!?
>
> It's very confusing :S
>
> For starters I don't know when r != old_r. I mean now it triggers
> randomly after the RCU-ification, but in the original code when
> it was just a memset(). When would old_r ever not be null and yet
> point to a different entry?

I am also confused about the code when I tried to fix this bug.

As for when `old_r != r`, according to the simplified
code below, this should be probably true if `p->perfect` is true
or `!p->perfect && !pc->h` is true(please correct me if I am wrong)

        struct tcindex_filter_result new_filter_result, *old_r = r;
        struct tcindex_data *cp = NULL, *oldp;
        struct tcf_result cr = {};

        /* tcindex_data attributes must look atomic to classifier/lookup so
         * allocate new tcindex data and RCU assign it onto root. Keeping
         * perfect hash and hash pointers from old data.
         */
        cp = kzalloc(sizeof(*cp), GFP_KERNEL);

        if (p->perfect) {
                if (tcindex_alloc_perfect_hash(net, cp) < 0)
                        goto errout;
                ...
        }
        cp->h = p->h;

        if (!cp->perfect && !cp->h) {
                if (valid_perfect_hash(cp)) {
                        if (tcindex_alloc_perfect_hash(net, cp) < 0)
                                goto errout_alloc;

                } else {
                        struct tcindex_filter __rcu **hash;

                        hash = kcalloc(cp->hash,
                                       sizeof(struct tcindex_filter *),
                                       GFP_KERNEL);

                        if (!hash)
                                goto errout_alloc;

                        cp->h = hash;
                }
        }
        ...

        if (cp->perfect)
                r = cp->perfect + handle;
        else
                r = tcindex_lookup(cp, handle) ? : &new_filter_result;

        if (old_r && old_r != r) {
                err = tcindex_filter_result_init(old_r, cp, net);
                if (err < 0) {
                        kfree(f);
                        goto errout_alloc;
                }
        }

* If `p->perfect` is true, tcindex_alloc_perfect_hash() newly
alloctes cp->perfect.

* If `!p->perfect && !p->h` is true, cp->perfect or cp->h is
newly allocated.

In either case, r probably points to the newly allocated memory,
which should not equals to the old_r.

>
> > it looks like that at this point:
> >
> > * the data path could access 'old_r->exts' contents via 'p' just before
> > the previous 'tcindex_filter_result_init(old_r, cp, net);' but still
> > potentially within the same RCU grace period
> >
> > * 'tcindex_filter_result_init(old_r, cp, net);' has 'unlinked' the old
> > exts from 'p'  so that will not be freed by later
> > tcindex_partial_destroy_work()
> >
> > Overall it looks to me that we need some somewhat wait for the RCU
> > grace period,
>
> Isn't it better to make @cp a deeper copy of @p ?
> I thought it already is but we don't seem to be cloning p->h.
> Also the cloning of p->perfect looks quite lossy.

Yes, I also think @cp should be a deeper copy of @p.

But it seems that in tcindex_alloc_perfect_hash(),
each @cp ->exts will be initialized by tcf_exts_init()
as below, and tcindex_set_parms() forgets to free the
old ->exts content, triggering this memory leak.(Please
correct me if I am wrong)

        static int tcindex_alloc_perfect_hash(struct net *net,
                                              struct tcindex_data *cp)
        {
        	int i, err = 0;
        
        	cp->perfect = kcalloc(cp->hash, sizeof(struct tcindex_filter_result),
        			      GFP_KERNEL | __GFP_NOWARN);
        
        	for (i = 0; i < cp->hash; i++) {
        		err = tcf_exts_init(&cp->perfect[i].exts, net,
        				    TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE);
        		if (err < 0)
        			goto errout;
        		cp->perfect[i].p = cp;
        	}
        }

        static inline int tcf_exts_init(struct tcf_exts *exts, struct net *net,
        				int action, int police)
        {
        #ifdef CONFIG_NET_CLS_ACT
        	exts->type = 0;
        	exts->nr_actions = 0;
        	/* Note: we do not own yet a reference on net.
        	 * This reference might be taken later from tcf_exts_get_net().
        	 */
        	exts->net = net;
        	exts->actions = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *),
        				GFP_KERNEL);
        	if (!exts->actions)
        		return -ENOMEM;
        #endif
        	exts->action = action;
        	exts->police = police;
        	return 0;
        }

>
> > Somewhat side question: it looks like that the 'perfect hashing' usage
> > is the root cause of the issue addressed here, and very likely is
> > afflicted by other problems, e.g. the data curruption in 'err =
> > tcindex_filter_result_init(old_r, cp, net);'.
> >
> > AFAICS 'perfect hashing' usage is a sort of optimization that the user-
> > space may trigger with some combination of the tcindex arguments. I'm
> > wondering if we could drop all perfect hashing related code?
>
> The thought of "how much of this can we delete" did cross my mind :)