lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 16 Nov 2022 20:10:10 +0800 From: Hawkins Jiawei <yin31149@...il.com> To: kuba@...nel.org Cc: 18801353760@....com, cong.wang@...edance.com, davem@...emloft.net, edumazet@...gle.com, jhs@...atatu.com, jiri@...nulli.us, linux-kernel@...r.kernel.org, netdev@...r.kernel.org, pabeni@...hat.com, syzbot+232ebdbd36706c965ebf@...kaller.appspotmail.com, syzkaller-bugs@...glegroups.com, xiyou.wangcong@...il.com, yin31149@...il.com Subject: Re: [PATCH v2] net: sched: fix memory leak in tcindex_set_parms On Wed, 16 Nov 2022 at 10:44, Jakub Kicinski <kuba@...nel.org> wrote: > > On Tue, 15 Nov 2022 19:57:10 +0100 Paolo Abeni wrote: > > This code confuses me more than a bit, and I don't follow ?!? > > It's very confusing :S > > For starters I don't know when r != old_r. I mean now it triggers > randomly after the RCU-ification, but in the original code when > it was just a memset(). When would old_r ever not be null and yet > point to a different entry? I am also confused about the code when I tried to fix this bug. As for when `old_r != r`, according to the simplified code below, this should be probably true if `p->perfect` is true or `!p->perfect && !pc->h` is true(please correct me if I am wrong) struct tcindex_filter_result new_filter_result, *old_r = r; struct tcindex_data *cp = NULL, *oldp; struct tcf_result cr = {}; /* tcindex_data attributes must look atomic to classifier/lookup so * allocate new tcindex data and RCU assign it onto root. Keeping * perfect hash and hash pointers from old data. */ cp = kzalloc(sizeof(*cp), GFP_KERNEL); if (p->perfect) { if (tcindex_alloc_perfect_hash(net, cp) < 0) goto errout; ... } cp->h = p->h; if (!cp->perfect && !cp->h) { if (valid_perfect_hash(cp)) { if (tcindex_alloc_perfect_hash(net, cp) < 0) goto errout_alloc; } else { struct tcindex_filter __rcu **hash; hash = kcalloc(cp->hash, sizeof(struct tcindex_filter *), GFP_KERNEL); if (!hash) goto errout_alloc; cp->h = hash; } } ... if (cp->perfect) r = cp->perfect + handle; else r = tcindex_lookup(cp, handle) ? : &new_filter_result; if (old_r && old_r != r) { err = tcindex_filter_result_init(old_r, cp, net); if (err < 0) { kfree(f); goto errout_alloc; } } * If `p->perfect` is true, tcindex_alloc_perfect_hash() newly alloctes cp->perfect. * If `!p->perfect && !p->h` is true, cp->perfect or cp->h is newly allocated. In either case, r probably points to the newly allocated memory, which should not equals to the old_r. > > > it looks like that at this point: > > > > * the data path could access 'old_r->exts' contents via 'p' just before > > the previous 'tcindex_filter_result_init(old_r, cp, net);' but still > > potentially within the same RCU grace period > > > > * 'tcindex_filter_result_init(old_r, cp, net);' has 'unlinked' the old > > exts from 'p' so that will not be freed by later > > tcindex_partial_destroy_work() > > > > Overall it looks to me that we need some somewhat wait for the RCU > > grace period, > > Isn't it better to make @cp a deeper copy of @p ? > I thought it already is but we don't seem to be cloning p->h. > Also the cloning of p->perfect looks quite lossy. Yes, I also think @cp should be a deeper copy of @p. But it seems that in tcindex_alloc_perfect_hash(), each @cp ->exts will be initialized by tcf_exts_init() as below, and tcindex_set_parms() forgets to free the old ->exts content, triggering this memory leak.(Please correct me if I am wrong) static int tcindex_alloc_perfect_hash(struct net *net, struct tcindex_data *cp) { int i, err = 0; cp->perfect = kcalloc(cp->hash, sizeof(struct tcindex_filter_result), GFP_KERNEL | __GFP_NOWARN); for (i = 0; i < cp->hash; i++) { err = tcf_exts_init(&cp->perfect[i].exts, net, TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE); if (err < 0) goto errout; cp->perfect[i].p = cp; } } static inline int tcf_exts_init(struct tcf_exts *exts, struct net *net, int action, int police) { #ifdef CONFIG_NET_CLS_ACT exts->type = 0; exts->nr_actions = 0; /* Note: we do not own yet a reference on net. * This reference might be taken later from tcf_exts_get_net(). */ exts->net = net; exts->actions = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *), GFP_KERNEL); if (!exts->actions) return -ENOMEM; #endif exts->action = action; exts->police = police; return 0; } > > > Somewhat side question: it looks like that the 'perfect hashing' usage > > is the root cause of the issue addressed here, and very likely is > > afflicted by other problems, e.g. the data curruption in 'err = > > tcindex_filter_result_init(old_r, cp, net);'. > > > > AFAICS 'perfect hashing' usage is a sort of optimization that the user- > > space may trigger with some combination of the tcindex arguments. I'm > > wondering if we could drop all perfect hashing related code? > > The thought of "how much of this can we delete" did cross my mind :)
Powered by blists - more mailing lists