netdev - Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200622203910.GE301338@carbon.dhcp.thefacebook.com>
Date:   Mon, 22 Jun 2020 13:39:10 -0700
From:   Roman Gushchin <guro@...com>
To:     Cong Wang <xiyou.wangcong@...il.com>
CC:     Zefan Li <lizefan@...wei.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Cameron Berkenpas <cam@...-zeon.de>,
        Peter Geis <pgwipeout@...il.com>,
        Lu Fengqi <lufq.fnst@...fujitsu.com>,
        Daniël Sonck <dsonck92@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Tejun Heo <tj@...nel.org>
Subject: Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

On Mon, Jun 22, 2020 at 11:14:20AM -0700, Cong Wang wrote:
> On Sat, Jun 20, 2020 at 8:58 AM Roman Gushchin <guro@...com> wrote:
> >
> > On Fri, Jun 19, 2020 at 08:00:41PM -0700, Cong Wang wrote:
> > > On Fri, Jun 19, 2020 at 6:14 PM Roman Gushchin <guro@...com> wrote:
> > > >
> > > > On Sat, Jun 20, 2020 at 09:00:40AM +0800, Zefan Li wrote:
> > > > > I think so, though I'm not familiar with the bfp cgroup code.
> > > > >
> > > > > > If so, we might wanna fix it in a different way,
> > > > > > just checking if (!(css->flags & CSS_NO_REF)) in cgroup_bpf_put()
> > > > > > like in cgroup_put(). It feels more reliable to me.
> > > > > >
> > > > >
> > > > > Yeah I also have this idea in my mind.
> > > >
> > > > I wonder if the following patch will fix the issue?
> > >
> > > Interesting, AFAIU, this refcnt is for bpf programs attached
> > > to the cgroup. By this suggestion, do you mean the root
> > > cgroup does not need to refcnt the bpf programs attached
> > > to it? This seems odd, as I don't see how root is different
> > > from others in terms of bpf programs which can be attached
> > > and detached in the same way.
> > >
> > > I certainly understand the root cgroup is never gone, but this
> > > does not mean the bpf programs attached to it too.
> > >
> > > What am I missing?
> >
> > It's different because the root cgroup can't be deleted.
> >
> > All this reference counting is required to automatically detach bpf programs
> > from a _deleted_ cgroup (look at cgroup_bpf_offline()). It's required
> > because a cgroup can be in dying state for a long time being pinned by a
> > pagecache page, for example. Only a user can detach a bpf program from
> > an existing cgroup.
> 
> Yeah, but users can still detach the bpf programs from root cgroup.
> IIUC, after detaching, the pointer in the bpf array will be empty_prog_array
> which is just an array of NULL. Then __cgroup_bpf_run_filter_skb() will
> deref it without checking NULL (as check_non_null == false).
> 
> This matches the 0000000000000010 pointer seen in the bug reports,
> the 0x10, that is 16, is the offset of items[] in struct bpf_prog_array.
> So looks like we have to add a NULL check there regardless of refcnt.
> 
> Also, I am not sure whether your suggested patch makes a difference
> for percpu refcnt, as percpu_ref_put() will never call ->release() until
> percpu_ref_kill(), which is never called on root cgroup?

Hm, true. But it means that the problem is not with the root cgroup's bpf?

How easy is to reproduce the problem? Is it possible to bisect the problematic
commit?

Thanks!