netdev - Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAM_iQpVTwkxep3RCcwqCE0rypwj5prLdbE4oEUTyev+RxQq37Q@mail.gmail.com>
Date:   Mon, 22 Jun 2020 11:14:20 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     Roman Gushchin <guro@...com>
Cc:     Zefan Li <lizefan@...wei.com>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Cameron Berkenpas <cam@...-zeon.de>,
        Peter Geis <pgwipeout@...il.com>,
        Lu Fengqi <lufq.fnst@...fujitsu.com>,
        Daniël Sonck <dsonck92@...il.com>,
        Daniel Borkmann <daniel@...earbox.net>,
        Tejun Heo <tj@...nel.org>
Subject: Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

On Sat, Jun 20, 2020 at 8:58 AM Roman Gushchin <guro@...com> wrote:
>
> On Fri, Jun 19, 2020 at 08:00:41PM -0700, Cong Wang wrote:
> > On Fri, Jun 19, 2020 at 6:14 PM Roman Gushchin <guro@...com> wrote:
> > >
> > > On Sat, Jun 20, 2020 at 09:00:40AM +0800, Zefan Li wrote:
> > > > I think so, though I'm not familiar with the bfp cgroup code.
> > > >
> > > > > If so, we might wanna fix it in a different way,
> > > > > just checking if (!(css->flags & CSS_NO_REF)) in cgroup_bpf_put()
> > > > > like in cgroup_put(). It feels more reliable to me.
> > > > >
> > > >
> > > > Yeah I also have this idea in my mind.
> > >
> > > I wonder if the following patch will fix the issue?
> >
> > Interesting, AFAIU, this refcnt is for bpf programs attached
> > to the cgroup. By this suggestion, do you mean the root
> > cgroup does not need to refcnt the bpf programs attached
> > to it? This seems odd, as I don't see how root is different
> > from others in terms of bpf programs which can be attached
> > and detached in the same way.
> >
> > I certainly understand the root cgroup is never gone, but this
> > does not mean the bpf programs attached to it too.
> >
> > What am I missing?
>
> It's different because the root cgroup can't be deleted.
>
> All this reference counting is required to automatically detach bpf programs
> from a _deleted_ cgroup (look at cgroup_bpf_offline()). It's required
> because a cgroup can be in dying state for a long time being pinned by a
> pagecache page, for example. Only a user can detach a bpf program from
> an existing cgroup.

Yeah, but users can still detach the bpf programs from root cgroup.
IIUC, after detaching, the pointer in the bpf array will be empty_prog_array
which is just an array of NULL. Then __cgroup_bpf_run_filter_skb() will
deref it without checking NULL (as check_non_null == false).

This matches the 0000000000000010 pointer seen in the bug reports,
the 0x10, that is 16, is the offset of items[] in struct bpf_prog_array.
So looks like we have to add a NULL check there regardless of refcnt.

Also, I am not sure whether your suggested patch makes a difference
for percpu refcnt, as percpu_ref_put() will never call ->release() until
percpu_ref_kill(), which is never called on root cgroup?

Thanks.