netdev - Re: [bpf PATCH v2 2/8] bpf: sockmap, ensure sock lock held during tear down

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <5e3c6355ab84c_22ad2af2cbd0a5b478@john-XPS-13-9370.notmuch>
Date:   Thu, 06 Feb 2020 11:04:53 -0800
From:   John Fastabend <john.fastabend@...il.com>
To:     Jakub Sitnicki <jakub@...udflare.com>,
        John Fastabend <john.fastabend@...il.com>
Cc:     bpf@...r.kernel.org, netdev@...r.kernel.org, daniel@...earbox.net,
        ast@...nel.org, song@...nel.org, jonathan.lemon@...il.com
Subject: Re: [bpf PATCH v2 2/8] bpf: sockmap, ensure sock lock held during
 tear down

Jakub Sitnicki wrote:
> On Thu, Feb 06, 2020 at 06:51 AM CET, John Fastabend wrote:
> > Jakub Sitnicki wrote:
> >> On Sat, Jan 11, 2020 at 07:12 AM CET, John Fastabend wrote:
> >> > The sock_map_free() and sock_hash_free() paths used to delete sockmap
> >> > and sockhash maps walk the maps and destroy psock and bpf state associated
> >> > with the socks in the map. When done the socks no longer have BPF programs
> >> > attached and will function normally. This can happen while the socks in
> >> > the map are still "live" meaning data may be sent/received during the walk.

[...]

> >>
> >> John, I've noticed this is triggering warnings that we might sleep in
> >> lock_sock while (1) in RCU read-side section, and (2) holding a spin
> >> lock:
> 
> [...]
> 
> >>
> >> Here's an idea how to change the locking. I'm still wrapping my head
> >> around what protects what in sock_map_free, so please bear with me:
> >>
> >> 1. synchronize_rcu before we iterate over the array is not needed,
> >>    AFAICT. We are not free'ing the map just yet, hence any readers
> >>    accessing the map via the psock are not in danger of use-after-free.
> >
> > Agreed. When we added 2bb90e5cc90e ("bpf: sockmap, synchronize_rcu before
> > free'ing map") we could have done this.
> >
> >>
> >> 2. rcu_read_lock is needed to protect access to psock inside
> >>    sock_map_unref, but we can't sleep while in RCU read-side.  So push
> >>    it down, after we grab the sock lock.
> >
> > yes this looks better.
> >
> >>
> >> 3. Grabbing stab->lock seems not needed, either. We get called from
> >>    bpf_map_free_deferred, after map refcnt dropped to 0, so we're not
> >>    racing with any other map user to modify its contents.
> >
> > This I'll need to think on a bit. We have the link-lock there so
> > probably should be safe to drop. But will need to trace this through
> > git history to be sure.
> >
> 
> [...]
> 
> >> WDYT?
> >
> > Can you push the fix to bpf but leave the stab->lock for now. I think
> > we can do a slightly better cleanup on stab->lock in bpf-next.
> 
> Here it is:
> 
> https://lore.kernel.org/bpf/20200206111652.694507-1-jakub@cloudflare.com/T/#t
> 
> I left the "extra" synchronize_rcu before walking the map. On second
> thought, this isn't a bug. Just adds extra wait. bpf-next material?

Agree.

> 
> >
> >>
> >> Reproducer follows.
> >
> > push reproducer into selftests?
> 
> Included the reproducer with the fixes. If it gets dropped from the
> series, I'll resubmit it once bpf-next reopens.

Yeah, I don't have a strong preference where it lands I have a set of
tests for bpf-next once it opens as well.