netdev - Re: [PATCH bpf-next 02/14] bpf: net: Avoid sock_setsockopt() taking sk lock when called from bpf

Message-ID: <20220727183700.iczavo77o6ubxbwm@kafai-mbp.dhcp.thefacebook.com>
Date:   Wed, 27 Jul 2022 11:37:00 -0700
From:   Martin KaFai Lau <kafai@...com>
To:     sdf@...gle.com
Cc:     bpf@...r.kernel.org, netdev@...r.kernel.org,
        Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        David Miller <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>, kernel-team@...com,
        Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH bpf-next 02/14] bpf: net: Avoid sock_setsockopt() taking
 sk lock when called from bpf

On Wed, Jul 27, 2022 at 09:47:25AM -0700, sdf@...gle.com wrote:
> On 07/26, Martin KaFai Lau wrote:
> > Most of the code in bpf_setsockopt(SOL_SOCKET) is duplicated from
> > sock_setsockopt().  The number of supported options keeps
> > increasing, and so does the duplicated code.
> 
> > One issue in reusing sock_setsockopt() is that the bpf prog
> > has already acquired the sk lock.  sockptr_t is useful to handle this.
> > sockptr_t already has a bit 'is_kernel' to handle the kernel-or-user
> > memory copy.  This patch adds a 'is_bpf' bit to tell if sk locking
> > has already been ensured by the bpf prog.
> 
> Why not explicitly call it is_locked/is_unlocked? I'm assuming, at some
> point,
is_locked was my initial attempt.  The bpf_setsockopt() also skips
the ns_capable() check, like in patch 3.  I ended up using
one is_bpf bit here to do both.
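
A rough user-space sketch of what a single is_bpf bit ends up controlling,
assuming simplified stand-ins for the kernel types. This is not the code from
the patch: sketch_sockptr, sketch_sock, and the sketch_*() helpers are
hypothetical names, and has_cap merely stands in for the result of the real
ns_capable() call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for sockptr_t: only the flag bits matter here. */
struct sketch_sockptr {
	void *ptr;
	bool is_kernel;	/* kernel-or-user memory copy */
	bool is_bpf;	/* caller is a bpf prog that already holds the sk lock */
};

/* Hypothetical stand-in for struct sock; it only counts what was called. */
struct sketch_sock {
	int lock_calls;
	int capable_calls;
};

/* Take the lock unless the bpf caller already ensured it. */
static void sketch_lock_sock(struct sketch_sock *sk, struct sketch_sockptr optval)
{
	if (optval.is_bpf)
		return;
	sk->lock_calls++;
}

/* The same bit also bypasses the capability check. */
static bool sketch_capable(struct sketch_sock *sk, struct sketch_sockptr optval,
			   bool has_cap)
{
	if (optval.is_bpf)
		return true;
	sk->capable_calls++;
	return has_cap;
}

static int sketch_setsockopt(struct sketch_sock *sk, struct sketch_sockptr optval,
			     bool has_cap)
{
	int ret = 0;

	assert(sk != NULL);
	sketch_lock_sock(sk, optval);
	if (!sketch_capable(sk, optval, has_cap))
		ret = -EPERM;
	/* option handling and the matching unlock are elided */
	return ret;
}
```

With is_bpf set, neither the lock nor the capability check is touched; a
caller with is_bpf clear goes through both as usual.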

> we can have code paths in bpf where the socket has been already locked by
> the stack?
Hmm... did you mean the opposite: a bpf hook that does not have the
lock pre-acquired before the bpf prog runs, so sock_setsockopt()
should do lock_sock() as usual?

I was thinking a likely situation is that a bpf 'sleepable' hook does not
have the lock pre-acquired.  In that case, bpf_setsockopt() could
always acquire the lock first, but that may turn out to be too
pessimistic for the future bpf_[G]etsockopt() refactoring.

Or we could split the bit in two: one is_locked bit for the locking
and one is_bpf bit to skip the capable check.  I was waiting until a real
need comes up instead of having both bits always true now.  I don't mind
adding is_locked now since bpf_lsm_cgroup may become sleepable soon.
I can do this in the next spin.
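
For illustration only, the two-bit variant could look roughly like the
user-space sketch below. Nothing here is from an actual patch: split_sockptr,
split_sock, and the split_*() helpers are hypothetical names, and has_cap
stands in for the result of the real capability check.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for sockptr_t with the bit split in two. */
struct split_sockptr {
	void *ptr;
	bool is_kernel;	/* kernel-or-user memory copy */
	bool is_locked;	/* sk lock is already held by the caller */
	bool is_bpf;	/* caller is a bpf prog: skip the capable check */
};

/* Hypothetical stand-in for struct sock; it only counts lock acquisitions. */
struct split_sock {
	int lock_calls;
};

static bool split_needs_lock(struct split_sockptr optval)
{
	return !optval.is_locked;
}

static bool split_needs_capable(struct split_sockptr optval)
{
	return !optval.is_bpf;
}

static int split_setsockopt(struct split_sock *sk, struct split_sockptr optval,
			    bool has_cap)
{
	int ret = 0;

	assert(sk != NULL);
	if (split_needs_lock(optval))
		sk->lock_calls++;	/* stands in for lock_sock() */
	if (split_needs_capable(optval) && !has_cap)
		ret = -EPERM;
	/* option handling and the matching unlock are elided */
	return ret;
}
```

A sleepable hook without the lock pre-acquired would then pass is_bpf set and
is_locked clear, while the existing hooks would pass both bits set.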