[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKH8qBt5-p24p9AvuEntb=gRFsJ_UQZ_GX8mFsPZZPq7CgL_4A@mail.gmail.com>
Date: Wed, 27 Jul 2022 13:39:08 -0700
From: Stanislav Fomichev <sdf@...gle.com>
To: Martin KaFai Lau <kafai@...com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
David Miller <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, kernel-team@...com,
Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH bpf-next 02/14] bpf: net: Avoid sock_setsockopt() taking
sk lock when called from bpf
On Wed, Jul 27, 2022 at 11:37 AM Martin KaFai Lau <kafai@...com> wrote:
>
> On Wed, Jul 27, 2022 at 09:47:25AM -0700, sdf@...gle.com wrote:
> > On 07/26, Martin KaFai Lau wrote:
> > > Most of the codes in bpf_setsockopt(SOL_SOCKET) are duplicated from
> > > the sock_setsockopt(). The number of supported options are
> > > increasing ever and so as the duplicated codes.
> >
> > > One issue in reusing sock_setsockopt() is that the bpf prog
> > > has already acquired the sk lock. sockptr_t is useful to handle this.
> > > sockptr_t already has a bit 'is_kernel' to handle the kernel-or-user
> > > memory copy. This patch adds a 'is_bpf' bit to tell if sk locking
> > > has already been ensured by the bpf prog.
> >
> > Why not explicitly call it is_locked/is_unlocked? I'm assuming, at some
> > point,
> is_locked was my initial attempt. The bpf_setsockopt() also skips
> the ns_capable() check, like in patch 3. I ended up using
> one is_bpf bit here to do both.
Yeah, sorry, I haven't read the whole series before I sent my first
reply. Let's discuss it here.
This reminds me of ns_capable in __inet_bind where we also had to add
special handling.
In general, not specific to the series, I wonder if we want some new
in_bpf() context indication and bypass ns_capable() from those
contexts?
Then we can do things like:
if (sk->sk_bound_dev_if && !in_bpf() && !ns_capable(net->user_ns,
CAP_NET_RAW))
return ...;
Or would it make things more confusing?
> > we can have code paths in bpf where the socket has been already locked by
> > the stack?
> hmm... You meant the opposite, like the bpf hook does not have the
> lock pre-acquired before the bpf prog gets run and sock_setsockopt()
> should do lock_sock() as usual?
>
> I was thinking a likely situation is a bpf 'sleepable' hook does not
> have the lock pre-acquired. In that case, the bpf_setsockopt() could
> always acquire the lock first but it may turn out to be too
> pessmissitic for the future bpf_[G]etsockopt() refactoring.
>
> or we could do this 'bit' break up (into one is_locked bit
> for locked and one is_bpf to skip-capable-check). I was waiting until a real
> need comes up instead of having both bits always true now. I don't mind to
> add is_locked now since the bpf_lsm_cgroup may come to sleepable soon.
> I can do this in the next spin.
Powered by blists - more mailing lists