Message-ID: <0288caf4-3c9b-4eae-a2b4-f8934badc270@linux.dev>
Date: Tue, 24 Sep 2024 16:58:19 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Tiago Lam <tiagolam@...udflare.com>
Cc: "David S. Miller" <davem@...emloft.net>, David Ahern
 <dsahern@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Willem de Bruijn <willemdebruijn.kernel@...il.com>,
 Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
 Andrii Nakryiko <andrii@...nel.org>, Eduard Zingerman <eddyz87@...il.com>,
 Song Liu <song@...nel.org>, Yonghong Song <yonghong.song@...ux.dev>,
 John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
 Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
 Jiri Olsa <jolsa@...nel.org>, Mykola Lysenko <mykolal@...com>,
 Shuah Khan <shuah@...nel.org>, netdev@...r.kernel.org,
 linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
 linux-kselftest@...r.kernel.org, Jakub Sitnicki <jakub@...udflare.com>,
 kernel-team@...udflare.com
Subject: Re: [RFC PATCH 2/3] ipv6: Run a reverse sk_lookup on sendmsg.

On 9/17/24 6:15 PM, Tiago Lam wrote:
> On Fri, Sep 13, 2024 at 11:24:09AM -0700, Martin KaFai Lau wrote:
>> On 9/13/24 2:39 AM, Tiago Lam wrote:
>>> This follows the same rationale provided for the ipv4 counterpart, where
>>> it now runs a reverse socket lookup when source addresses and/or ports
>>> are changed, on sendmsg, to check whether egress traffic should be
>>> allowed to go through or not.
>>>
>>> As with ipv4, the ipv6 sendmsg path is also extended here to support the
>>> IPV6_ORIGDSTADDR ancillary message to be able to specify a source
>>> address/port.
>>>
>>> Suggested-by: Jakub Sitnicki <jakub@...udflare.com>
>>> Signed-off-by: Tiago Lam <tiagolam@...udflare.com>
>>> ---
>>>    net/ipv6/datagram.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>    net/ipv6/udp.c      |  8 ++++--
>>>    2 files changed, 82 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
>>> index fff78496803d..4214dda1c320 100644
>>> --- a/net/ipv6/datagram.c
>>> +++ b/net/ipv6/datagram.c
>>> @@ -756,6 +756,27 @@ void ip6_datagram_recv_ctl(struct sock *sk, struct msghdr *msg,
>>>    }
>>>    EXPORT_SYMBOL_GPL(ip6_datagram_recv_ctl);
>>> +static inline bool reverse_sk_lookup(struct flowi6 *fl6, struct sock *sk,
>>> +				     struct in6_addr *saddr, __be16 sport)
>>> +{
>>> +	if (static_branch_unlikely(&bpf_sk_lookup_enabled) &&
>>> +	    (saddr && sport) &&
>>> +	    (ipv6_addr_cmp(&sk->sk_v6_rcv_saddr, saddr) || inet_sk(sk)->inet_sport != sport)) {
>>> +		struct sock *sk_egress;
>>> +
>>> +		bpf_sk_lookup_run_v6(sock_net(sk), IPPROTO_UDP, &fl6->daddr, fl6->fl6_dport,
>>> +				     saddr, ntohs(sport), 0, &sk_egress);
>>
>> iirc, in the ingress path, the sk could also be selected by a tc bpf prog
>> doing bpf_sk_assign. Then this re-run on sk_lookup may give an incorrect
>> result?
>>
> 
> If it does give the incorrect result, we still fall back to the normal
> egress path.
> 
>> In general, is it necessary to rerun any bpf prog if the user space has
>> specified the IP[v6]_ORIGDSTADDR?
>>
> 
> More generally, wouldn't that also be the case if someone calls
> bpf_sk_assign() in both TC and sk_lookup on ingress? It can lead to some
> interference between the two.
> 
> It seems like the interesting cases are:
> 1. Calling bpf_sk_assign() on both TC and sk_lookup ingress: if this
> happens sk_lookup on egress should match the correct socket when doing
> the reverse lookup;
> 2. Calling bpf_sk_assign() only on ingress TC: in this case it will
> depend on whether an sk_lookup program is attached or not:
>    a. If not, there's no reverse lookup on egress either;
>    b. But if yes, although the reverse sk_lookup here won't match the
>    initial socket assigned at ingress TC, the packets will still fall back
>    to the normal egress path;
> 
> You're right in that case 2b above will continue with the same
> restrictions as before.

imo, all the cases you described above are a good signal that neither the TC 
nor the BPF_PROG_TYPE_SK_LOOKUP program type is the right bpf prog to run here, 
_if_ a bpf prog is indeed useful here.

I only followed some of the other discussion in v1 and v2. For now, I still 
don't see how running a bpf prog is useful here to process the IP[V6]_ORIGDSTADDR. 
Jakub Sitnicki and I discussed a similar point during LPC.

If a bpf prog is indeed needed to process a cmsg, it should work closer to 
what Jakub Sitnicki proposed for getting the metadata during LPC (but I 
believe the verdict there was also that a bpf prog is not needed). It should be a 
bpf prog that can process any BPF-specific cmsg in a more generic way 
and can do other operations in the future using kfuncs (e.g. a route lookup or 
something). Saying yes/no to a particular local IP and port could be one of 
the things that the bpf prog does when processing the cmsg.
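
For reference, a minimal user-space sketch of the cmsg under discussion: how a
sender might attach IPV6_ORIGDSTADDR to a msghdr before calling sendmsg(). This
is an illustration only. The helper name set_origdst_cmsg() and the
origdst_cbuf union are hypothetical, and the payload is assumed to be a
struct sockaddr_in6, mirroring what the receive side delivers for
IPV6_RECVORIGDSTADDR; the RFC patch may expect something different.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Fallback for older libc headers; 74 is the value in the Linux UAPI headers. */
#ifndef IPV6_ORIGDSTADDR
#define IPV6_ORIGDSTADDR 74
#endif

/* Control buffer for exactly one sockaddr_in6 cmsg, aligned for cmsghdr. */
union origdst_cbuf {
	char buf[CMSG_SPACE(sizeof(struct sockaddr_in6))];
	struct cmsghdr align;
};

/*
 * Hypothetical helper: attach an IPV6_ORIGDSTADDR cmsg carrying the desired
 * source address and port (port given in host byte order) to msg.
 * Assumption: the payload layout is a struct sockaddr_in6.
 */
static void set_origdst_cmsg(struct msghdr *msg, union origdst_cbuf *cbuf,
			     const struct in6_addr *saddr, uint16_t sport)
{
	struct sockaddr_in6 src;
	struct cmsghdr *cmsg;

	memset(cbuf, 0, sizeof(*cbuf));
	memset(&src, 0, sizeof(src));
	src.sin6_family = AF_INET6;
	src.sin6_addr = *saddr;
	src.sin6_port = htons(sport);

	msg->msg_control = cbuf->buf;
	msg->msg_controllen = sizeof(cbuf->buf);

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = IPPROTO_IPV6;
	cmsg->cmsg_type = IPV6_ORIGDSTADDR;
	cmsg->cmsg_len = CMSG_LEN(sizeof(src));
	memcpy(CMSG_DATA(cmsg), &src, sizeof(src));
}
```

The filled-in msghdr would then be passed to sendmsg() on a UDP socket, with
msg_name and msg_iov set as usual. Whether the kernel accepts this cmsg on the
send path depends on the patch under review being applied.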
