netdev - Re: [PATCH bpf-next v3 4/4] selftests/bpf: add icmp_send


Message-ID: <d680a7b0-4c92-4937-83a7-6044e17e9997@linux.dev>
Date: Tue, 29 Jul 2025 17:32:07 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Mahe Tardy <mahe.tardy@...il.com>
Cc: alexei.starovoitov@...il.com, andrii@...nel.org, ast@...nel.org,
 bpf@...r.kernel.org, coreteam@...filter.org, daniel@...earbox.net,
 fw@...len.de, john.fastabend@...il.com, netdev@...r.kernel.org,
 netfilter-devel@...r.kernel.org, oe-kbuild-all@...ts.linux.dev,
 pablo@...filter.org, lkp@...el.com
Subject: Re: [PATCH bpf-next v3 4/4] selftests/bpf: add icmp_send_unreach
 kfunc tests

On 7/29/25 5:01 PM, Martin KaFai Lau wrote:
> On 7/29/25 4:27 PM, Martin KaFai Lau wrote:
>> On 7/29/25 2:09 AM, Mahe Tardy wrote:
>>> On Mon, Jul 28, 2025 at 06:18:11PM -0700, Martin KaFai Lau wrote:
>>>> On 7/28/25 2:43 AM, Mahe Tardy wrote:
>>>>> +SEC("cgroup_skb/egress")
>>>>> +int egress(struct __sk_buff *skb)
>>>>> +{
>>>>> +    void *data = (void *)(long)skb->data;
>>>>> +    void *data_end = (void *)(long)skb->data_end;
>>>>> +    struct iphdr *iph;
>>>>> +    struct tcphdr *tcph;
>>>>> +
>>>>> +    iph = data;
>>>>> +    if ((void *)(iph + 1) > data_end || iph->version != 4 ||
>>>>> +        iph->protocol != IPPROTO_TCP || iph->daddr != bpf_htonl(SERVER_IP))
>>>>> +        return SK_PASS;
>>>>> +
>>>>> +    tcph = (void *)iph + iph->ihl * 4;
>>>>> +    if ((void *)(tcph + 1) > data_end ||
>>>>> +        tcph->dest != bpf_htons(SERVER_PORT))
>>>>> +        return SK_PASS;
>>>>> +
>>>>> +    kfunc_ret = bpf_icmp_send_unreach(skb, unreach_code);
>>>>> +
>>>>> +    /* returns SK_PASS to execute the test case quicker */
>>>>
>>>> Do you know why the user space is slower if 0 (SK_DROP) is used?
>>>
>>> I tried to write my understanding of this in the commit description:
>>>
>>> "Note that the BPF program returns SK_PASS to let the connection being
>>> established to finish the test cases quicker. Otherwise, you have to
>>> wait for the TCP three-way handshake to timeout in the kernel and
>>> retrieve the errno translated from the unreach code set by the ICMP
>>> control message."
>>
>> It feels a bit hacky to let the 3WHS finish when the objective of
>> the patch set is to drop it. It is not unusual for people to borrow
>> this code directly. Does a non-blocking connect() help?
>>
> 
> After reading more on how sk_err_soft is used, a non-blocking connect() won't
> help. I think I see why a TCP RST is better.
> 

Actually, while replying on the cover letter and looking at tcp_v4_err again,
I noticed there is an exception that does ip_icmp_error for TCP_SYN_SENT, so it
may be worth trying a non-blocking connect() and then polling the sk for the
error, if you haven't tried that already.
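
To make the suggestion concrete, here is a minimal user-space sketch of "non-blocking connect(), then poll the sk for err". It is only an illustration under assumptions, not code from the patch: the helper name connect_nonblock_errno() and its return convention are hypothetical, and whether the ICMP-derived error is actually reported this way in the selftest would still need to be verified.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): start a non-blocking
 * connect() to ip:port, wait up to timeout_ms for the socket to report
 * an event, then fetch the pending socket error with SO_ERROR.
 *
 * Returns 0 if the connection was established, a positive errno value
 * if the connect attempt failed, or -1 on timeout or setup failure.
 */
static int connect_nonblock_errno(const char *ip, unsigned short port,
				  int timeout_ms)
{
	struct sockaddr_in addr;
	struct pollfd pfd;
	socklen_t len;
	int fd, flags, err = -1;

	assert(ip != NULL);

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* Switch the socket to non-blocking mode before connecting. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		goto out;

	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
		err = 0;	/* connected immediately */
		goto out;
	}
	if (errno != EINPROGRESS) {
		err = errno;	/* failed right away */
		goto out;
	}

	/* The handshake is in flight: wait for completion or an error. */
	pfd.fd = fd;
	pfd.events = POLLOUT;
	pfd.revents = 0;
	if (poll(&pfd, 1, timeout_ms) <= 0)
		goto out;	/* timeout or poll() failure: err stays -1 */

	/* SO_ERROR returns (and clears) the pending error on the socket. */
	len = sizeof(err);
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		err = -1;
out:
	close(fd);
	return err;
}
```

If this works as hoped, the test could compare the returned value against the errno expected for each unreach code while the program returns SK_DROP, instead of relying on SK_PASS to finish quickly.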