lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 19 Oct 2023 00:25:00 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: andrii@...nel.org, ast@...nel.org, bpf@...r.kernel.org,
 daniel@...earbox.net, davem@...emloft.net, dsahern@...nel.org,
 edumazet@...gle.com, haoluo@...gle.com, john.fastabend@...il.com,
 jolsa@...nel.org, kpsingh@...nel.org, kuba@...nel.org, kuni1840@...il.com,
 mykolal@...com, netdev@...r.kernel.org, pabeni@...hat.com, sdf@...gle.com,
 song@...nel.org, yonghong.song@...ux.dev, sinquersw@...il.com
Subject: Re: [PATCH v1 bpf-next 00/11] bpf: tcp: Add SYN Cookie
 generation/validation SOCK_OPS hooks.

On 10/18/23 3:31 PM, Kuniyuki Iwashima wrote:
> From: Kui-Feng Lee <sinquersw@...il.com>
> Date: Wed, 18 Oct 2023 14:47:43 -0700
>> On 10/18/23 10:20, Kuniyuki Iwashima wrote:
>>> From: Eric Dumazet <edumazet@...gle.com>
>>> Date: Wed, 18 Oct 2023 10:02:51 +0200
>>>> On Wed, Oct 18, 2023 at 8:19 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>>>>>
>>>>> On 10/17/23 9:48 AM, Kuniyuki Iwashima wrote:
>>>>>> From: Martin KaFai Lau <martin.lau@...ux.dev>
>>>>>> Date: Mon, 16 Oct 2023 22:53:15 -0700
>>>>>>> On 10/13/23 3:04 PM, Kuniyuki Iwashima wrote:
>>>>>>>> Under SYN Flood, the TCP stack generates SYN Cookie to remain stateless
>>>>>>>> After 3WHS, the proxy restores SYN and forwards it and ACK to the backend
>>>>>>>> server.  Our kernel module works at Netfilter input/output hooks and first
>>>>>>>> feeds SYN to the TCP stack to initiate 3WHS.  When the module is triggered
>>>>>>>> for SYN+ACK, it looks up the corresponding request socket and overwrites
>>>>>>>> tcp_rsk(req)->snt_isn with the proxy's cookie.  Then, the module can
>>>>>>>> complete 3WHS with the original ACK as is.
>>>>>>>
>>>>>>> Does the current kernel module also use the timestamp bits differently?
>>>>>>> (something like patch 8 and patch 10 trying to do)
>>>>>>
>>>>>> Our SYN Proxy uses TS as is.  The proxy nodes generate a random number
>>>>>> if TS is in SYN.
>>>>>>
>>>>>> But I thought someone would suggest making TS available so that we can
>>>>>> mock the default behaviour at least, and it would be more acceptable.
>>>>>>
>>>>>> The selftest uses TS just to strengthen security by validating 32-bits
>>>>>> hash.  Dropping a part of hash makes collision easier to happen, but
>>>>>> 24-bits were sufficient for us to reduce SYN flood to the managable
>>>>>> level at the backend.
>>>>>
>>>>> While enabling bpf to customize the syncookie (and timestamp), I want to explore
>>>>> where can this also be done other than at the tcp layer.
>>>>>
>>>>> Have you thought about directly sending the SYNACK back at a lower layer like
>>>>> tc/xdp after receiving the SYN?
>>>
>>> Yes.  Actually, at netconf I mentioned the cookie generation hook will not
>>> be necessary and should be replaced with XDP.

Right, it is also what I have been thinking when seeing the 
BPF_SOCK_OPS_GEN_SYNCOOKIE_CB carrying the bpf generated timestamp to the 
tcp_make_synack. It feels like trying hard to work with the tcp want_cookie 
logic while there is an existing better alternative in tc/xdp to deal with synflood.

>>>
>>>
>>>>> There are already bpf_tcp_{gen,check}_syncookie
>>>>> helper that allows to do this for the performance reason to absorb synflood. It
>>>>> will be natural to extend it to handle the customized syncookie also.
>>>
>>> Maybe we even need not extend it and can use XDP as said below.
>>>
>>>
>>>>>
>>>>> I think it should already be doable to send a SYNACK back with customized
>>>>> syncookie (and timestamp) at tc/xdp today.
>>>>>
>>>>> When ack is received, the prog@...xdp can verify the cookie. It will probably
>>>>> need some new kfuncs to create the ireq and queue the child socket. The bpf prog
>>>>> can change the ireq->{snd_wscale, sack_ok...} if needed. The details of the
>>>>> kfuncs need some more thoughts. I think most of the bpf-side infra is ready,
>>>>> e.g. acquire/release/ref-tracking...etc.
>>>>>
>>>>
>>>> I think I mostly agree with this.
>>>
>>> I didn't come up with kfunc to create ireq and queue it to listener, so
>>> cookie_v[46]_check() were best place for me to extend easily, but now it
>>> sounds like kfunc would be the way to go.
>>>
>>> Maybe we can move the core part of cookie_v[46]_check() except for kernel
>>> cookie's validation to __cookie_v[46]_check() and expose a wrapper of it
>>> as kfunc ?
>>>
>>> Then, we can look up sk and pass the listener, skb, and flags (for sack_ok,
>>> etc) to the kfunc.  (It could still introduce some conflicts with Eric's
>>> patch though...)
>>
>> Does that mean the packets handled in this way (in XDP) will skip all
>> netfilter at all?
> 
> Good point.
> 
> If we want not to skip other layers, maybe we can use tc ?
> 
> 1) allocate ireq and set sack_ok etc with kfunc
> 2) bpf_sk_assign() to set ireq to skb (this could be done in kfunc above)
> 3) let inet_steal_sock() return req->sk_listener if not sk_fullsock(sk)
> 4) if skb->sk is reqsk in cookie_v[46]_check(), skip validation and
>     req allocation and create full sk

Haven't looked at the details. The above feels reasonable and would be nice if 
it works out. don't know if the skb at tc can be used in cookie_v[46]_check() as 
is. It probably needs more thoughts.  [ note, xdp does not have skb. ]

Regarding the "allocate ireq and set sack_ok etc with kfunc", do you think it 
will be useful (and potentially cleaner) even for the 
BPF_SOCK_OPS_CHECK_SYNCOOKIE_CB if it needed to go back to consider skops? Then 
only do the BPF_SOCK_OPS_CHECK_SYNCOOKIE_CB and the xdp/tc can generate SYNACK. 
The xdp/tc can still do the check and drop the bad ACK earlier in the stack.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ