Message-ID: <20231016164606.29484-1-kuniyu@amazon.com>
Date: Mon, 16 Oct 2023 09:46:06 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <willemdebruijn.kernel@...il.com>
CC: <andrii@...nel.org>, <ast@...nel.org>, <bpf@...r.kernel.org>,
<daniel@...earbox.net>, <davem@...emloft.net>, <dsahern@...nel.org>,
<edumazet@...gle.com>, <haoluo@...gle.com>, <john.fastabend@...il.com>,
<jolsa@...nel.org>, <kpsingh@...nel.org>, <kuba@...nel.org>,
<kuni1840@...il.com>, <kuniyu@...zon.com>, <martin.lau@...ux.dev>,
<mykolal@...com>, <netdev@...r.kernel.org>, <pabeni@...hat.com>,
<sdf@...gle.com>, <song@...nel.org>, <yonghong.song@...ux.dev>
Subject: Re: [PATCH v1 bpf-next 00/11] bpf: tcp: Add SYN Cookie generation/validation SOCK_OPS hooks.
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Date: Mon, 16 Oct 2023 10:19:18 -0400
> Kuniyuki Iwashima wrote:
> > Under SYN Flood, the TCP stack generates SYN Cookie to remain stateless
> > for the connection request until a valid ACK is responded to the SYN+ACK.
> >
> > The cookie contains two kinds of host-specific bits, a timestamp and
> > secrets, so it can be validated only by the generator. This means SYN
> > Cookie consumes network resources between the client and the server;
> > intermediate nodes must remember which node to route the ACK for the
> > cookie to.
> >
> > SYN Proxy reduces such unwanted resource allocation by handling 3WHS at
> > the edge network. After SYN Proxy completes 3WHS, it forwards SYN to the
> > backend server and completes another 3WHS. However, since the server's
> > ISN differs from the cookie, the proxy must manage the ISN mappings and
> > fix up SEQ/ACK numbers in every packet for each connection. If a proxy
> > node is down, all the connections through it are also down. Keeping a
> > state at proxy is painful from that perspective.
> >
> > At AWS, we use a dirty hack to build truly stateless SYN Proxy at scale.
> > Our SYN Proxy consists of the front proxy layer and the backend kernel
> > module. (See slides of netconf [0], p6 - p15)
> >
> > The cookie that SYN Proxy generates differs from the kernel's cookie in
> > that it contains a secret (called rolling salt) (i) shared by all the proxy
> > nodes so that any node can validate ACK and (ii) updated periodically so
> > that old cookies cannot be validated. Also, the ISN, not TS val, carries
> > WScale, SACK, and ECN. This avoids sacrificing connection quality for
> > customers who turn off the timestamp option due to retro CVE.
>
> If easier: I think it should be possible to make the host secret
> readable and writable with CAP_NET_ADMIN, to allow synchronizing
> between hosts.

I think the idea is doable for syncookie_secret and syncookie6_secret.
However, the cookie timestamp is derived from jiffies, which cannot be
written.

[ At netconf, I answered that sharing secrets would resolve our issue,
  but I was wrong. ]

> For similar reasons as suggested here, a rolling salt might be
> useful more broadly too.

Maybe we need not use jiffies and can instead create a worker that
updates the secret periodically if it's not configured manually.
The problem here would be that we need to update/read u64[4] atomically
if we want to use SipHash or HSipHash. Maybe this can also be changed.
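
To illustrate the atomic update/read concern, here is a minimal userspace
sketch of a sequence-counter scheme for a u64[4] key, written with C11
atomics. It is only one possible approach, not what the patch set does.
The names (rolling_key, rolling_key_update, rolling_key_read) are
hypothetical, and it assumes a single writer such as the periodic worker.
In the kernel, the seqcount helpers would presumably play this role.

```c
#include <stdatomic.h>
#include <stdint.h>

#define KEY_WORDS 4	/* u64[4], as discussed above */

/* Hypothetical container for a periodically rotated key. */
struct rolling_key {
	atomic_uint seq;	/* even: stable, odd: update in progress */
	_Atomic uint64_t word[KEY_WORDS];
};

/* Writer side: assumed to be called by one worker only. */
static void rolling_key_update(struct rolling_key *k,
			       const uint64_t new_word[KEY_WORDS])
{
	unsigned int seq = atomic_load_explicit(&k->seq, memory_order_relaxed);

	/* Mark the key as "being updated" before touching the words. */
	atomic_store_explicit(&k->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	for (int i = 0; i < KEY_WORDS; i++)
		atomic_store_explicit(&k->word[i], new_word[i],
				      memory_order_relaxed);

	/* Publish the new key; the counter is even again. */
	atomic_store_explicit(&k->seq, seq + 2, memory_order_release);
}

/* Reader side: retries until it has copied one consistent key. */
static void rolling_key_read(struct rolling_key *k, uint64_t out[KEY_WORDS])
{
	unsigned int begin, end;

	do {
		begin = atomic_load_explicit(&k->seq, memory_order_acquire);

		for (int i = 0; i < KEY_WORDS; i++)
			out[i] = atomic_load_explicit(&k->word[i],
						      memory_order_relaxed);

		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&k->seq, memory_order_relaxed);
	} while ((begin & 1) || begin != end);
}
```

Readers never block the writer; they only retry when a rotation races
with the copy, which should be rare if the key rotates on a timer.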

But we still want to use BPF, as we need to encode (at least) the WS and
SACK bits in the ISN rather than in TS, and to use different MSS
candidates than msstab.
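
As a rough sketch of what encoding WS, SACK, and an MSS index in the ISN
could look like, the C below packs them into the low 8 bits of a 32-bit
ISN and keeps 24 bits of a keyed hash in the rest. Everything here is
assumed for illustration: the bit layout, the mss_candidates values, and
the function names are hypothetical, and the hash is supplied by the
caller (for example, a keyed hash over the 4-tuple and the rolling salt).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical MSS candidates (ascending), used instead of msstab. */
static const uint16_t mss_candidates[8] = {
	536, 1220, 1300, 1360, 1400, 1440, 1460, 8960,
};

/* Options carried in the ISN instead of in TS val. */
struct cookie_opts {
	uint8_t wscale;		/* 4 bits */
	bool sack_ok;		/* 1 bit */
	uint8_t mss_idx;	/* 3 bits, index into mss_candidates */
};

/* Pick the largest candidate that does not exceed the client's MSS. */
static uint8_t mss_to_idx(uint16_t mss)
{
	uint8_t i = 7;

	while (i > 0 && mss_candidates[i] > mss)
		i--;
	return i;
}

/*
 * Assumed layout: [31:8] low 24 bits of hash, [7:4] wscale,
 * [3] sack_ok, [2:0] mss_idx.
 */
static uint32_t isn_encode(uint32_t hash, struct cookie_opts o)
{
	return (hash << 8) |
	       ((uint32_t)(o.wscale & 0xf) << 4) |
	       ((uint32_t)o.sack_ok << 3) |
	       (uint32_t)(o.mss_idx & 0x7);
}

/* Returns false if the hash part does not match the recomputed hash. */
static bool isn_decode(uint32_t isn, uint32_t hash, struct cookie_opts *o)
{
	if ((isn >> 8) != (hash & 0xffffff))
		return false;

	o->wscale = (isn >> 4) & 0xf;
	o->sack_ok = (isn >> 3) & 0x1;
	o->mss_idx = isn & 0x7;
	return true;
}
```

On the ACK, the validator would recompute the hash, call isn_decode() on
the cookie recovered from the ACK number, and look up
mss_candidates[mss_idx]. A real encoding would also need the hash to
cover the option bits so they cannot be altered independently.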

Also, in our use case, the cookie itself is validated in the front proxy
layer, and the kernel does a more lightweight validation, like checking
whether the cookie was forwarded from trusted nodes. This way, we can
prevent invalid ACKs from flowing through the backend and consuming
networking entries, and the backend need not do full validation.

With BPF, we get such flexibility in encoding and validation, and
keeping the cookie generation algorithm private could be good for
security.