Message-ID: <20200727172014.ynvc5vty5bbg7wsp@bsd-mbp.dhcp.thefacebook.com>
Date: Mon, 27 Jul 2020 10:20:14 -0700
From: Jonathan Lemon <jonathan.lemon@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: netdev <netdev@...r.kernel.org>, kernel-team <kernel-team@...com>,
Christoph Hellwig <hch@....de>,
Robin Murphy <robin.murphy@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Willem de Bruijn <willemb@...gle.com>,
Steffen Klassert <steffen.klassert@...unet.com>,
Saeed Mahameed <saeedm@...lanox.com>,
Maxim Mikityanskiy <maximmi@...lanox.com>,
bjorn.topel@...el.com, magnus.karlsson@...el.com,
borisp@...lanox.com, david@...hat.com
Subject: Re: [RFC PATCH v2 13/21] net/tcp: Pad TCP options out to a fixed size for netgpu

On Mon, Jul 27, 2020 at 08:19:24AM -0700, Eric Dumazet wrote:
> On Mon, Jul 27, 2020 at 12:51 AM Jonathan Lemon
> <jonathan.lemon@...il.com> wrote:
> >
> > From: Jonathan Lemon <bsd@...com>
> >
> > The "header splitting" feature used by netgpu doesn't actually parse
> > the incoming packet header. Instead, it splits the packet at a fixed
> > offset. In order for this to work, the sender needs to send packets
> > with a fixed header size.
> >
> > Signed-off-by: Jonathan Lemon <jonathan.lemon@...il.com>
> > ---
> > net/ipv4/tcp_output.c | 20 ++++++++++++++++++++
> > 1 file changed, 20 insertions(+)
> >
> > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > index d8f16f6a9b02..e8a74d0f7ad2 100644
> > --- a/net/ipv4/tcp_output.c
> > +++ b/net/ipv4/tcp_output.c
> > @@ -438,6 +438,7 @@ struct tcp_out_options {
> > u8 ws; /* window scale, 0 to disable */
> > u8 num_sack_blocks; /* number of SACK blocks to include */
> > u8 hash_size; /* bytes in hash_location */
> > + u8 pad_size; /* additional nops for padding */
> > __u8 *hash_location; /* temporary pointer, overloaded */
> > __u32 tsval, tsecr; /* need to include OPTION_TS */
> > struct tcp_fastopen_cookie *fastopen_cookie; /* Fast open cookie */
> > @@ -562,6 +563,17 @@ static void tcp_options_write(__be32 *ptr, struct tcp_sock *tp,
> > smc_options_write(ptr, &options);
> >
> > mptcp_options_write(ptr, opts);
> > +
> > +#if IS_ENABLED(CONFIG_NETGPU)
> > + /* pad out options */
> > + if (opts->pad_size) {
> > + int len = opts->pad_size;
> > + u8 *p = (u8 *)ptr;
> > +
> > + while (len--)
> > + *p++ = TCPOPT_NOP;
> > + }
> > +#endif
> > }
> >
> > static void smc_set_option(const struct tcp_sock *tp,
> > @@ -826,6 +838,14 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
> > opts->num_sack_blocks * TCPOLEN_SACK_PERBLOCK;
> > }
> >
> > +#if IS_ENABLED(CONFIG_NETGPU)
> > + /* force padding */
> > + if (size < 20) {
> > + opts->pad_size = 20 - size;
> > + size += opts->pad_size;
> > + }
> > +#endif
> > +
>
> This is obviously wrong, as any kernel compiled with CONFIG_NETGPU
> will fail the entire packetdrill test suite.
>
> Also the fixed 20 value is not pretty.
Would changing this into a sysctl be a suitable solution? The padding
really is a temporary workaround for hardware that doesn't support
header splitting, and adding a sysctl seems so permanent...
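
For reference, the padding logic in the quoted patch can be sketched in
userspace roughly as follows. This is only an illustration, not the kernel
code: the helper names opt_pad_size() and opt_pad() and the FIXED_OPT_LEN
and MAX_TCP_OPT_LEN macros are made up for this sketch, and 20 is simply
the hard-coded value from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TCPOPT_NOP       1   /* single-byte no-op option, as used in the patch */
#define FIXED_OPT_LEN   20   /* hard-coded target from the patch (assumption: not tunable) */
#define MAX_TCP_OPT_LEN 40   /* TCP option space is at most 40 bytes */

/* Hypothetical helper: NOP bytes needed to bring `size` up to the fixed length. */
static unsigned int opt_pad_size(unsigned int size)
{
	return size < FIXED_OPT_LEN ? FIXED_OPT_LEN - size : 0;
}

/*
 * Hypothetical helper: append NOPs after the `size` option bytes already
 * written into `opts` (which must hold MAX_TCP_OPT_LEN bytes) and return
 * the resulting option length.
 */
static unsigned int opt_pad(uint8_t *opts, unsigned int size)
{
	unsigned int pad = opt_pad_size(size);

	assert(size <= MAX_TCP_OPT_LEN);
	memset(opts + size, TCPOPT_NOP, pad);
	return size + pad;
}
```

With a 12-byte option block, opt_pad() appends 8 NOPs and returns 20. A
block that is already larger than 20 bytes is left untouched, so the
header size is only fixed while the options fit within 20 bytes.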
--
Jonathan