lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACGkMEt-JpE0-WwEQYWLFijLoqRcWr9CV08otOR=Veg61aVcrA@mail.gmail.com>
Date: Wed, 13 Nov 2024 09:42:52 +0800
From: Jason Wang <jasowang@...hat.com>
To: Ilpo Järvinen <ij@...nel.org>
Cc: "Chia-Yu Chang (Nokia)" <chia-yu.chang@...ia-bell-labs.com>, 
	"Michael S. Tsirkin" <mst@...hat.com>, virtualization@...ts.linux.dev, 
	Eric Dumazet <edumazet@...gle.com>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, 
	"dsahern@...il.com" <dsahern@...il.com>, "davem@...emloft.net" <davem@...emloft.net>, 
	"dsahern@...nel.org" <dsahern@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>, 
	"joel.granados@...nel.org" <joel.granados@...nel.org>, "kuba@...nel.org" <kuba@...nel.org>, 
	"andrew+netdev@...n.ch" <andrew+netdev@...n.ch>, "horms@...nel.org" <horms@...nel.org>, 
	"pablo@...filter.org" <pablo@...filter.org>, "kadlec@...filter.org" <kadlec@...filter.org>, 
	"netfilter-devel@...r.kernel.org" <netfilter-devel@...r.kernel.org>, 
	"coreteam@...filter.org" <coreteam@...filter.org>, "ncardwell@...gle.com" <ncardwell@...gle.com>, 
	"Koen De Schepper (Nokia)" <koen.de_schepper@...ia-bell-labs.com>, 
	"g.white@...lelabs.com" <g.white@...lelabs.com>, 
	"ingemar.s.johansson@...csson.com" <ingemar.s.johansson@...csson.com>, 
	"mirja.kuehlewind@...csson.com" <mirja.kuehlewind@...csson.com>, "cheshire@...le.com" <cheshire@...le.com>, 
	"rs.ietf@....at" <rs.ietf@....at>, 
	"Jason_Livingood@...cast.com" <Jason_Livingood@...cast.com>, "vidhi_goel@...le.com" <vidhi_goel@...le.com>
Subject: Re: [PATCH v5 net-next 09/13] gro: prevent ACE field corruption &
 better AccECN handling

On Wed, Nov 13, 2024 at 5:19 AM Ilpo Järvinen <ij@...nel.org> wrote:
>
> Adding a few virtio people. Please see the virtio spec/flag question
> below.
>
> On Tue, 12 Nov 2024, Chia-Yu Chang (Nokia) wrote:
>
> > >From: Ilpo Järvinen <ij@...nel.org>
> > >Sent: Thursday, November 7, 2024 8:28 PM
> > >To: Eric Dumazet <edumazet@...gle.com>
> > >Cc: Chia-Yu Chang (Nokia) <chia-yu.chang@...ia-bell-labs.com>; netdev@...r.kernel.org; dsahern@...il.com; davem@...emloft.net; dsahern@...nel.org; pabeni@...hat.com; joel.granados@...nel.org; kuba@...nel.org; andrew+netdev@...n.ch; horms@...nel.org; pablo@...filter.org; kadlec@...filter.org; netfilter-devel@...r.kernel.org; coreteam@...filter.org; ncardwell@...gle.com; Koen De Schepper (Nokia) <koen.de_schepper@...ia-bell-labs.com>; g.white@...lelabs.com; ingemar.s.johansson@...csson.com; mirja.kuehlewind@...csson.com; cheshire@...le.com; rs.ietf@....at; Jason_Livingood@...cast.com; vidhi_goel@...le.com
> > >Subject: Re: [PATCH v5 net-next 09/13] gro: prevent ACE field corruption & better AccECN handling
> > >
> > >
> > >CAUTION: This is an external email. Please be very careful when clicking links or opening attachments. See the URL nok.it/ext for additional information.
> > >
> > >
> > >
> > >On Thu, 7 Nov 2024, Eric Dumazet wrote:
> > >
> > >>On Tue, Nov 5, 2024 at 11:07 AM <chia-yu.chang@...ia-bell-labs.com> wrote:
> > >>>
> > >>> From: Ilpo Järvinen <ij@...nel.org>
> > >>>
> > >>> There are important differences in how the CWR field behaves in
> > >>> RFC3168 and AccECN. With AccECN, CWR flag is part of the ACE counter
> > >>> and its changes are important so adjust the flags changed mask
> > >>> accordingly.
> > >>>
> > >>> Also, if CWR is there, set the Accurate ECN GSO flag to avoid
> > >>> corrupting CWR flag somewhere.
> > >>>
> > >>> Signed-off-by: Ilpo Järvinen <ij@...nel.org>
> > >>> Signed-off-by: Chia-Yu Chang <chia-yu.chang@...ia-bell-labs.com>
> > >>> ---
> > >>>  net/ipv4/tcp_offload.c | 4 ++--
> > >>>  1 file changed, 2 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c index
> > >>> 0b05f30e9e5f..f59762d88c38 100644
> > >>> --- a/net/ipv4/tcp_offload.c
> > >>> +++ b/net/ipv4/tcp_offload.c
> > >>> @@ -329,7 +329,7 @@ struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb,
> > >>>         th2 = tcp_hdr(p);
> > >>>         flush = (__force int)(flags & TCP_FLAG_CWR);
> > >>>         flush |= (__force int)((flags ^ tcp_flag_word(th2)) &
> > >>> -                 ~(TCP_FLAG_CWR | TCP_FLAG_FIN | TCP_FLAG_PSH));
> > >>> +                 ~(TCP_FLAG_FIN | TCP_FLAG_PSH));
> > >>>         flush |= (__force int)(th->ack_seq ^ th2->ack_seq);
> > >>>         for (i = sizeof(*th); i < thlen; i += 4)
> > >>>                 flush |= *(u32 *)((u8 *)th + i) ^ @@ -405,7 +405,7
> > >>> @@ void tcp_gro_complete(struct sk_buff *skb)
> > >>>         shinfo->gso_segs = NAPI_GRO_CB(skb)->count;
> > >>>
> > >>>         if (th->cwr)
> > >>> -               shinfo->gso_type |= SKB_GSO_TCP_ECN;
> > >>> +               shinfo->gso_type |= SKB_GSO_TCP_ACCECN;
> > >>>  }
> > >>>  EXPORT_SYMBOL(tcp_gro_complete);
> > >>>
> > >>
> > >> I do not really understand this patch. How a GRO engine can know which
> > >> ECN variant the peers are using ?
> > >
> > >Hi Eric,
> > >
> > >Thanks for taking a look.
> > >
> > >GRO doesn't know. Setting SKB_GSO_TCP_ECN in case of not knowing can result in header change that corrupts ACE field. Thus, GRO has to assume the RFC3168 CWR offloading trick cannot be used anymore (unless it tracks the connection and knows the bits are used for RFC3168 which is something nobody is going to do).
> > >
> > >The main point of SKB_GSO_TCP_ACCECN is to prevent SKB_GSO_TCP_ECN or NETIF_F_TSO_ECN offloading to be used for the skb as it would corrupt ACE field value.
> >
> > Hi Eric and Ilpo,
> >
> > From my understanding of another email thread (patch 08/13), it seems the conclusions that SKB_GSO_TCP_ACCECN will still be needed.
> >
> > >
> > >SKB_GSO_TCP_ACCECN doesn't allow CWR bits change within a super-skb but the same CWR flag should be repeated for all segments. In a sense, it's simpler than RFC3168 offloading.
> > >
> > >> SKB_GSO_TCP_ECN is also used from other points, what is your plan ?
> > >>
> > >> git grep -n SKB_GSO_TCP_ECN
> > >> drivers/net/ethernet/hisilicon/hns3/hns3_enet.c:3888:
> > >> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
> > >> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:1291:
> > >> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
> > >> drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:1312:
> > >> skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
> > >
> > >These looked like they should be just changed to set SKB_GSO_TCP_ACCECN instead.
> >
> > I agree with these changes and will apply them in the next version.
> >
> > >
> > >I don't anymore recall why I didn't change those when I made this patch long time ago, perhaps it was just an oversight or things have changed somehow since then.
> > >
> > >> include/linux/netdevice.h:5061: BUILD_BUG_ON(SKB_GSO_TCP_ECN !=
> > >> (NETIF_F_TSO_ECN >> NETIF_F_GSO_SHIFT));
> > >> include/linux/skbuff.h:664:     SKB_GSO_TCP_ECN = 1 << 2,
> > >
> > >Not relevant.
> > >
> > >> include/linux/virtio_net.h:88:                  gso_type |= SKB_GSO_TCP_ECN;
> > >> include/linux/virtio_net.h:161:         switch (gso_type & ~SKB_GSO_TCP_ECN) {
> > >> include/linux/virtio_net.h:226:         if (sinfo->gso_type & SKB_GSO_TCP_ECN)
> > >
> > >These need to be looked further what's going on as UAPI is also
> > > involved here.
> >
> > I had a look at these parts, and only the 1st and 3rd ones are relevant.
> > Related to the 1st one, I propose to change
> > from
> >
> >                 if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN)
> >                         gso_type |= SKB_GSO_TCP_ECN;
> >
> > to
> >
> >                 if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN)
> >                         gso_type |= SKB_GSO_TCP_ACCECN;
> >
> > The reason behind this proposed change is similar as the above changes
> > in en_rx.c.
>
> But en_rx.c is based one CWR flag, there's no similar check here.
>
> > For the 3rd one, I suggest to change from
> >
> >                 if (sinfo->gso_type & SKB_GSO_TCP_ECN)
> >                         hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
> >
> > to
> >
> >                 if (sinfo->gso_type & (SKB_GSO_TCP_ECN | SKB_GSO_TCP_ACCECN))
> >                         hdr->gso_type |= VIRTIO_NET_HDR_GSO_ECN;
> >
> > This proposed change is because VIRTIO_NET_HDR_GSO_ECN must be set to
> > allow TCP packets requiring segmentation offload which have ECN bit set.
> > So, no matter whether skb gso_type have GSO_TCP_ECN flag or
> > GSO_TCP_ACCECN flag, the corresponding hdr gso_type shall be set.
> >
> > But, I wonder what would the driver do when with VIRTIO_NET_HDR_GSO_ECN
> > flag. Will it corrupts CWR or not?
>
> I'm starting to heavily question what is the meaning of that
> VIRTIO_NET_HDR_GSO_ECN flag and if it's even consistent...
>
> https://github.com/qemu/qemu/blob/master/net/eth.c
>
> That sets VIRTIO_NET_HDR_GSO_ECN based on CE?? (Whereas kernel associates
> the related SKB_GSO_TCP_ECN to CWR offloading.)
>
> The virtio spec is way too vague to be useful so it would not be
> surprising if there are diverging interpretations from implementers:
>
> "If the driver negotiated the VIRTIO_NET_F_HOST_ECN feature, the
> VIRTIO_NET_HDR_GSO_ECN bit in gso_type indicates that the TCP packet has
> the ECN bit set."
>
> What is "the ECN bit" in terms used by TCP (or IP)? Could some virtio
> folks please explain?

According to the current Linux implementation in virtio_net_hdr_to_skb():

                if (hdr->gso_type & VIRTIO_NET_HDR_GSO_ECN)
                        gso_type |= SKB_GSO_TCP_ECN;

It mapps to SKB_GSO_TCP_ECN which is:

        /* This indicates the tcp segment has CWR set. */
        SKB_GSO_TCP_ECN = 1 << 2,

Thanks

>
>
> --
>  i.


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ