Message-ID: <CALzJLG9AADcW0H3TNSDAKqu3d+me6T-R3pU9bVEk1wYLQvYVYw@mail.gmail.com>
Date:   Thu, 29 Nov 2018 17:03:09 -0800
From:   Saeed Mahameed <saeedm@....mellanox.co.il>
To:     maximmi@...lanox.com
Cc:     Saeed Mahameed <saeedm@...lanox.com>,
        Linux Netdev List <netdev@...r.kernel.org>,
        jasowang@...hat.com, Eric Dumazet <edumazet@...gle.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eran Ben Elisha <eranbe@...lanox.com>,
        Willem de Bruijn <willemb@...gle.com>,
        Tariq Toukan <tariqt@...lanox.com>
Subject: Re: Invalid transport_offset with AF_PACKET socket

On Wed, Nov 28, 2018 at 3:10 AM Maxim Mikityanskiy <maximmi@...lanox.com> wrote:
>
> Hi Saeed,
>
> > Can you elaborate? What NIC? What configuration? What do you mean by
> > confusion? Anyway, please see below.
>
> ConnectX-4, after running `mlnx_qos -i eth1 --trust dscp`, which sets inline
> mode 2 (MLX5_INLINE_MODE_IP). I'll explain what I mean by confusion below.
>
> > In mlx5 with ConnectX4 or ConnectX4-LX there is a requirement to copy at
> > least the Ethernet header to the TX descriptor, otherwise the packet might
> > be dropped. For RAW sockets the skb header offsets are not set, but the
> > latest upstream mlx5 driver knows how to handle this and copies the
> > minimum amount required.
> > Please see:
> >
> > static inline u16 mlx5e_calc_min_inline(enum mlx5_inline_modes mode,
> >                                       struct sk_buff *skb)
>
> Yes, I know that, and what I am doing is debugging an issue with this function.
>
> >
> > it should default to:
> >
> >
> > case MLX5_INLINE_MODE_L2:
> > default:
> >         hlen = mlx5e_skb_l2_header_offset(skb);
>
> The issue appears in MLX5_INLINE_MODE_IP. I haven't tested
> MLX5_INLINE_MODE_TCP_UDP yet, though.
>
> > So it should return at least 18 and not 14.
>
> Yes, the function does its best to return at least 18, but it silently expects
> skb_transport_offset to exceed 18. In normal conditions, it will be more than
> 18, because it will be at least 14 + 20. But in my case, when I send a packet
> via an AF_PACKET socket, skb_transport_offset returns 14 (which is nonsense),
> and the driver uses this value, causing the hardware to fail, because it's less
> than 18.
>

Got it, so even copying 18 bytes is not sufficient! If the packet is
IPv4 or IPv6 and the inline mode is set to MLX5_INLINE_MODE_IP in the
vport context, you must copy the IP headers as well!

But what do you expect from an AF_PACKET socket? To parse each and every
packet and set skb_transport_offset?
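
To make the failure mode discussed above concrete, here is a minimal
userspace sketch in C. It is a simplified model of the behaviour described
in this thread, not the driver source: the names inline_mode, pkt_offsets
and calc_min_inline_model are hypothetical, and the offsets are passed in
as plain integers instead of being read from an skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; they mirror the thread, not the driver. */
enum inline_mode { INLINE_MODE_NONE, INLINE_MODE_L2, INLINE_MODE_IP };

#define ETH_HDR_LEN  14
#define VLAN_TAG_LEN 4
#define MIN_INLINE   (ETH_HDR_LEN + VLAN_TAG_LEN) /* 18 bytes */

/* Offsets as a transmit path might see them for one packet. */
struct pkt_offsets {
    uint16_t network_offset;   /* start of the L3 header */
    uint16_t transport_offset; /* start of the L4 header, as reported */
    uint16_t headlen;          /* bytes available in the linear area */
};

static uint16_t min_u16(uint16_t a, uint16_t b) { return a < b ? a : b; }
static uint16_t max_u16(uint16_t a, uint16_t b) { return a > b ? a : b; }

/*
 * Model of the behaviour described in the thread: L2 mode is clamped to
 * at least MIN_INLINE, while IP mode trusts the reported transport offset
 * without applying any lower bound.
 */
uint16_t calc_min_inline_model(enum inline_mode mode,
                               const struct pkt_offsets *p)
{
    uint16_t hlen;

    assert(p != NULL);

    switch (mode) {
    case INLINE_MODE_NONE:
        return 0;
    case INLINE_MODE_IP:
        hlen = p->transport_offset; /* 14 here means "below the minimum" */
        break;
    case INLINE_MODE_L2:
    default:
        hlen = max_u16(p->network_offset, MIN_INLINE);
        break;
    }
    /* Never ask for more bytes than the linear area holds. */
    return min_u16(hlen, p->headlen);
}
```

Assuming an untagged 46-byte frame that is fully linear (network_offset =
14, headlen = 46), the model returns 18 in L2 mode, but 14 in IP mode when
transport_offset is reported as 14. With a transport offset of 34
(14 + 20) it returns 34 in IP mode.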

> > We had some issues with this in old drivers, such as in kernels 4.14/15,
> > and it depends on the use case, so I need some information first:
>
> No, it's not an old kernel. We actually have this bug in our internal bug
> tracking system, and I'm trying to resolve it.
>
> > 1. What cards do you have? (lspci)
>
> 03:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
> 03:00.1 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
> 81:00.0 Ethernet controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
>
> Testing with ConnectX-4.
>
> > 2. What kernel/driver version are you using?
>
> I'm on net-next-mlx5, commit 66a4b5ef638a (the latest when I started the
> investigation).
>
> > 3. What is the current enum mlx5_inline_modes seen in
> > mlx5e_calc_min_inline or sq->min_inline_mode?
>
> MLX5_INLINE_MODE_IP, as I said above.
>
> > 4. Firmware version? (ethtool -i)
>
> 12.22.0238 (MT_2190110032)
>
> > Can you share the format of the packet you are sending when you see the
> > bad behavior?
>
> Here is the hexdump of the simplest packet that causes the problem when it's
> sent through AF_PACKET after `mlnx_qos -i eth1 --trust dscp`:
>
> 00000000: 11 22 33 44 55 66 77 88 99 aa bb cc 08 00 45 00
> 00000010: 00 20 00 00 40 00 40 11 ae a5 c6 12 00 01 c6 12
> 00000020: 00 02 00 00 4a 38 00 0c 29 82 61 62 63 64
>
> (Please ignore the wrong UDP checksum and non-existent MACs; they don't matter
> at all. I tested it with completely valid packets as well. The wrong UDP
> checksum is due to a bug in our internal pypacket utility.)
>
> Thanks,
> Max
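
For reference, the transport header offset that the hexdump above implies
can be worked out by hand: 14 bytes of Ethernet header plus IHL * 4 = 20
bytes of IPv4 header. The sketch below does that arithmetic in userspace C.
It is only an illustration under stated assumptions (untagged frame, IPv4
only); sample_frame and ipv4_transport_offset are hypothetical names and
not part of any kernel or driver API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN    14
#define IPV4_MIN_HLEN  20
#define ETHERTYPE_IPV4 0x0800

/* The 46-byte frame from the hexdump above. */
const uint8_t sample_frame[] = {
    /* Ethernet: destination MAC, source MAC, EtherType 0x0800 */
    0x11, 0x22, 0x33, 0x44, 0x55, 0x66,
    0x77, 0x88, 0x99, 0xaa, 0xbb, 0xcc,
    0x08, 0x00,
    /* IPv4: version 4, IHL 5, total length 0x0020, protocol 0x11 (UDP) */
    0x45, 0x00, 0x00, 0x20, 0x00, 0x00, 0x40, 0x00,
    0x40, 0x11, 0xae, 0xa5,
    0xc6, 0x12, 0x00, 0x01,
    0xc6, 0x12, 0x00, 0x02,
    /* UDP: source port, destination port, length 0x000c, checksum */
    0x00, 0x00, 0x4a, 0x38, 0x00, 0x0c, 0x29, 0x82,
    /* Payload: "abcd" */
    0x61, 0x62, 0x63, 0x64,
};

/*
 * Returns the offset of the transport header for an untagged IPv4 frame,
 * or -1 if the frame is too short, is not IPv4, or has a malformed IHL.
 */
int ipv4_transport_offset(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;
    uint8_t version, ihl;
    size_t offset;

    assert(frame != NULL);

    if (len < ETH_HDR_LEN + IPV4_MIN_HLEN)
        return -1;

    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype != ETHERTYPE_IPV4)
        return -1;

    version = frame[ETH_HDR_LEN] >> 4;
    ihl = frame[ETH_HDR_LEN] & 0x0f; /* header length in 32-bit words */
    if (version != 4 || ihl < 5)
        return -1;

    offset = ETH_HDR_LEN + (size_t)ihl * 4;
    if (offset > len)
        return -1;

    return (int)offset;
}
```

Calling ipv4_transport_offset(sample_frame, sizeof(sample_frame)) yields
34, which is the 14 + 20 mentioned earlier in the thread, compared with the
14 that skb_transport_offset reports for this packet on the AF_PACKET path.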
