lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 13 Mar 2018 09:44:44 +0100
From:   Steffen Klassert <steffen.klassert@...unet.com>
To:     Eric Dumazet <eric.dumazet@...il.com>
CC:     Yonghong Song <yhs@...com>, Daniel Borkmann <daniel@...earbox.net>,
        "Alexei Starovoitov" <ast@...com>, netdev <netdev@...r.kernel.org>,
        Martin Lau <kafai@...com>
Subject: Re: BUG_ON triggered in skb_segment

On Mon, Mar 12, 2018 at 11:25:09PM -0700, Eric Dumazet wrote:
> 
> 
> On 03/12/2018 11:08 PM, Yonghong Song wrote:
> > 
> > 
> > On 3/12/18 11:04 PM, Eric Dumazet wrote:
> > > 
> > > 
> > > On 03/12/2018 10:45 PM, Yonghong Song wrote:
> > > > ...
> > > > Setup:
> > > > =====
> > > > 
> > > > The test will involve three machines:
> > > >    M_ipv6 <-> M_nat <-> M_ipv4
> > > > 
> > > > The M_nat will do ipv4<->ipv6 address translation and then
> > > > forward packet
> > > > to proper destination. The control plane will configure M_nat properly
> > > > will understand virtual ipv4 address for machine M_ipv6, and
> > > > virtual ipv6 address for machine M_ipv4.
> > > > 
> > > > M_nat runs a bpf program, which is attached to clsact (ingress) qdisc.
> > > > The program uses bpf_skb_change_proto to do protocol conversion.
> > > > bpf_skb_change_proto will adjust skb header_len and len properly
> > > > based on protocol change.
> > > > After the conversion, the program will make proper change on
> > > > ethhdr and ip4/6 header, recalculate checksum, and send the packet out
> > > > through bpf_redirect.
> > > > 
> > > > Experiment:
> > > > ===========
> > > > 
> > > > MTU: 1500B for all three machines.
> > > > 
> > > > The tso/lro/gro are enabled on the M_nat box.
> > > > 
> > > > ping works on both ways of M_ipv6 <-> M_ipv4.
> > > > It works for transfering a small file (4KB) between M_ipv6 and
> > > > M_ipv4 (both ways).
> > > > Transfering a large file (e.g., 4MB) from M_ipv6 to M_ipv4,
> > > > failed with the above BUG_ON, really fast.
> > > > Did not really test from M_ipv4 to M_ipv6 with large file.
> > > > 
> > > > The error path likely to be (also from the above call stack):
> > > >    nic -> lro/gro -> bpf_program -> gso (BUG_ON)

Just out of curiosity, are these packets created with LRO or GRO?
Usually LRO is disabled if forwarding is enabled on a machine,
because segmented LGO packets are likely corrupt.

These packets take an alternative redirect path, so not sure what
happens here.

> > > > 
> > > > In one of experiments, I explicitly printed the skb->len and
> > > > skb->data_len. The values are below:
> > > >    skb_segment: len 2856, data_len 2686
> > > > They should be equal to avoid BUG.
> > > > 
> > > > In another experiment, I got:
> > > >    skb_segment: len 1428, data_len 1258
> > > > 
> > > > In both cases, the difference is 170 bytes. Not sure whether
> > > > this is just a coincidence or not.
> > > > 
> > > > Workaround:
> > > > ===========
> > > > 
> > > > A workaround to avoid BUG_ON is to disable lro/gro. This way,
> > > > kernel will not receive big packets and hence gso is not really called.
> > > > 
> > > > I am not familiar with gso code. Does anybody hit this BUG_ON before?
> > > > Any suggestion on how to debug this?
> > > > 
> > > 
> > > skb_segment() works if incoming GRO packet is not modified in its
> > > geometry.
> > > 
> > > In your case it seems you had to adjust gso_size (calling
> > > skb_decrease_gso_size() or skb_increase_gso_size()), and this breaks
> > > skb_segment() badly, because geometry changes, unless you had
> > > specific MTU/MSS restrictions.
> > > 
> > > You will have to make skb_segment() more generic if you really want this.
> > 
> > In net/core/filter.c function bpf_skb_change_proto, which is called
> > in the bpf program, does some GSO adjustment. Could you help check
> > whether it satisfies my above use case or not? Thanks!
> 
> As I said this  helper ends up modifying gso_size by +/- 20 (sizeof(ipv6
> header) - sizeof(ipv4 header))
> 
> So it wont work if skb_segment() is called after this change.

Even HW TSO use gso_size to segment the packets. Would'nt this
result in broken packets too, if gso_size is modified on a
forwarding path?

> 
> Not clear why the GRO packet is not sent as is (as a TSO packet) since
> mlx4/mlx5 NICs certainly support TSO.

If the packets are generated with GRO, there could be data chained
at the frag_list pointer. Most NICs can't offload such skbs, so if
skb_segment() can't split at the frag_list pointer, it will just
segment the packets based on gso_size.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ