Date:   Tue, 13 Mar 2018 19:24:44 +0100
From:   Jakob Unterwurzacher <jakob.unterwurzacher@...obroma-systems.com>
To:     netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        John Fastabend <john.fastabend@...il.com>,
        "David S. Miller" <davem@...emloft.net>
Cc:     "linux-can@...r.kernel.org" <linux-can@...r.kernel.org>,
        Martin Elshuber <martin.elshuber@...obroma-systems.com>
Subject: [bug, bisected] pfifo_fast causes packet reordering

While stress-testing the SocketCAN driver for our "ucan" USB/CAN adapter
on Linux v4.16-rc4-383-ged58d66f60b3, we observed that a small fraction
of packets is delivered out of order.

We have tracked the problem down to the driver interface level: it
appears that the packets are already handed to the driver's
net_device_ops.ndo_start_xmit() function in the wrong order.
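
To illustrate how such reordering can be quantified, here is a minimal
sketch. It assumes every test frame carries a monotonically increasing
sequence counter and that the observed order has been logged; both the
counter and the helper name count_out_of_order are hypothetical and are
not taken from the ucan driver or from the test setup described here.

```python
def count_out_of_order(seq_numbers):
    """Count frames whose sequence number is lower than the highest
    sequence number seen before them (i.e. frames that arrived late)."""
    highest = None
    late = 0
    for seq in seq_numbers:
        if highest is not None and seq < highest:
            late += 1          # this frame was overtaken by a later one
        else:
            highest = seq      # new high-water mark, frame is in order
    return late


# Hypothetical log of sequence counters in the order they were observed.
observed = [0, 1, 2, 4, 3, 5, 6, 8, 7, 9]
print(count_out_of_order(observed))  # prints 2 (frames 3 and 7 are late)
```

Running the same check on the order seen at ndo_start_xmit() and on the
order seen by the receiver would show at which layer the frames get
swapped.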

This behavior was not observed on Linux v4.15, and I have bisected the
problem to this commit:

> commit c5ad119fb6c09b0297446be05bd66602fa564758
> Author: John Fastabend <john.fastabend@...il.com>
> Date:   Thu Dec 7 09:58:19 2017 -0800
> 
>    net: sched: pfifo_fast use skb_array
> 
>    This converts the pfifo_fast qdisc to use the skb_array data structure
>    and set the lockless qdisc bit. pfifo_fast is the first qdisc to support
>    the lockless bit that can be a child of a qdisc requiring locking. So
>    we add logic to clear the lock bit on initialization in these cases when
>    the qdisc graft operation occurs.
> 
>    This also removes the logic used to pick the next band to dequeue from
>    and instead just checks a per priority array for packets from top priority
>    to lowest. This might need to be a bit more clever but seems to work
>    for now.
> 
>    Signed-off-by: John Fastabend <john.fastabend@...il.com>
>    Signed-off-by: David S. Miller <davem@...emloft.net>
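
The dequeue strategy described in the commit message (check a per
priority array from top priority to lowest) can be modelled as below.
This is a toy, single-threaded sketch and not the kernel code: the class
ToyPfifoFast and its methods are hypothetical, plain deques stand in for
skb_array, and the lockless aspect of the real qdisc is not modelled at
all. It uses the three bands of pfifo_fast, with band 0 being the
highest priority.

```python
from collections import deque


class ToyPfifoFast:
    """Toy model of a strict-priority FIFO with one queue per band."""

    def __init__(self, bands=3):
        # One FIFO per priority band; index 0 is the highest priority.
        self.bands = [deque() for _ in range(bands)]

    def enqueue(self, band, packet):
        self.bands[band].append(packet)

    def dequeue(self):
        # Scan from top priority to lowest and take the first packet found.
        for queue in self.bands:
            if queue:
                return queue.popleft()
        return None  # all bands are empty


q = ToyPfifoFast()
q.enqueue(1, "a")
q.enqueue(1, "b")
q.enqueue(0, "c")
q.enqueue(1, "d")
order = [q.dequeue() for _ in range(4)]
print(order)  # prints ['c', 'a', 'b', 'd']
```

In this model, packets placed in the same band always leave in arrival
order; only packets in different bands can overtake each other. Whether
the same holds for the real implementation is exactly the open question.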

The patch does not revert cleanly, but checking out the commit
immediately before it makes the problem go away.

Selecting the "fq" scheduler instead of "pfifo_fast" makes the problem 
go away as well.

Is this an unintended side effect of the patch, or is there something
the driver has to do to request in-order delivery?

Thanks,
Jakob
