Message-Id: <20161124.110416.198867271899443489.davem@davemloft.net>
Date:   Thu, 24 Nov 2016 11:04:16 -0500 (EST)
From:   David Miller <davem@...emloft.net>
To:     pavel@....cz
Cc:     peppe.cavallaro@...com, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: stmmac ethernet in kernel 4.9-rc6: coalescing related pauses.

From: Pavel Machek <pavel@....cz>
Date: Thu, 24 Nov 2016 09:55:06 +0100

> Hi!
> 
>> I'm debugging strange delays during transmit in stmmac driver. They
>> seem to be present in 4.4 kernel (and older kernels, too). Workload is
>> burst of udp packets being sent, pause, burst of udp packets, ...
>> 
>> Test code is attached, I use these parameters for testing:
>> 
>> ./udp-test raw 10.0.0.6 1234 1000 100 30
>> 
>> The delays seem to be related to coalescing:
>> 
>> drivers/net/ethernet/stmicro/stmmac/common.h
>> #define STMMAC_COAL_TX_TIMER    40000
>> #define STMMAC_MAX_COAL_TX_TICK 100000
>> #define STMMAC_TX_MAX_FRAMES    256
>> 
>> If I lower the parameters, delays are gone, but I get netdev watchdog
>> backtrace followed by broken driver.
>> 
>> Any ideas what is going on there?
> 
> 4.9-rc6 still has the delays. With the
> 
> #define STMMAC_COAL_TX_TIMER 1000
> #define STMMAC_TX_MAX_FRAMES 2
> 
> settings, delays go away, and driver still works. (It fails fairly
> fast in 4.4). Good news. But the question still is: what is going on
> there?

256 packets looks way too large as a trigger for aborting the
TX coalescing timer.

Looking more deeply into this, the driver is using non-highres timers
to implement the TX coalescing.  This simply cannot work.

One jiffy (1/HZ seconds), which is the finest granularity of
non-highres timers in the kernel, varies with the kernel configuration
and is already too large a delay for effective TX coalescing.

I seriously think that the TX coalescing support should be ripped out
or disabled entirely until it is implemented properly in this driver.
