Message-ID: <CAGVrzca=L1XxrFXJH03iYOjY3LttQ_dvYgOHdavz_8W+iL6iWw@mail.gmail.com>
Date:	Mon, 18 Nov 2013 09:58:38 -0800
From:	Florian Fainelli <f.fainelli@...il.com>
To:	Willy Tarreau <w@....eu>
Cc:	Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>,
	simon.guinot@...uanux.org, netdev <netdev@...r.kernel.org>,
	Arnaud Ebalard <arno@...isbad.org>, edumazet@...gle.com,
	Cong Wang <xiyou.wangcong@...il.com>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [BUG,REGRESSION?] 3.11.6+,3.12: GbE iface rate drops to few KB/s

Hello Willy, Thomas,

2013/11/18 Willy Tarreau <w@....eu>:
> Hi Thomas,
>
> On Mon, Nov 18, 2013 at 11:26:01AM +0100, Thomas Petazzoni wrote:
>> I haven't read the entire discussion yet, but do you guys have
>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/clk/mvebu?id=1022c75f5abd3a3b25e679bc8793d21bedd009b4
>> applied? It got merged recently, and it fixes a number of networking
>> problems on Armada 370.
>
> No, because my version was even older than the code which introduced this
> issue :-)
>
> The main issue is related to something we discussed a while ago which surprised
> both of us, the use of a Tx timer to release the Tx descriptors. I remember
> I considered that it was not a big issue because the flush was also done in
> the Rx path (thus on ACKs) but I can't find trace of this code so my analysis
> was wrong. Thus we can hit some situations where we fill the descriptors
> before filling the link.

As long as you are using TCP this works, because the ACKs will somehow
create an artificial "forced" completion of your transmitted SKBs. But
what about a UDP streamer use case? In that case you will quickly fill
up all of your descriptors and have to wait for them to be freed by the
10ms timer. I do not think this is desirable at all, and it will
require very large UDP sender socket buffers. I remember asking Thomas
why the TX completion IRQ was not used during the first incarnation of
the driver, but I do not quite remember what the answer was.
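
To put a rough number on the UDP case: in a simplified model where
descriptors are reclaimed only when the timer fires, at most one full
ring of packets can leave per timer period. The sketch below computes
that ceiling. Only the 10ms period comes from the discussion above; the
function name, the ring size and the packet size are illustrative
assumptions, not values taken from mvneta.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Upper bound on TX throughput, in bits per second, when descriptors
 * are reclaimed only once per timer period: no more than one ring of
 * packets can be queued between two reclaim passes.
 */
static uint64_t tx_ceiling_bps(uint32_t ring_size, uint32_t pkt_bytes,
                               uint32_t timer_period_ms)
{
    assert(timer_period_ms > 0);
    return (uint64_t)ring_size * pkt_bytes * 8u * 1000u / timer_period_ms;
}
```

With an assumed 256-descriptor ring and 1500-byte packets,
tx_ceiling_bps(256, 1500, 10) gives 307200000, i.e. about 307 Mbit/s,
well short of GbE line rate.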

If the original mvneta driver authors' fear was that TX completion
could generate too many IRQs, they should use netif_stop_queue() /
netif_wake_queue() and mask the interrupts off and on appropriately to
slow down the pace of TX interrupts.
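
One possible reading of that suggestion, as a user-space model rather
than driver code: keep the TX-done interrupt masked while the ring has
room, stop the queue and unmask it when the ring fills, then wake the
queue and mask again once enough descriptors have been reclaimed. The
struct, the function names and the wake threshold are hypothetical; the
comments mark where a real driver would call netif_stop_queue() and
netif_wake_queue().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a TX ring; not mvneta code. */
struct tx_ring {
    uint32_t size;          /* total descriptors */
    uint32_t in_flight;     /* handed to hardware, not yet reclaimed */
    uint32_t wake_thresh;   /* free descriptors needed before waking */
    bool queue_stopped;     /* models the stopped/woken queue state */
    bool tx_irq_enabled;    /* models the TX-done interrupt mask */
};

/* Transmit path: returns false when the packet cannot be queued. */
static bool ring_xmit(struct tx_ring *r)
{
    if (r->queue_stopped || r->in_flight == r->size)
        return false;

    r->in_flight++;
    if (r->in_flight == r->size) {
        r->queue_stopped = true;    /* driver: netif_stop_queue() */
        r->tx_irq_enabled = true;   /* unmask TX-done to learn of free space */
    }
    return true;
}

/* TX-done handler: reclaim up to `done` completed descriptors. */
static void ring_tx_done(struct tx_ring *r, uint32_t done)
{
    if (done > r->in_flight)
        done = r->in_flight;
    r->in_flight -= done;

    if (r->queue_stopped && r->size - r->in_flight >= r->wake_thresh) {
        r->queue_stopped = false;   /* driver: netif_wake_queue() */
        r->tx_irq_enabled = false;  /* mask again until the ring refills */
    }
}
```

In this model the interrupt only fires while the queue is stopped, so
the IRQ rate is bounded by how often the ring fills rather than by the
packet rate.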

>
> Ideally we should have a Tx IRQ. At the very least we should call the tx
> refill function in mvneta_poll() I believe. I can try to do it but I'd
> rather have the Tx IRQ working instead.

Right. Actually you should do both: free transmitted SKBs from your
NAPI poll callback and from the TX completion IRQ, to ensure SKBs are
freed in time no matter what workload or use case is running.
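
A minimal sketch of the "do both" idea, again as a user-space model
with hypothetical names: both entry points call the same reclaim
helper, so whichever runs first frees the completed packets and the
other finds nothing left to do.

```c
#include <stdint.h>

/* Hypothetical TX queue counters; not mvneta code. */
struct txq_model {
    uint32_t sent;      /* packets handed to hardware */
    uint32_t hw_done;   /* packets the hardware has finished sending */
    uint32_t freed;     /* packets whose buffers have been released */
};

/* Release every completed-but-unfreed packet; returns how many. */
static uint32_t txq_reclaim(struct txq_model *q)
{
    uint32_t n = q->hw_done - q->freed;

    q->freed = q->hw_done;
    return n;
}

/* Models the NAPI poll callback: RX work would happen here too. */
static uint32_t model_napi_poll(struct txq_model *q)
{
    return txq_reclaim(q);
}

/* Models the TX completion interrupt path. */
static uint32_t model_tx_done_irq(struct txq_model *q)
{
    return txq_reclaim(q);
}
```

A real driver would also have to serialize the two paths, for example
by doing the actual reclaim only in NAPI context; the model leaves that
out.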
-- 
Florian