Message-ID: <4787248.HAfgOdBKVY@wuerfel>
Date:	Tue, 28 Jun 2016 11:34:26 +0200
From:	Arnd Bergmann <arnd@...db.de>
To:	Dongpo Li <lidongpo@...ilicon.com>
Cc:	f.fainelli@...il.com, robh+dt@...nel.org, mark.rutland@....com,
	davem@...emloft.net, xuejiancheng@...ilicon.com,
	netdev@...r.kernel.org, devicetree@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] net: hisilicon: Add Fast Ethernet MAC driver

On Tuesday, June 28, 2016 5:21:19 PM CEST Dongpo Li wrote:
> On 2016/6/15 5:20, Arnd Bergmann wrote:
> > On Tuesday, June 14, 2016 9:17:44 PM CEST Li Dongpo wrote:
> >> On 2016/6/13 17:06, Arnd Bergmann wrote:
> >>> On Monday, June 13, 2016 2:07:56 PM CEST Dongpo Li wrote:
> >>> Your tx function uses BQL to optimize the queue length, and that
> >>> is great. You also check xmit reclaim for rx interrupts, so
> >>> as long as you have both rx and tx traffic, this should work
> >>> great.
> >>>
> >>> However, I notice that you only have a 'tx fifo empty'
> >>> interrupt triggering the napi poll, so I guess on a tx-only
> >>> workload you will always end up pushing packets into the
> >>> queue until BQL throttles tx, and then get the interrupt
> >>> after all packets have been sent, which will cause BQL to
> >>> make the queue longer up to the maximum queue size, and that
> >>> negates the effect of BQL.
> >>>
> >>> Is there any way you can get a tx interrupt earlier than
> >>> this in order to get a more balanced queue, or is it ok
> >>> to just rely on rx packets to come in occasionally, and
> >>> just use the tx fifo empty interrupt as a fallback?
> >>>
> >> In tx direction, there are only two kinds of interrupts, 'tx fifo empty'
> >> and 'tx one packet finish'. I didn't use 'tx one packet finish' because
> >> it would lead to high hardware interrupts rate. This has been verified in
> >> our chips. It's ok to just use tx fifo empty interrupt.
> > 
> > I'm not convinced by the explanation; I don't think that has anything
> > to do with the hardware design, but instead is about the correctness
> > of the BQL logic in your driver.
> > 
> > Maybe your xmit function can do something like
> > 
> >       if (dql_avail(netdev_get_tx_queue(dev, 0)->dql) < 0)
> >               enable per-packet interrupt
> >       else
> >               use only fifo-empty interrupt
> > 
> > That way, you don't get a lot of interrupts when the system is
> > in a state of packets being received and sent continuously,
> > but if you get to the point where your tx queue fills up
> > and no rx interrupts arrive, you don't have to wait for it
> > to become completely empty before adding new packets, and
> > BQL won't keep growing the queue.
> > 
> Hi, Arnd
> I tried enabling the per-packet interrupt when the tx queue is full in the
> xmit function and disabling it in the NAPI poll. But the number of interrupts
> is a little higher than when using only the fifo-empty interrupt.

Right, I'd expect that to be the case; it basically means that the
algorithm works as expected.

Just to be sure you didn't have extra interrupts: you only enable the
per-packet interrupts if interrupts are currently enabled, not in
NAPI polling mode, right?

> On the other hand, this is a fast Ethernet MAC. Its maximum speed is 100Mbps.
> This speed is very easily achieved and the efficiency of the BQL is not
> so important. What we focus on is lower CPU utilization.
> So I think it is okay to just use the tx fifo empty interrupt.

BQL is not about efficiency, it's about keeping the latency down, which
is at least as important for low-throughput devices as it is for faster
ones. I don't think that disabling BQL here would be the right answer;
you'd just end up with the maximum TX queue length all the time.

Your queue length is 12 packets of 1500 bytes, meaning that you have 1.4ms
of latency at 100mbit/s rate, or 14ms at 10mbit/s. This is much less
than on most devices, but it's probably still worth using BQL on it.

	Arnd
