netdev - Re: [net-next-2.6 PATCH][be2net] remove napi in the tx path and do tx completion processing in interrupt context

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20090519.151334.87370735.davem@davemloft.net>
Date:	Tue, 19 May 2009 15:13:34 -0700 (PDT)
From:	David Miller <davem@...emloft.net>
To:	ajitk@...verengines.com
Cc:	netdev@...r.kernel.org
Subject: Re: [net-next-2.6 PATCH][be2net] remove napi in the tx path and do
 tx completion processing in interrupt context

From: Ajit Khaparde <ajitk@...verengines.com>
Date: Tue, 19 May 2009 17:40:58 +0530

> This patch will remove napi in tx path and do Tx compleiton
> processing in interrupt context.  This makes Tx completion
> processing simpler without loss of performance.
> 
> Signed-off-by: Ajit Khaparde <ajitk@...verengines.com>

This is different from how every other NAPI driver does this.

You should have a single NAPI context, that handles both TX and RX
processing.  Except, that for TX processing, no work budget
adjustments are made.  You simply unconditionally process all pending
TX work without accounting it into the POLL call budget.

I have no idea why this driver tried to split the RX and TX
work like this, it accomplishes nothing but add overhead.
Simply add the TX completion code to the RX poll handler
and that's all you need to do.  Also, make sure to run TX
polling work before RX polling work, this makes fresh SKBs
available for responses generated by RX packet processing.

I bet this is why you really saw performance problems, rather than
something to do with running it directly in interrupt context.  There
should be zero gain from that if you do the TX poll work properly in
the RX poll handler.  When you free TX packets in hardware interrupt
context using dev_kfree_skb_any() that just schedules a software
interrupt to do the actual SKB free, which adds just more overhead for
TX processing work.  You aren't avoiding software IRQ work by doing TX
processing in the hardware interrupt handler, in fact you
theoretically are doing more.

So the only conclusion I can come to is that what is important is
doing the TX completion work before the RX packets get processed in
the NAPI poll handler, and you accomplish that more efficiently and
more properly by simply moving the TX completion work to the top of
the RX poll handler code.

You also failed to provide any indication of what kind of
performance change occurs due to this patch, you just say
that it does occur.  This gives people very little confidence
in your changes, nor does it give them an idea of the magnitude
of the performance changes caused by this change.

The kind of analysis I did above is the kind of things I expect
to find in a commit message that makes delicate changes like this
for reasons of performance, but I see none of that.

I'm not applying this patch, and your commit messages are still way
too terse.  You don't go into the testing you did, how you discovered
the problem, what made you decide to solve the problem this way, why
that approache works, nor the precise performance improvements you
saw.  In short, you give zero information as to the merits of this
change, and therefore nobody can review it properly.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html