lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090520125609.GG4530@serverengines.com>
Date:	Wed, 20 May 2009 18:26:11 +0530
From:	Ajit Khaparde <ajitk@...verengines.com>
To:	David Miller <davem@...emloft.net>
Cc:	netdev@...r.kernel.org
Subject: Re: [net-next-2.6 PATCH][be2net] remove napi in the tx path and do
	tx completion processing in interrupt context

Comments inline.
 
Thanks
Ajit

On 19/05/09 15:13 -0700, David Miller wrote:
> From: Ajit Khaparde <ajitk@...verengines.com>
> Date: Tue, 19 May 2009 17:40:58 +0530
> 
> > This patch will remove napi in tx path and do Tx compleiton
> > processing in interrupt context.  This makes Tx completion
> > processing simpler without loss of performance.
> > 
> > Signed-off-by: Ajit Khaparde <ajitk@...verengines.com>
> 
> This is different from how every other NAPI driver does this.
> 
> You should have a single NAPI context, that handles both TX and RX
> processing.  Except, that for TX processing, no work budget
> adjustments are made.  You simply unconditionally process all pending
> TX work without accounting it into the POLL call budget.
> 
> I have no idea why this driver tried to split the RX and TX
> work like this, it accomplishes nothing but add overhead.
> Simply add the TX completion code to the RX poll handler
> and that's all you need to do.  Also, make sure to run TX
> polling work before RX polling work, this makes fresh SKBs
> available for responses generated by RX packet processing.
Splitting RX and TX interrupts helps with performance in certain work loads.
TX completion processing load is not significant in TCP traffic due to
TSO.  RX and TX completion processing load on same CPU limits throughput
in work loads that have significant UDP RX and TX.

> 
> I bet this is why you really saw performance problems, rather than
> something to do with running it directly in interrupt context.  There
> should be zero gain from that if you do the TX poll work properly in
> the RX poll handler.  When you free TX packets in hardware interrupt
> context using dev_kfree_skb_any() that just schedules a software
> interrupt to do the actual SKB free, which adds just more overhead for
> TX processing work.  You aren't avoiding software IRQ work by doing TX
> processing in the hardware interrupt handler, in fact you
> theoretically are doing more.
dev_kfree_skb_any() scheduling a software interrupt was something that we
had overlooked.

> 
> So the only conclusion I can come to is that what is important is
> doing the TX completion work before the RX packets get processed in
> the NAPI poll handler, and you accomplish that more efficiently and
> more properly by simply moving the TX completion work to the top of
> the RX poll handler code.
> 
> You also failed to provide any indication of what kind of
> performance change occurs due to this patch, you just say
> that it does occur.  This gives people very little confidence
> in your changes, nor does it give them an idea of the magnitude
> of the performance changes caused by this change.
> 
> The kind of analysis I did above is the kind of things I expect
> to find in a commit message that makes delicate changes like this
> for reasons of performance, but I see none of that.
> 
> I'm not applying this patch, and your commit messages are still way
> too terse.  You don't go into the testing you did, how you discovered
> the problem, what made you decide to solve the problem this way, why
> that approache works, nor the precise performance improvements you
> saw.  In short, you give zero information as to the merits of this
> change, and therefore nobody can review it properly.
We did not see any change in throughput or CPU utilization by doing TX
completion processing in interrupt context instead of the NAPI context.
The objective of the patch was to simplify the TX completion processing.

We will come back with a proper patch if necessary, after some more analysis.

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ