Date:	Wed, 28 Oct 2009 02:54:37 -0700
From:	"Vladislav Zolotarov" <vladz@...adcom.com>
To:	"David Miller " <IMCEAMAILTO-davem+40davemloft+2Enet@...adcom.com>
cc:	"Eilon Greenstein" <eilong@...adcom.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next] bnx2x: Do Tx handling in a separate
 tasklet.

I'd like to start from your last remark: you are absolutely right, and this is the problem we have in the current net-next driver. Moreover, this patch fixes that problem: it moves liberation of Tx SKBs from hardIRQ context (ISR) to softIRQ context (a tasklet), thereby resolving the problem you mentioned. So I totally agree with you on this one. I should have named the patch differently to emphasize it.

I'd like to summarize the patch I've sent:
- Take Tx SKB liberation out of hardIRQ.
- Instead, schedule a DPC (a tasklet) that handles Tx work.
- Optimize the access to status block indices: read only the index we are about to use in the current context.
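To illustrate the three points above, here is a minimal user-space sketch in C. It is not the bnx2x code: struct status_block, struct tx_queue, tx_isr() and tx_tasklet() are hypothetical names invented for this example, and the tasklet_scheduled flag only models what a real tasklet_schedule() call would do. The hard-IRQ part frees nothing, and the deferred part reads only the Tx index it is about to use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware status block. */
struct status_block {
    uint16_t tx_cons;   /* Tx consumer index written by the "hardware" */
    uint16_t rx_prod;   /* Rx producer index, never read by the Tx path */
};

/* Hypothetical per-queue Tx state. */
struct tx_queue {
    struct status_block *sb;
    uint16_t sw_cons;        /* next descriptor the driver will reclaim */
    bool tasklet_scheduled;  /* models the pending state of a tasklet */
    unsigned freed;          /* number of buffers released so far */
};

/* Hard-IRQ part: free nothing, only mark the deferred work as pending. */
static void tx_isr(struct tx_queue *q)
{
    assert(q != 0);
    q->tasklet_scheduled = true;
}

/* Deferred part: read only the Tx index and reclaim buffers up to it. */
static unsigned tx_tasklet(struct tx_queue *q)
{
    unsigned done = 0;
    uint16_t hw_cons;

    assert(q != 0 && q->sb != 0);
    if (!q->tasklet_scheduled)
        return 0;
    q->tasklet_scheduled = false;

    hw_cons = q->sb->tx_cons;   /* the only status block field read here */
    while (q->sw_cons != hw_cons) {
        q->sw_cons++;           /* 16-bit index wraps naturally */
        q->freed++;             /* stands in for freeing one SKB */
        done++;
    }
    return done;
}
```

With tx_cons set to 3 and tx_isr() called once, nothing is freed until tx_tasklet() runs, and a second tx_tasklet() call without a new interrupt does no work.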

So, could you please apply the patch in order to fix the problem we currently have in bnx2x?

Bullet 1 is correct but not complete: what about the SKBs needed for filling the Rx ring, and what about Tx-only scenarios where you'd prefer to liberate SKBs from start_xmit()? Generally, we'd like to do SKB liberation both from start_xmit() and from NAPI. Doing it the straightforward way would make us take a tx_lock from inside NAPI, and this is something we surely don't want to do. We are working on this optimization at the moment and it will be the topic of one of the next patches.

Regarding your second bullet: when I said "low CPU consumption" about Tx work in my previous e-mail, I meant that the CPU cost per packet is much lower for Tx than for Rx. Sorry for being unclear.

Best regards
vlad


-----Original Message-----
From: David Miller [mailto:davem@...emloft.net] 
Sent: Tuesday, October 27, 2009 12:28 AM
To: Vladislav Zolotarov
Cc: Eilon Greenstein; netdev@...r.kernel.org
Subject: Re: [PATCH net-next] bnx2x: Do Tx handling in a separate tasklet.

From: "Vladislav Zolotarov" <vladz@...adcom.com>
Date: Mon, 26 Oct 2009 07:42:27 -0700

> The separation of Tx and Rx interrupt handling gives us the
> possibility to properly affinitize the Rx (heavy CPU consuming task)
> and Tx (low CPU consuming task) and to ensure that Tx work is done
> not long after the Tx interrupt without interference of Rx work thus
> letting the user benefit from Tx coalescing configuration in order
> to achieve the best performance in each specific scenario. This is
> most important in heavy load scenarios with mixed traffic (UDP + TCP
> for instance). If we didn't separate Tx and Rx interrupt handling Tx
> coalescing configuration was not worth much.

There are other issues:

1) Actually, it makes sense to do TX and RX work together, since TX
   packet liberation makes fresh CPU local packets available for
   responses generated by RX packet reception.

2) TX packet liberation is not low CPU consumption, it has to perform
   many atomic instructions, reference socket state, enter the SLAB
   allocator, potentially liberate netfilter state, etc.

Using NAPI also moves the TX freeing into softirq context.

If you do it from a hardirq you are making it more expensive.  From
hardirq the free just puts the SKB on a list, schedules a softirq,
then does the real SKB free work from the softirq.

This needless SKB list management and softirq scheduling you'll
avoid if you do things from softirqs, and thus using NAPI makes
sense here.
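The cost difference described above can be sketched as a small user-space model in C. This is an illustration only, not kernel code: struct buf, struct cpu_state and all function names are hypothetical, and list_ops simply counts the bookkeeping steps that the direct softirq path never pays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet buffer. */
struct buf {
    struct buf *next;
    bool freed;
};

/* Hypothetical per-CPU state for deferred frees. */
struct cpu_state {
    struct buf *completion_list; /* buffers waiting for the softirq */
    bool softirq_pending;
    unsigned list_ops;           /* extra bookkeeping steps performed */
};

/* The real (expensive) free work; guards against a double free. */
static void real_free(struct buf *b)
{
    assert(!b->freed);
    b->freed = true;
}

/* From hardirq: queue the buffer and request a softirq, free nothing yet. */
static void free_from_hardirq(struct cpu_state *cpu, struct buf *b)
{
    b->next = cpu->completion_list;
    cpu->completion_list = b;
    cpu->softirq_pending = true;
    cpu->list_ops++;
}

/* From softirq (for example a NAPI poll): free directly, no list needed. */
static void free_from_softirq(struct buf *b)
{
    real_free(b);
}

/* Softirq handler: drain the list and do the deferred free work. */
static unsigned run_softirq(struct cpu_state *cpu)
{
    unsigned n = 0;
    struct buf *b = cpu->completion_list;

    cpu->completion_list = NULL;
    cpu->softirq_pending = false;
    while (b) {
        struct buf *next = b->next;
        real_free(b);
        cpu->list_ops++;
        n++;
        b = next;
    }
    return n;
}
```

In this model, every buffer freed from hardirq costs two list operations plus a softirq run before the actual free happens, while a buffer freed from softirq context goes straight to real_free().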

