netdev - RE: [PATCH (net.git) 2/4] stmmac: fix and better tune the default buffer sizes


Message-ID: <063D6719AE5E284EB5DD2968C1650D6D0F6CCBB2@AcuExch.aculab.com>
Date:	Thu, 27 Feb 2014 13:31:52 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	'Giuseppe CAVALLARO' <peppe.cavallaro@...com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [PATCH (net.git) 2/4] stmmac: fix and better tune the default
 buffer sizes

From: Giuseppe CAVALLARO
...
> > Also (provided the hardware supports it) the rx buffers (are these
> > the ones being sized?) need to be aligned on a 4n+2 boundary in
> > order to avoid a realignment copy later on.
> 
> This is true and indeed I had added the STMMAC_ALIGN to align all.
> In the past to get the right alignment for SH4.
> 
> > So I'm not sure that some of these sizes are right and/or optimal.
> 
> What do you suggest?
> 
> Maybe, I can use a default for sure < 4KiB suitable to be used for VLAN
> frames (it will be aligned later).

Dunno... It rather depends on what the length is actually used for!
What you don’t want to be doing is adding 2 (for the 4n+2) and then
mallocing a 4096+2 byte buffer somewhere.
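A minimal sketch of the arithmetic behind that warning. None of it comes from the stmmac driver: the names RX_IP_ALIGN, ip_header_offset(), pow2_roundup(), naive_alloc_size() and carved_hw_len() are hypothetical, and the power-of-two rounding only stands in for an allocator with power-of-two size classes (an assumption).

```c
#include <assert.h>
#include <stddef.h>

#define RX_IP_ALIGN 2u   /* hypothetical name for the 4n+2 offset */
#define ETH_HDR_LEN 14u  /* destination MAC + source MAC + ethertype */

/* Offset of the IP header when the frame starts 'pad' bytes into the buffer. */
static size_t ip_header_offset(size_t pad)
{
	return pad + ETH_HDR_LEN;
}

/* Round up to the next power of two, standing in for an allocator
 * with power-of-two size classes (an assumption, not a kernel API). */
static size_t pow2_roundup(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Naive scheme: add the 2-byte pad on top of the hardware buffer length. */
static size_t naive_alloc_size(size_t hw_len)
{
	return pow2_roundup(hw_len + RX_IP_ALIGN);
}

/* Alternative: fix the allocation size first, then tell the hardware it
 * may only use what is left after the pad. */
static size_t carved_hw_len(size_t alloc_len)
{
	assert(alloc_len > RX_IP_ALIGN);
	return alloc_len - RX_IP_ALIGN;
}
```

With no pad the IP header sits at offset 14, which is 2 mod 4; with the 2-byte pad it sits at offset 16. Under the stated assumption, naive_alloc_size(4096) rounds 4098 up to 8192, whereas carving the pad out of a 4096-byte allocation leaves 4094 bytes for the hardware and wastes nothing.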

If the hardware does receive desegmentation, then you need to handle
the 64k+ receives somewhere.
If it doesn't, then it doesn't matter if the hardware rx buffer size is
slightly too large (e.g. for full-sized frames with a VLAN tag or PPPoE
encapsulation).
1536 bytes for the memory buffer avoids cache line sharing (read to
offset 2).
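One way to read the 1536 figure, as a hedged sketch: it is a multiple of common cache line sizes, so back-to-back buffers never share a line, and it still holds a full-sized frame plus the 2-byte offset with some room to spare. The 64-byte cache line, the overhead constants and the helper names below are assumptions for illustration, not values taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE     64u    /* assumed cache line size */
#define RX_BUF_SIZE    1536u  /* the size suggested above */
#define RX_OFFSET      2u     /* the 4n+2 receive offset */
#define ETH_FRAME_MAX  1518u  /* 14 header + 1500 payload + 4 FCS */
#define VLAN_TAG_LEN   4u     /* 802.1Q tag */
#define PPPOE_OVERHEAD 8u     /* 6-byte PPPoE header + 2-byte PPP protocol id */

/* Non-zero if buffers of this size, laid out back to back from a
 * line-aligned base, would have neighbours sharing a cache line. */
static int neighbours_share_line(size_t buf_size, size_t line)
{
	return buf_size % line != 0;
}

/* Bytes left in the buffer after the offset and a frame of frame_len. */
static size_t slack(size_t buf_size, size_t frame_len)
{
	assert(RX_OFFSET + frame_len <= buf_size);
	return buf_size - (RX_OFFSET + frame_len);
}
```

Here 1536 is 24 * 64, so neighbours_share_line(RX_BUF_SIZE, CACHE_LINE) is 0, and a VLAN-tagged full-sized frame at offset 2 ends at byte 1524, leaving 12 bytes of slack.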

The last ethernet driver I wrote from scratch (maybe 20 years ago) set
the rx-ring to point to an array of 512 byte buffers (last was shorter
to avoid an extra page) and did an aligned copy into the message buffer.
Only frames that crossed the ring end needed two copies.
ISTR making the copy be cache line aligned so that a special cache line
copy function could be used (I don't know if it ever was).
For that system the cost of the aligned data copies was less than the
complexity and cost of setting up the iommu.
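A rough sketch of the copy-out scheme described above, assuming the 512-byte rx buffers sit back to back so the ring can be treated as one contiguous region. The function names, the 2048-byte ring in the demo helper and the byte pattern are all made up for illustration; this is not code from that driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy 'len' bytes starting at ring offset 'start' into 'dst'.
 * Returns the number of memcpy() calls used: 1 normally, 2 when the
 * frame crosses the end of the ring. */
static int ring_copy_out(const unsigned char *ring, size_t ring_size,
			 size_t start, unsigned char *dst, size_t len)
{
	size_t first;

	assert(start < ring_size && len <= ring_size);
	first = ring_size - start;	/* bytes before the ring end */
	if (len <= first) {
		memcpy(dst, ring + start, len);
		return 1;
	}
	memcpy(dst, ring + start, first);	/* tail of the ring */
	memcpy(dst + first, ring, len - first);	/* wrapped remainder */
	return 2;
}

/* Demo helper: fill a 2048-byte ring (four 512-byte buffers) with a
 * known pattern, copy a frame out, and check every byte. Returns the
 * copy count, or -1 if the copied data is wrong. */
static int demo_copies(size_t start, size_t len)
{
	enum { RING_SIZE = 4 * 512 };
	static unsigned char ring[RING_SIZE];
	static unsigned char frame[RING_SIZE];
	size_t i;
	int copies;

	for (i = 0; i < RING_SIZE; i++)
		ring[i] = (unsigned char)(i * 7u);
	copies = ring_copy_out(ring, RING_SIZE, start, frame, len);
	for (i = 0; i < len; i++)
		if (frame[i] != ring[(start + i) % RING_SIZE])
			return -1;
	return copies;
}
```

A frame that lies wholly inside the ring, such as demo_copies(0, 100), takes one copy; one that runs past the end, such as demo_copies(2000, 100), takes two.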

	David
