Date:	Fri, 28 Aug 2015 16:12:57 -0700
From:	Joe Perches <joe@...ches.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	David Miller <davem@...emloft.net>,
	raghavendra.kt@...ux.vnet.ibm.com, edumazet@...gle.com,
	kuznet@....inr.ac.ru, jmorris@...ei.org, yoshfuji@...ux-ipv6.org,
	kaber@...sh.net, jiri@...nulli.us, hannes@...essinduktion.org,
	tom@...bertland.com, azhou@...ira.com, ebiederm@...ssion.com,
	ipm@...rality.org.uk, nicolas.dichtel@...nd.com,
	serge.hallyn@...onical.com, netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org, anton@....ibm.com,
	nacc@...ux.vnet.ibm.com, srikar@...ux.vnet.ibm.com
Subject: Re: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by
 walking all the percpu data at once

On Fri, 2015-08-28 at 15:29 -0700, Eric Dumazet wrote:
> On Fri, 2015-08-28 at 14:26 -0700, Joe Perches wrote:
> 1) u64 array[XX] on stack is naturally aligned,

Of course it is.

> kzalloc() won't improve this at all. Not sure what you believe.

An alloc would only reduce stack use.

Copying into the buffer, then copying the buffer into the
skb may be desirable on some arches though.

> 2) put_unaligned() is basically a normal memory write on x86.
>  memcpy(dst,src,...) will have a problem anyway on arches that care,
> because src & dst won't have the same alignment.

OK, so all the world's an x86?

On arm32, copying 288 bytes using nearly all aligned word
transfers is generally faster than using only unsigned
short transfers.
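
For illustration only, here is a userspace sketch of the two copy
strategies being compared. It is not the code from the patch: the
names put_unaligned_u64(), fill_direct(), fill_via_buffer(),
variants_agree() and NSTATS are all made up for this example, and
put_unaligned_u64() is just a memcpy-based stand-in for the kernel's
put_unaligned(). The sketch only shows that both strategies write the
same bytes to an arbitrarily aligned destination; which one is faster
depends on the arch and is not measured here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NSTATS 36	/* assumption: 36 * 8 = 288 bytes, the size discussed */

/* Hypothetical stand-in for put_unaligned() on a 64-bit value: a
 * fixed-size memcpy is safe for any destination alignment. */
void put_unaligned_u64(uint64_t val, void *dst)
{
	memcpy(dst, &val, sizeof(val));
}

/* Variant A: store each counter directly into the (possibly
 * unaligned) destination, one value at a time. Requires n <= NSTATS. */
void fill_direct(unsigned char *dst, const uint64_t *src, size_t n)
{
	for (size_t i = 0; i < n; i++)
		put_unaligned_u64(src[i], dst + i * sizeof(uint64_t));
}

/* Variant B: gather into a naturally aligned stack buffer first,
 * then do a single block copy. Requires n <= NSTATS. */
void fill_via_buffer(unsigned char *dst, const uint64_t *src, size_t n)
{
	uint64_t buff[NSTATS];

	for (size_t i = 0; i < n; i++)
		buff[i] = src[i];
	memcpy(dst, buff, n * sizeof(uint64_t));
}

/* Returns 1 if both variants write identical bytes when the
 * destination starts 'offset' bytes (0..8) into a byte array. */
int variants_agree(size_t offset)
{
	uint64_t src[NSTATS];
	unsigned char a[sizeof(src) + 8];
	unsigned char b[sizeof(src) + 8];

	if (offset > 8)
		return 0;
	for (size_t i = 0; i < NSTATS; i++)
		src[i] = 0x0102030405060708ULL * (i + 1);

	fill_direct(a + offset, src, NSTATS);
	fill_via_buffer(b + offset, src, NSTATS);
	return memcmp(a + offset, b + offset, sizeof(src)) == 0;
}
```

Variant B costs 288 bytes of stack (or an allocation) plus one extra
pass over the data; in exchange, the final copy is a single memcpy()
that the arch's implementation is free to do with wider transfers
where it can.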

> 288 bytes on stack in a leaf function in this path is totally fine, it
> is not like we're calling ext4/xfs/nfs code after this point.

Generally true.  It's always difficult to know how much
stack has been consumed though and smaller stack frames
are generally better.

Anyway, the block copy from either the alloc'd or stack
buffer amounts only to a slight performance improvement
for arm32.  It doesn't really have much other utility.

