Message-ID: <alpine.DEB.2.00.1103170932430.12540@router.home>
Date:	Thu, 17 Mar 2011 09:40:33 -0500 (CDT)
From:	Christoph Lameter <cl@...ux.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
cc:	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-arch@...r.kernel.org, netdev <netdev@...r.kernel.org>,
	Netfilter Development Mailinglist 
	<netfilter-devel@...r.kernel.org>
Subject: Re: Poll about irqsafe_cpu_add and others

On Thu, 17 Mar 2011, Eric Dumazet wrote:

> irqsafe_cpu_{dec|inc} are used in network stack since 2.6.37 (commit
> 29b4433d991c88), and I would like to use irqsafe_cpu_add() in netfilter
> fast path too, and SNMP counters eventually (to lower ram needs by 50%)
>
> Initial support of irqsafe_ was given by Christoph in 2.6.34
>
> It seems only x86 arch is using a native and efficient implementation.

I have some draft(y old) patches for IA64 that use fetchadd together with
a per cpu virtual address range mapped differently for each processor. In
general, though, the problem with other arches is that they have neither
instructions that avoid the expensive bus arbitration for a cacheline nor
segment overrides.

The segment override's effect, implicit relocation of the address to the
correct percpu area within one instruction, can also be accomplished (as
in the IA64 case) if we have per cpu page tables for the kernel and map
the same virtual address to different physical addresses for each cpu. Then
the address given to a percpu instruction is constant and the relocation
is performed implicitly by the MMU. Thus the relocation and the RMW
operation are "atomic": either both occur or neither does.
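
To illustrate why it matters that relocation and RMW cannot be separated,
here is a small user-space model in C. It is only a sketch: the names
percpu_counter, current_cpu, add_two_step and add_one_step are made up for
this example, the "CPUs" are array slots, and the migration is simulated by
an explicit parameter rather than by a real scheduler or interrupt.

```c
#define NR_CPUS 2

/* Hypothetical model: one counter slot per simulated CPU. */
static long percpu_counter[NR_CPUS];

/* CPU the simulated task currently runs on. */
static int current_cpu;

/*
 * Two-step update: compute the per cpu address first, then do the RMW.
 * migrate_to >= 0 models the task being moved to another CPU in the
 * window between the two steps.
 */
static void add_two_step(long val, int migrate_to)
{
    long *p = &percpu_counter[current_cpu];  /* step 1: relocation */

    if (migrate_to >= 0)
        current_cpu = migrate_to;            /* window between the steps */

    *p += val;                               /* step 2: RMW via a stale address */
}

/*
 * Model of a single-instruction update: there is no point at which the
 * relocation has happened but the RMW has not.
 */
static void add_one_step(long val)
{
    percpu_counter[current_cpu] += val;
}
```

In the two-step variant, a migration in the window leaves the update in the
slot of the CPU the task has just left, where it can collide with a
non-atomic update made by whatever runs there now. The usual way to close
that window without a suitable instruction is to disable preemption or
interrupts around both steps, which is the cost the single-instruction form
avoids.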

The other aspect is that the arch must have cheap increment (and other
RMW) instructions that avoid expensive in-cpu processing to establish a
coherent state of the cacheline.

> What about defining a HAVE_FAST_IRQSAFE_ADD ?

Useful at some point I think.
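
A rough sketch of how code might key off such a define, in user-space C.
Only the name HAVE_FAST_IRQSAFE_ADD comes from the proposal above; its
semantics are not settled in this thread. Everything else here
(COUNTER_COPIES, struct stat_counter, stat_add, stat_read, and the idea of
keeping one counter copy per execution context when no fast irqsafe add
exists) is an assumption made for illustration, as one possible reading of
the 50% figure quoted above.

```c
/*
 * Hypothetical: an arch would define this when its irqsafe add is a
 * single cheap instruction.
 */
/* #define HAVE_FAST_IRQSAFE_ADD 1 */

#ifdef HAVE_FAST_IRQSAFE_ADD
#define COUNTER_COPIES 1   /* all contexts share one copy */
#else
#define COUNTER_COPIES 2   /* assumed: one copy per context */
#endif

enum ctx { CTX_PROCESS = 0, CTX_IRQ = 1 };

struct stat_counter {
    unsigned long v[COUNTER_COPIES];
};

/* Add to the copy belonging to the calling context. */
static void stat_add(struct stat_counter *c, enum ctx ctx, unsigned long val)
{
    /* With a single copy, every context maps to slot 0. */
    c->v[ctx % COUNTER_COPIES] += val;
}

/* Readers sum all copies, so they see the same total either way. */
static unsigned long stat_read(const struct stat_counter *c)
{
    unsigned long sum = 0;

    for (int i = 0; i < COUNTER_COPIES; i++)
        sum += c->v[i];
    return sum;
}
```

In this model the define only changes the storage per counter; callers of
stat_add and stat_read stay the same.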

