linux-kernel - Re: eth_type_trans(): Re: [Bug #11308] tbench regression on each kernel release from 2.6.22 -> 2.6.28


Message-ID: <4922818B.1020303@cosmosbay.com>
Date:	Tue, 18 Nov 2008 09:49:15 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	David Miller <davem@...emloft.net>, torvalds@...ux-foundation.org,
	rjw@...k.pl, linux-kernel@...r.kernel.org,
	kernel-testers@...r.kernel.org, cl@...ux-foundation.org,
	efault@....de, a.p.zijlstra@...llo.nl, shemminger@...tta.com
Subject: Re: eth_type_trans(): Re: [Bug #11308] tbench regression on each
 kernel release from 2.6.22 -> 2.6.28

Ingo Molnar wrote:
> * David Miller <davem@...emloft.net> wrote:
> 
>> From: Ingo Molnar <mingo@...e.hu>
>> Date: Mon, 17 Nov 2008 22:26:57 +0100
>>
>>> eth->h_proto access.
>> Yes, this is the first time a packet is touched on receive.
>>
>>> Given that this workload does localhost networking, my guess would be 
>>> that eth->h_proto is bouncing around between 16 CPUs? At minimum this 
>>> read-mostly field should be separated from the bouncing bits.
>> It's the packet contents, there is no way to "separate it".
>>
>> And it is unlikely to be bouncing on your system under tbench; the
>> senders and receivers should hang out on the same cpu unless
>> something completely stupid is happening.
>>
>> That's why I like running tbench with a num_threads command line
>> argument equal to the number of cpus, every cpu gets the two threads
>> talking to each other over the TCP socket.
> 
> yeah - and i posted the numbers for that too - it's the same 
> throughput, within ~1% of noise.

Thinking once again about the loopback driver, I recall a previous attempt
to call netif_receive_skb() directly instead of calling netif_rx() and
paying the price of cache line ping-pongs between cpus.

http://kerneltrap.org/mailarchive/linux-netdev/2008/2/21/939644

Maybe we could do that with a temporary per-cpu stack, as we do in softirq
when CONFIG_4KSTACKS=y

(arch/x86/kernel/irq_32.c: call_on_stack(func, stack))

And do this only if the current cpu doesn't already use its softirq_stack
(think about loopback re-entering loopback xmit because of a TCP ACK, for example)
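
A minimal user-space sketch of the re-entrancy guard described above, not
kernel code. It assumes a hypothetical per-cpu flag (on_rx_stack) standing in
for "this cpu already uses its softirq stack": the transmit path delivers a
packet directly when the flag is clear (the netif_receive_skb()-style path) and
defers it to a backlog when the flag is set (the netif_rx()-style path). All
names here (cpu_state, sketch_xmit, sketch_receive, sketch_drain) are invented
for illustration, and no stack switching is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BACKLOG_MAX 16

/* Hypothetical stand-in for per-cpu state; a kernel would use per-cpu variables. */
struct cpu_state {
    bool   on_rx_stack;           /* true while this cpu runs the receive path directly */
    int    backlog[BACKLOG_MAX];  /* packets deferred for later processing */
    size_t backlog_len;
    int    direct;                /* packets delivered directly */
    int    deferred;              /* packets queued because of re-entry */
};

static void sketch_xmit(struct cpu_state *cpu, int pkt);

/* Receive handler: an even packet triggers a reply (think of a TCP ACK)
 * that re-enters the transmit path on the same cpu. */
static void sketch_receive(struct cpu_state *cpu, int pkt)
{
    if (pkt % 2 == 0)
        sketch_xmit(cpu, pkt + 1);
}

/* Loopback-style transmit: deliver directly unless we are already nested. */
static void sketch_xmit(struct cpu_state *cpu, int pkt)
{
    if (cpu->on_rx_stack) {
        /* Nested call: defer instead of recursing further. */
        assert(cpu->backlog_len < BACKLOG_MAX);
        cpu->backlog[cpu->backlog_len++] = pkt;
        cpu->deferred++;
        return;
    }
    cpu->on_rx_stack = true;
    cpu->direct++;
    sketch_receive(cpu, pkt);
    cpu->on_rx_stack = false;
}

/* Process deferred packets in arrival order, as a later softirq pass might.
 * Replies generated while draining are appended and handled in the same loop. */
static void sketch_drain(struct cpu_state *cpu)
{
    cpu->on_rx_stack = true;
    for (size_t i = 0; i < cpu->backlog_len; i++)
        sketch_receive(cpu, cpu->backlog[i]);
    cpu->backlog_len = 0;
    cpu->on_rx_stack = false;
}
```

Calling sketch_xmit() on a zero-initialised cpu_state with an even packet
delivers it directly and defers the reply it generates; sketch_drain() then
empties the backlog.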

Oh well... black magic, you are going to kill me :)

