Date:	Mon, 02 Feb 2009 22:31:41 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Wes Chow <wchow@...enacr.com>
CC:	netdev@...r.kernel.org
Subject: Re: Multicast packet loss

Wes Chow wrote:
> 
> 
> Eric Dumazet wrote:
>> Wes Chow wrote:
>>> (I'm Kenny's colleague, and I've been doing the kernel builds)
>>>
>>> First I'd like to note that there were a lot of bnx2 NAPI changes
>>> between 2.6.21 and 2.6.22. As a reminder, 2.6.21 shows tiny amounts of
>>> packet loss, whereas loss in 2.6.22 is significant.
>>>
>>> Second, some CPU affinity info: if I do as Eric does and pin all of the
>>> apps onto a single CPU, I see no packet loss. Also, I do *not* see
>>> ksoftirqd show up in top at all!
>>>
>>> If I pin half the processes on one CPU and the other half on another
>>> CPU, one ksoftirqd process shows up in top and completely pegs one CPU.
>>> My packet loss in that case is significant (25%).
>>>
>>> Now, the strange case: if I pin 3 processes to one CPU and 1 process
>>> to another, I get about 25% packet loss and ksoftirqd pins one CPU.
>>> However, one of the apps takes significantly less CPU than the others,
>>> and all apps lose the *exact same number of packets*. In all other
>>> situations where we see packet loss, the actual number lost per
>>> application instance appears random.
>>
>> You see the same number of packets lost because they are lost at the NIC level
> 
> Understood.
> 
> I have a new observation: if I pin processes to just CPUs 0 and 1, I see
> no packet loss. Pinning to 0 and 2, I do see packet loss. Pinning 2 and
> 3, no packet loss. 4 & 5 - no packet loss, 6 & 7 - no packet loss. Any
> other combination appears to produce loss (though I have not tried all
> 28 combinations, this seems to be the case).
> 
> At first I thought maybe it had to do with processes pinned to the same
> physical CPU, but different cores. The machine is a dual quad core, which
> means that CPUs 0-3 should be one physical CPU, correct? Pinning to 0/2
> and 0/3 produces packet loss.
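
As a side note, the pinning described above is typically done with taskset (or
sched_setaffinity()); a minimal sketch, where the receiver binary name is only
a placeholder:

taskset -c 0 ./mcast_receiver &   # one instance pinned to CPU 0
taskset -c 2 ./mcast_receiver &   # another pinned to CPU 2, as in the 0/2 test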

A quad core is really a 2 x 2 core package.

The L2 cache is split into two blocks, one block used by CPU0/1, the other by CPU2/3.

You are at the limit of the machine with such a workload, so as soon as your
CPUs have to transfer 64-byte cache lines between those two L2 blocks, you lose.
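
You can check that topology via sysfs, assuming the cache files are exposed on
this kernel; index2 is typically the L2 cache, and the "level" file confirms it:

cat /sys/devices/system/cpu/cpu0/cache/index2/level
cat /sys/devices/system/cpu/cpu0/cache/index2/shared_cpu_map

Repeating this for cpu1..cpu7 maps out which CPUs share each L2 block.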


> 
> I've also noticed that it does not matter which of the working pairs I
> pin to. For example, pinning 5 processes in any combination across CPUs
> 0 and 1 produces no packet loss; pinning all 5 to just CPU 0 also
> produces no packet loss.
> 
> The failures are also sudden. In all of the working cases mentioned
> above, I don't see ksoftirqd in top at all. But when I run 6 processes
> on a single CPU, ksoftirqd shoots up to 100% and I lose a huge number of
> packets.
> 
>>
>> Normally, the softirq runs on the same CPU as the one handling the hard IRQ.
> 
> What determines which CPU the hard irq occurs on?
> 

Check /proc/irq/{irqnumber}/smp_affinity

If you want IRQ 16 to be served only by CPU0:

echo 1 >/proc/irq/16/smp_affinity
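
smp_affinity is a hexadecimal CPU bitmask, so allowing both CPU0 and CPU1 would
be, for instance (the IRQ number and interface name are placeholders, check
/proc/interrupts for the real ones):

echo 3 >/proc/irq/16/smp_affinity
grep eth0 /proc/interrupts    # shows which CPU actually takes the interrupts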

