Message-ID: <47BBEF95.6010307@bigtelecom.ru>
Date: Wed, 20 Feb 2008 12:15:01 +0300
From: Badalian Vyacheslav <slavon@...telecom.ru>
To: "Brandeburg, Jesse" <jesse.brandeburg@...el.com>
CC: netdev@...r.kernel.org
Subject: Re: e1000: Question about polling
Very big thanks for this answer. You answered all my questions, and all
my future questions too. Thanks again!
> Badalian Vyacheslav wrote:
>
>> Hello all.
>>
>> Interesting thing:
>>
>> I have a PC that does NAT. Bandwidth is about 600 Mbit/s.
>>
>> It has 4 CPUs (2x Core 2 Duo, HT off, 3.2 GHz).
>>
>> irqbalance in kernel is off.
>>
>> nat2 ~ # cat /proc/irq/217/smp_affinity
>> 00000001
>>
> this binds all irq 217 interrupts to cpu 0
>
>
>> nat2 ~ # cat /proc/irq/218/smp_affinity
>> 00000003
>>
>
> do you mean to be balancing interrupts between core 1 and 2 here?
> 1 = cpu 0
> 2 = cpu 1
> 4 = cpu 2
> 8 = cpu 3
>
> so 1+2 = 3 for irq 218, ie balancing between the two.
>
> sometimes the cpus will have a paired cache; depending on your BIOS it
> will be organized like cpu 0/2 = shared cache and cpu 1/3 = shared
> cache.
> You can find this out by looking at "physical id" and "core id" in
> /proc/cpuinfo.
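
To make the mask arithmetic concrete, here is a small standalone C
sketch (only an illustration, not code from the thread or the driver;
the IRQ and CPU numbers are just the examples from above, and it needs
root to actually write the file):

#include <stdio.h>

/* Build the smp_affinity mask for a single CPU: bit N selects CPU N,
 * so cpu 0 -> 1, cpu 1 -> 2, cpu 2 -> 4, cpu 3 -> 8. */
static unsigned long cpu_to_mask(int cpu)
{
        return 1UL << cpu;
}

int main(void)
{
        int irq = 217;          /* example IRQ from the thread */
        int cpu = 0;            /* pin it to cpu 0 */
        char path[64];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
        f = fopen(path, "w");
        if (!f) {
                perror(path);
                return 1;
        }
        /* the kernel expects a hex bitmask, e.g. "3" = cpu0 + cpu1 */
        fprintf(f, "%lx\n", cpu_to_mask(cpu));
        fclose(f);
        return 0;
}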
>
>
>> SI (softirq) load on CPU0 and CPU1 is about 90%.
>>
>> Good... then try:
>> echo ffffffff > /proc/irq/217/smp_affinity
>> echo ffffffff > /proc/irq/218/smp_affinity
>>
>> Now I get 100% SI on CPU0.
>>
>> Question: why?
>>
>
> because as each adapter generating interrupts gets rotated through cpu0,
> it gets "stuck" on cpu0: the napi scheduling can only run one poll at a
> time, so each adapter is always waiting in line behind the other to run
> its napi poll, always fills its quota (work_done is always != 0), and
> keeps interrupts disabled "forever"
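
A minimal user-space model of that behaviour (a sketch only; this is not
the real e1000_clean(), and BUDGET and process_one_packet() are made-up
stand-ins for the driver's weight and rx cleanup):

#include <stdbool.h>
#include <stdio.h>

#define BUDGET 64                       /* per-poll quota ("weight") */

static long rx_backlog = 1000000;       /* pretend traffic never stops */

/* stand-in for cleaning one rx descriptor */
static bool process_one_packet(void)
{
        if (rx_backlog == 0)
                return false;
        rx_backlog--;
        return true;
}

/* one napi poll with the old-style exit test: only leave polling mode
 * (and re-enable interrupts) when nothing at all was done */
static bool poll_once(void)
{
        int work_done = 0;

        while (work_done < BUDGET && process_one_packet())
                work_done++;

        return work_done != 0;  /* true = stay in napi mode, irqs masked */
}

int main(void)
{
        int polls = 0;

        /* under constant load work_done always hits BUDGET, so the
         * work_done == 0 exit never triggers: the cpu that owns the
         * poll keeps it, with interrupts still disabled */
        while (poll_once() && polls < 10)
                polls++;
        printf("after %d polls interrupts are still disabled\n", polls);
        return 0;
}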
>
>
>> I heard that if I bind the IRQ of one netdevice to one CPU I can get
>> 30% more performance... but I have 4 CPUs... I should get even more
>> performance if I echo "ffffffff" to smp_affinity.
>>
>
> only if your performance is cpu-horsepower limited rather than cache
> limited. You're sacrificing cache coherency for cpu power, but if that
> works for you then great.
>
>
>> The picture looks like this:
>> CPUs 0-3 get over 50% SI... bandwidth goes up... 55% SI... bandwidth
>> goes up... then 100% SI on CPU0...
>>
>> I remember a patch to fix a problem like this... it patched the
>> function e1000_clean... the kernel on this PC has that patch
>> (2.6.24-rc7-git2)... the e1000 driver works much better (I got up to
>> 1.5-2x the bandwidth before hitting 100% SI), but I think it is still
>> not getting 100% of what it can =)
>>
>
> the patch helps a little because it decreases the amount of time the
> driver spends in napi mode, basically relaxing the exit condition
> (which re-enables interrupts, and therefore balancing) to work_done <
> budget instead of work_done == 0.
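
The two exit tests side by side, as a simplified sketch (not the actual
patch text; "budget" is the per-poll quota):

#include <stdbool.h>
#include <stdio.h>

/* before the patch: stay in polling mode unless nothing at all was done */
static bool keep_polling_old(int work_done, int budget)
{
        (void)budget;
        return work_done != 0;
}

/* after the patch: drop back to interrupts (and to irq balancing) as
 * soon as a poll comes in under its quota */
static bool keep_polling_patched(int work_done, int budget)
{
        return work_done >= budget;
}

int main(void)
{
        int budget = 64, work_done = 40;        /* a partially used poll */

        printf("old rule keeps polling: %d\n",
               keep_polling_old(work_done, budget));
        printf("patched rule keeps polling: %d\n",
               keep_polling_patched(work_done, budget));
        return 0;
}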
>
>
>> Thanks for answers and sorry for my English
>>
>
> you basically can't get much more than one cpu can do for each nic. It's
> possible to get a little more, but my guess is you won't get much. The
> best thing you can do is make sure as much traffic as possible stays in
> the same cache, on two different cores.
>
> you can try turning off NAPI mode either in the .config, or build the
> sourceforge driver with CFLAGS_EXTRA=-DE1000_NO_NAPI, which seems
> counterintuitive, but with the non-napi e1000 pushing packets to the
> backlog queue on each cpu, you may actually get better performance due
> to the balancing.
>
> some day soon (maybe) we'll have some coherent way to have one tx and rx
> interrupt per core, and enough queues for each port to be able to handle
> 1 queue per core.
>
> good luck,
> Jesse
>
>