Date: Thu, 29 Apr 2010 22:02:19 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Stephen Hemminger <shemminger@...tta.com>,
Andi Kleen <ak@...goyle.fritz.box>, netdev@...r.kernel.org,
Andi Kleen <andi@...stfloor.org>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: OFT - reserving CPU's for networking
On Thursday, 29 April 2010 at 21:19 +0200, Thomas Gleixner wrote:
> Say thanks to Intel/AMD for providing us timers which stop in lower
> c-states.
>
> Not much we can do about the broadcast lock when several cores are
> going idle and we need to setup a global timer to work around the
> lapic timer stops in C2/C3 issue.
>
> Simply the C-state timer broadcasting does not scale. And it was never
> meant to scale. It's a workaround for laptops to have functional NOHZ.
>
> There are several ways to work around that on larger machines:
>
> - Restrict c-states
> - Disable NOHZ and highres timers
> - idle=poll is definitely the worst of all possible solutions
>
> > I keep getting asked about taking some cores away from clock and scheduler
> > to be reserved just for network processing. Seeing this kind of stuff
> > makes me wonder if maybe that isn't a half bad idea.
>
> This comes up every few months and we pointed out several times what
> needs to be done to make this work w/o these weird hacks which put a
> core offline and then start some magic undebuggable binary blob on it.
> We have not seen anyone working on this, but the "set cores aside and
> let them do X" idea seems to stick in people's heads.
>
> Seriously, that's not a solution. It's going to be some hacked up
> nightmare which is completely unmaintainable.
>
> Aside from that I seriously doubt that you can do networking w/o time
> and timers.
>
Thanks a lot!

Booting with processor.max_cstate=1 solves the problem.
(I already had a config with CONFIG_NO_HZ disabled, but with highres timers enabled.)
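For anyone wanting to confirm that the boot parameter actually took effect, here is a minimal sketch. It assumes a Linux box where /proc/cmdline is readable and where cpuidle exposes its states under /sys/devices/system/cpu/cpu0/cpuidle/ (that layout depends on the kernel version and config). The helper names are made up for illustration and are not part of any kernel tooling.

```python
from pathlib import Path


def parse_cmdline(cmdline):
    """Split a kernel command line into {key: value}; bare flags map to None.

    Simplification: quoted values containing spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def max_cstate_from_cmdline(cmdline):
    """Return processor.max_cstate as an int, or None if it was not passed."""
    value = parse_cmdline(cmdline).get("processor.max_cstate")
    return int(value) if value is not None else None


def read_text(path):
    """Read a small text file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text("/proc/cmdline") or ""
    print("processor.max_cstate on cmdline:", max_cstate_from_cmdline(cmdline))

    # Assumed sysfs layout; only present when cpuidle is enabled.
    cpuidle = Path("/sys/devices/system/cpu/cpu0/cpuidle")
    if cpuidle.is_dir():
        for state in sorted(cpuidle.glob("state*")):
            print(state.name, read_text(state / "name"))
```

If the limit is in place, the list of idle states printed for cpu0 should stop at the shallow ones; the exact names vary by platform and idle driver.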

Even with a _carefully_ chosen crazy configuration (receiving a packet on one
CPU, then transferring it to another CPU, with a full 16x16 matrix
involved), generating 700,000 IPIs per second on the machine seems fine
now.
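
A rough way to watch the IPI rate is to sample /proc/interrupts twice and diff the counters. The sketch below assumes the x86 layout, where the "RES" (rescheduling) and "CAL" (function call) rows carry one column per online CPU; which rows matter for a given workload is an assumption, and the function names are illustrative only.

```python
import time


def ipi_totals(interrupts_text, labels=("RES", "CAL")):
    """Sum the per-CPU counters of the selected /proc/interrupts rows."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        name, sep, rest = line.partition(":")
        name = name.strip()
        if not sep or name not in labels:
            continue
        # The first ncpus fields are counters; the rest is a description.
        counts = rest.split()[:ncpus]
        totals[name] = sum(int(c) for c in counts if c.isdigit())
    return totals


def rate_per_second(before, after, seconds):
    """Per-row interrupt rate between two samples taken `seconds` apart."""
    return {k: (after[k] - before.get(k, 0)) / seconds for k in after}


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as f:
            first = ipi_totals(f.read())
        time.sleep(1.0)
        with open("/proc/interrupts") as f:
            second = ipi_totals(f.read())
        print(rate_per_second(first, second, 1.0))
    except OSError:
        print("/proc/interrupts is not available on this system")
```

The one-second interval is arbitrary; a longer window smooths out bursts at the cost of a slower readout.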