[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070628143850.GA11780@ms2.inr.ac.ru>
Date: Thu, 28 Jun 2007 18:38:50 +0400
From: Alexey Kuznetsov <kuznet@....inr.ac.ru>
To: Ingo Molnar <mingo@...e.hu>
Cc: Jeff Garzik <jeff@...zik.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Steven Rostedt <rostedt@...dmis.org>,
LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Christoph Hellwig <hch@...radead.org>,
john stultz <johnstul@...ibm.com>,
Oleg Nesterov <oleg@...sign.ru>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Dipankar Sarma <dipankar@...ibm.com>,
"David S. Miller" <davem@...emloft.net>, matthew.wilcox@...com
Subject: Re: [RFC PATCH 0/6] Convert all tasklets to workqueues
Hello!
> the context-switch argument i'll believe if i see numbers. You'll
> probably need in excess of tens of thousands of irqs/sec to even be able
> to measure its overhead. (workqueues are driven by nice kernel threads
> so there's no TLB overhead, etc.)
It was authors of the patch who were supposed to give some numbers,
at least one or two, just to prove the concept. :-)
According to my measurements (maybe, wrong) on 2.5GHz P4 tasklet
schedule and execution eats ~300ns, workqueue eats ~4usec.
On my 1.8GHz PM notebook (UP kernel), the numbers are 170ns and 1.2usec.
Formally looking awful, this result is positive: tasklets are almost
never used in hot paths. I am sure only about one such place: acenic
driver uses tasklet to refill rx queue. This generates not more than
3000 tasklet schedules per second. Even on P4 it pure workqueue schedule
will eat ~1% of bare cpu ticks.
Anyway, all the uses of tasklet should be verified:
The most dubios place is popular Neterion 10Gbit driver, which uses
tasklet like acenic. But at 10Gbit, multiply acenic numbers and panic. :-)
Also, there exists some hardware which uses tasklets even harder,
but I have no idea what real frequencies are: f.e. sundance.
The case with acenic/s2io is quite special: normally network drivers
refill queues in irq handlers. It was Jes Sorensen observation
that offloading refilling from irq improves performance, I do not
remember numbers. Probably, switching to workqueues will not affect
performance at all, probably it will just collapse, no idea.
> ... workqueues are also possibly much more scalable
I cannot figure out - scale in what direction? :-)
> (percpu workqueues
> are easy without changing anything in your code but the call where you
> create the workqueue).
I do not see how it is related to scalability. And the statement
does not even make sense. The patch already uses per-cpu workqueue
for tasklets, otherwise it would be a disaster: guaranteed cpu non-locality.
Tasklet is single thread by definition and purpose. Those a few places
where people used tasklets to do per-cpu jobs (RCU f.e.) exist just because
they had troubles with allocating new softirq. Workqueues do not make
any difference: tasklet is not workqueue, it is work_struct, and you
still will have to allocate array of per-cpu work structs, everything
remains the same.
> the only remaining argument is latency:
You could set realtime prioriry by default, not a poor nice -5.
If some network adapters were killed just because I run some task
with nice --22, it would be just ridiculous.
Alexey
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists