[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4A55E962.2010004@cn.fujitsu.com>
Date: Thu, 09 Jul 2009 20:58:10 +0800
From: Lai Jiangshan <laijs@...fujitsu.com>
To: Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
CC: Zhaolei <zhaolei@...fujitsu.com>,
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
Steven Rostedt <rostedt@...dmis.org>,
linux-kernel@...r.kernel.org, fweisbec@...il.com,
Li Zefan <lizf@...fujitsu.com>,
Xiao Guangrong <xiaoguangrong@...fujitsu.com>
Subject: Re: [PATCH v4] ftrace: add a tracepoint for __raise_softirq_irqoff()
Ingo Molnar wrote:
> * Lai Jiangshan <laijs@...fujitsu.com> wrote:
>
>> Ingo Molnar wrote:
>>> * Zhaolei <zhaolei@...fujitsu.com> wrote:
>>>
>>>> * From: "Xiao Guangrong" <xiaoguangrong@...fujitsu.com>
>>>>> Steven Rostedt wrote:
>>>>>> On Thu, 14 May 2009, Mathieu Desnoyers wrote:
>>>>>>>> From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
>>>>>>>>
>>>>>>>> This patch is modified from Mathieu Desnoyers' patch. The original patch
>>>>>>>> can be found here:
>>>>>>>> http://marc.info/?l=linux-kernel&m=123791201816245&w=2
>>>>>>>> This tracepoint can trace the time stamp when softirq action is raised.
>>>>>>>>
>>>>>>>> Changelog for v1 -> v2:
>>>>>>>> 1: Use TRACE_EVENT instead of DEFINE_TRACE
>>>>>>>> 2: Move the tracepoint from raise_softirq_irqoff() to
>>>>>>>> __raise_softirq_irqoff()
>>>>>>>>
>>>>>>>> Changelog for v2 -> v3:
>>>>>>>> Move the definition of __raise_softifq_irqoff() to .c file when
>>>>>>>> CONFIG_TRACEPOINTS is enabled, to avoid recursive includes
>>>>>>>>
>>>>>>>> Changelog for v3 -> v4:
>>>>>>>> 1: Come back to v2, and use forward declarations to avoid
>>>>>>>> recursive includes as Mathieu's suggestion
>>>>>>>> 2: Modifiy the tracepoint name
>>>>>>>> 3: Add comments for this tracepoint
>>>>>>>>
>>>>>>> This is a step in the right direction, but please see my email to Lai
>>>>>>> about the fact that this assumes correct and undocumented include
>>>>>>> dependencies in kernel/trace/events.c. Not explicitely stating the
>>>>>>> include dependencies is a build error waiting to happen.
>>>>>>>
>>>>>>> Including interrupt.h under a ifdef would allow keeping track of
>>>>>>> TRACE_EVENT specific build dependencies neatly on a per header basis.
>>>>>> This is all moot, the events.c file no longer exists and as not an issue.
>>>>>>
>>>>> As Steve's says, use ftrace in ftrace.h not in events.c now.
>>>>> So, this mistake does not exist.
>>>>> Dose this patch has other error? I expect for your views.
>>>>>
>>>>> Thanks for your review, is great help to me. ;-)
>>>> Hello,
>>>>
>>>> It seems Mathieu has no other comments on this patch now.
>>>> Ingo, what is your opinion on this patch?
>>> There's a complication: this area of the softirq code needs fixes
>>> (unrelated to tracing).
>>>
>>> This API:
>>>
>>> inline void raise_softirq_irqoff(unsigned int nr)
>>> {
>>> __raise_softirq_irqoff(nr);
>>>
>>> /*
>>> * If we're in an interrupt or softirq, we're done
>>> * (this also catches softirq-disabled code). We will
>>> * actually run the softirq once we return from
>>> * the irq or softirq.
>>> *
>>> * Otherwise we wake up ksoftirqd to make sure we
>>> * schedule the softirq soon.
>>> */
>>> if (!in_interrupt())
>>> wakeup_softirqd();
>>> }
>>>
>>> is broken with RT tasks (as recently reported to lkml), as when a
>>> real-time task wakes up ksoftirqd (which has lower priority) it wont
>>> execute and we starve softirq execution.
>>>
>>> The proper solution would be to have a new API:
>>>
>>> raise_softirq_check()
>>>
>>> and to remove the wakeup_softirqd() hack from raise_softirq_irqoff()
>>> - and put raise_softirq_check() to all places that use
>>> raise_softirq*() from process context.
>>>
>>> raise_softirq_check() would execute softirq handlers from process
>>> context, if there's any pending ones. It has to be called outside of
>>> bh critical sections - i.e. often a bit after the raise_softirq()
>>> has been done.
>>>
>>> __raise_softirq_irqoff() would be made private to kernel/softirq.c,
>>> and we'd only have two public APIs to trigger softirqs:
>>> raise_softirq() and raise_softirq_irqoff(). Both just set the
>>> pending flag and dont do any wakeup.
>>>
>>> As a side-effect of these fixes, the tracepoints will be sorted out
>>> as well - there wont be any need to hack into
>>> __raise_softirq_irqoff().
>>>
>>> Ingo
>>>
>> Hi, Ingo
>>
>> Could you give me the .config file that this bug occurs?
>
> I havent reproduced this - it was reported in a recent thread on
> lkml. I dont have an URI for it - does anyone on the Cc: remember
> the details?
>
> Ingo
>
>
>
I haven't found the email you referred, I've searched it from
when you pointed out this softirq problem till now.
If anyone know it, please let me know.
If my analysis is correct, I will say this problem does NOT exist:
(non-PREEMPT_RT)
1) the (outmost) interrupt will call do_softirq() when it exits.
softirq will not be starved when interrupts occur.
2) the clocks will provide tick-interrupt periodic.
3) even when CONFIG_NO_HZ=y, the cpu still provide periodic tick
when current task is RT-task(which is non-idle task).
So any softirq will be called in time(one jiffy). Maybe my analysis is
incorrect or maybe the RT task disables the irq very long time or maybe
there is other bug in kernel which has influence and impacts softirq.
A) I will be still finding the email or the .config and probe
this problem. The endeavors to ensure all softirq handled
are not useful and it may not solve the problem. Because
the tick-interrupt does ensure pending softirq handled.
B) If my thinking is correct, it's the best that we insert
tracepoints at raise_softirq() functions, it's very helpful
and one of its benefit is help us to probe problems
which happen here.
Lai
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists