Message-Id: <200806091911.21850.mitov@issp.bas.bg>
Date: Mon, 9 Jun 2008 19:11:21 +0300
From: Marin Mitov <mitov@...p.bas.bg>
To: Ingo Molnar <mingo@...e.hu>
Cc: LKML <linux-kernel@...r.kernel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-rt-users <linux-rt-users@...r.kernel.org>, akpm@...l.org,
Clark Williams <clark.williams@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
"Luis Claudio R. Goncalves" <lclaudio@...g.org>,
Gregory Haskins <ghaskins@...ell.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andi Kleen <andi-suse@...stfloor.org>
Subject: Re: [PATCH][resubmit] x86: enable preemption in delay
On Monday 09 June 2008 03:13:09 pm Ingo Molnar wrote:
>
> * Marin Mitov <mitov@...p.bas.bg> wrote:
>
> > Hi all,
> >
> > This is a resubmit of the patch proposed by Ingo and me a few months ago:
> >
> > http://lkml.org/lkml/2007/11/20/343
> >
> > It was in -mm for a while and then removed, supposedly to be moved into
> > mainline, but it never appeared there.
>
> hm, we've got this overlapping commit in -tip for the same general
> problem:
>
> | # x86/urgent: e71e716: x86: enable preemption in delay
> |
> | From: Steven Rostedt <rostedt@...dmis.org>
> | Date: Sun, 25 May 2008 11:13:32 -0400
> | Subject: [PATCH] x86: enable preemption in delay
>
> i think Thomas had a concern with the original fix - forgot the details.
Here they are:
http://lkml.org/lkml/2008/5/25/251
but see my comment in reply to it.
>
> Ingo
There is no fundamental difference between the two patches. I have seen that
Steven's one is merged in linux-2.6.26-rc5. The only difference (if it matters at
all) is that in my patch the preempt_disable()/preempt_enable() sections are
shorter and protect only the code that must be protected:
preempt_disable()
rdtscl()
smp_processor_id()
preempt_enable()
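
For illustration only, here is a rough sketch (not the actual patch from the
lkml.org link above - see that for the real code) of how such narrower
preempt_disable()/preempt_enable() sections could be arranged, so that only
the paired TSC read and CPU id lookup run with preemption off; the local
variables prev and this_cpu are just helpers for the sketch:

static void delay_tsc(unsigned long loops)
{
	unsigned long bclock, prev, now;
	int cpu, this_cpu;

	/* Pair the TSC read with the CPU it was taken on. */
	preempt_disable();
	rdtscl(bclock);
	cpu = smp_processor_id();
	preempt_enable();

	prev = bclock;
	for (;;) {
		/* Take a new consistent (timestamp, cpu) sample. */
		preempt_disable();
		rdtscl(now);
		this_cpu = smp_processor_id();
		preempt_enable();

		if (unlikely(this_cpu != cpu)) {
			/*
			 * We migrated.  'prev' is the last TSC value read on
			 * the old CPU, so (prev - bclock) is the time already
			 * waited there.  Credit it and re-baseline on the new
			 * CPU's TSC; any time lost around the migration only
			 * makes the delay longer, never shorter.
			 */
			loops -= prev - bclock;
			cpu = this_cpu;
			bclock = now;
		}

		if ((now - bclock) >= loops)
			break;

		prev = now;
		rep_nop();	/* fully preemptible between samples */
	}
}

Compared with the merged version, preemption stays enabled even across the
TSC comparison itself, at the cost of one extra variable to remember the last
sample taken on the previous CPU.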
Since Steven's patch is already merged - let it be :-)
Regards
Marin Mitov
>
> ------------------>
> commit e71e716c531557308895598002bee24c431d3be1
> Author: Steven Rostedt <rostedt@...dmis.org>
> Date: Sun May 25 11:13:32 2008 -0400
>
> x86: enable preemption in delay
>
> The RT team has been searching for a nasty latency. This latency shows
> up out of the blue and has been seen to be as big as 5ms!
>
> Using ftrace I found the cause of the latency.
>
> pcscd-2995 3dNh1 52360300us : irq_exit (smp_apic_timer_interrupt)
> pcscd-2995 3dN.2 52360301us : idle_cpu (irq_exit)
> pcscd-2995 3dN.2 52360301us : rcu_irq_exit (irq_exit)
> pcscd-2995 3dN.1 52360771us : smp_apic_timer_interrupt (apic_timer_interrupt)
> pcscd-2995 3dN.1 52360771us : exit_idle (smp_apic_timer_interrupt)
>
> Here's an example of a 400 us latency. pcscd took a timer interrupt and
> returned with "need resched" enabled, but did not reschedule until after
> the next interrupt came in at 52360771us, 400us later!
>
> At first I thought we somehow missed a preemption check in entry.S. But
> I also noticed that this always seemed to happen during a __delay call.
>
> pcscd-2995 3dN.2 52360836us : rcu_irq_exit (irq_exit)
> pcscd-2995 3.N.. 52361265us : preempt_schedule (__delay)
>
> Looking at the x86 delay, I found my problem.
>
> In git commit 35d5d08a085c56f153458c3f5d8ce24123617faf, Andrew Morton
> placed preempt_disable around the entire delay due to TSC's not working
> nicely on SMP. Unfortunately for those that care about latencies this
> is devastating! Especially when we have callers to mdelay(8).
>
> Here I enable preemption during the loop and account for anytime the task
> migrates to a new CPU. The delay asked for may be extended a bit by
> the migration, but delay only guarantees that it will delay for that minimum
> time. Delaying longer should not be an issue.
>
> [
> Thanks to Thomas Gleixner for spotting that cpu wasn't updated,
> and for suggesting that rep_nop be placed between preempt_enable/preempt_disable.
> ]
>
> Signed-off-by: Steven Rostedt <srostedt@...hat.com>
> Cc: akpm@...l.org
> Cc: Clark Williams <clark.williams@...il.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: "Luis Claudio R. Goncalves" <lclaudio@...g.org>
> Cc: Gregory Haskins <ghaskins@...ell.com>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Andi Kleen <andi-suse@...stfloor.org>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
>
> diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay_32.c
> index 4535e6d..d710f2d 100644
> --- a/arch/x86/lib/delay_32.c
> +++ b/arch/x86/lib/delay_32.c
> @@ -44,13 +44,36 @@ static void delay_loop(unsigned long loops)
> static void delay_tsc(unsigned long loops)
> {
> unsigned long bclock, now;
> + int cpu;
>
> - preempt_disable(); /* TSC's are per-cpu */
> + preempt_disable();
> + cpu = smp_processor_id();
> rdtscl(bclock);
> - do {
> - rep_nop();
> + for (;;) {
> rdtscl(now);
> - } while ((now-bclock) < loops);
> + if ((now - bclock) >= loops)
> + break;
> +
> + /* Allow RT tasks to run */
> + preempt_enable();
> + rep_nop();
> + preempt_disable();
> +
> + /*
> + * It is possible that we moved to another CPU, and
> + * since TSC's are per-cpu we need to calculate
> + * that. The delay must guarantee that we wait "at
> + * least" the amount of time. Being moved to another
> + * CPU could make the wait longer but we just need to
> + * make sure we waited long enough. Rebalance the
> + * counter for this CPU.
> + */
> + if (unlikely(cpu != smp_processor_id())) {
> + loops -= (now - bclock);
> + cpu = smp_processor_id();
> + rdtscl(bclock);
> + }
> + }
> preempt_enable();
> }
>
> diff --git a/arch/x86/lib/delay_64.c b/arch/x86/lib/delay_64.c
> index bbc6105..4c441be 100644
> --- a/arch/x86/lib/delay_64.c
> +++ b/arch/x86/lib/delay_64.c
> @@ -31,14 +31,36 @@ int __devinit read_current_timer(unsigned long *timer_value)
> void __delay(unsigned long loops)
> {
> unsigned bclock, now;
> + int cpu;
>
> - preempt_disable(); /* TSC's are pre-cpu */
> + preempt_disable();
> + cpu = smp_processor_id();
> rdtscl(bclock);
> - do {
> - rep_nop();
> + for (;;) {
> rdtscl(now);
> + if ((now - bclock) >= loops)
> + break;
> +
> + /* Allow RT tasks to run */
> + preempt_enable();
> + rep_nop();
> + preempt_disable();
> +
> + /*
> + * It is possible that we moved to another CPU, and
> + * since TSC's are per-cpu we need to calculate
> + * that. The delay must guarantee that we wait "at
> + * least" the amount of time. Being moved to another
> + * CPU could make the wait longer but we just need to
> + * make sure we waited long enough. Rebalance the
> + * counter for this CPU.
> + */
> + if (unlikely(cpu != smp_processor_id())) {
> + loops -= (now - bclock);
> + cpu = smp_processor_id();
> + rdtscl(bclock);
> + }
> }
> - while ((now-bclock) < loops);
> preempt_enable();
> }
> EXPORT_SYMBOL(__delay);
>