Message-ID: <20070530131458.GA5432@atjola.homenet>
Date: Wed, 30 May 2007 15:14:58 +0200
From: Björn Steinbrink <B.Steinbrink@....de>
To: Eric Dumazet <dada1@...mosbay.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Michal Piotrowski <michal.k.k.piotrowski@...il.com>,
Ian Kumlien <iank@...dband.net>, Linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...e.hu>,
Arjan van de Ven <arjan@...radead.org>
Subject: Re: [BUG] Something goes wrong with timer statistics.

On 2007.05.30 14:44:49 +0200, Eric Dumazet wrote:
> On Wed, 30 May 2007 13:38:25 +0200
> Björn Steinbrink <B.Steinbrink@....de> wrote:
>
> > On 2007.05.30 00:38:08 +0200, Thomas Gleixner wrote:
> > > On Wed, 2007-05-30 at 00:24 +0200, Michal Piotrowski wrote:
> > > > On 29/05/07, Ian Kumlien <iank@...dband.net> wrote:
> > > > > Hi,
> > > > >
> > > > > As the daystar sets, I try to play some with my new would-be
> > > > > firewall/server, but since this will be running for quite some time I
> > > > > have been experimenting with powertop to find out what I can do to limit
> > > > > its power usage.
> > > > >
> > > > > But, if I run powertop for too long or a few times too many... this
> > > > > happens:
> > > > > http://pomac.netswarm.net/pics/kernel_panic.jpg
> > >
> > > This was reported before. It's incredibly hard to reproduce.
> >
> > OK, second try, this time with a patch. In timer_stats_update_stats,
> > input is allocated on the stack, so it is uninitialized, in particular
> > the "next" field is random. Now in tstat_lookup, the new entry "curr" is
> > initialized with the values from "input" (passed as "entry") and "next"
> > is set to NULL _after_ it is added to the list, so if a second CPU is
> > running the fastpath lookup while we're inserting the new element, it
> > might get the garbage value in "next". The patch below fixes that.
> >
> > Björn
> >
> >
> >
> > Initialize the "next" field of a timer stats entry before it is inserted
> > into the list to avoid a race with the fastpath lookup.
> >
> > Signed-off-by: Björn Steinbrink <B.Steinbrink@....de>
> > ---
> > diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
> > index 868f1bc..5bc8f91 100644
> > --- a/kernel/time/timer_stats.c
> > +++ b/kernel/time/timer_stats.c
> > @@ -202,12 +202,12 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
> >  	if (curr) {
> >  		*curr = *entry;
> >  		curr->count = 0;
> > +		curr->next = NULL;
> >  		memcpy(curr->comm, comm, TASK_COMM_LEN);
> >  		if (prev)
> >  			prev->next = curr;
> >  		else
> >  			*head = curr;
> > -		curr->next = NULL;
> >  	}
> >  out_unlock:
> >  	spin_unlock(&table_lock);
> >
>
> Your analysis might be right, not the fix.
>
> You *cannot* assume curr->next = NULL will be done before insert.
>
> You probably also need a memory barrier.

Ehrm, right. I somehow thought the spinlock was enough to satisfy
that constraint, but if that were the case, the whole problem wouldn't
exist in the first place. D'oh!

Thanks,
Björn



Initialize the "next" field of a timer stats entry before it is inserted
into the list to avoid a race with the fastpath lookup.

Thanks to Eric Dumazet for reminding me of the memory barrier.

Signed-off-by: Björn Steinbrink <B.Steinbrink@....de>
---
diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
index 868f1bc..ab0ba6c 100644
--- a/kernel/time/timer_stats.c
+++ b/kernel/time/timer_stats.c
@@ -202,12 +202,15 @@ static struct entry *tstat_lookup(struct entry *entry, char *comm)
 	if (curr) {
 		*curr = *entry;
 		curr->count = 0;
+		curr->next = NULL;
 		memcpy(curr->comm, comm, TASK_COMM_LEN);
+
+		smp_mb(); /* Ensure that curr is initialized before insert */
+
 		if (prev)
 			prev->next = curr;
 		else
 			*head = curr;
-		curr->next = NULL;
 	}
 out_unlock:
 	spin_unlock(&table_lock);
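For anyone who wants to see the ordering requirement outside the kernel, here is a rough userspace sketch of the same "initialize, then publish" pattern. It assumes C11 atomics as a stand-in for the kernel barrier primitives. The names publish() and lookup() and the trimmed struct are made up for illustration; this is not the timer_stats code. The point is only that every field of the new entry, including "next", has to be visible before the pointer that makes the entry reachable, because the lockless reader never takes the writers' lock.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define COMM_LEN 16	/* hypothetical, stands in for TASK_COMM_LEN */

/* Trimmed, illustrative entry; not the kernel's struct entry. */
struct entry {
	_Atomic(struct entry *) next;
	unsigned long count;
	char comm[COMM_LEN];
};

static _Atomic(struct entry *) head;

/*
 * Writer side. The caller is assumed to hold whatever lock serializes
 * writers; that lock does nothing for readers that do not take it.
 */
static void publish(struct entry *curr, const char *comm)
{
	struct entry *prev = NULL, *e;

	/* Fully initialize the entry first, "next" included. */
	curr->count = 0;
	atomic_store_explicit(&curr->next, NULL, memory_order_relaxed);
	strncpy(curr->comm, comm, COMM_LEN - 1);
	curr->comm[COMM_LEN - 1] = '\0';

	/* Find the current tail (writers are serialized, relaxed is fine). */
	for (e = atomic_load_explicit(&head, memory_order_relaxed); e;
	     e = atomic_load_explicit(&e->next, memory_order_relaxed))
		prev = e;

	/*
	 * Release store: all writes above become visible before the
	 * pointer that makes curr reachable from the list.
	 */
	if (prev)
		atomic_store_explicit(&prev->next, curr, memory_order_release);
	else
		atomic_store_explicit(&head, curr, memory_order_release);
}

/* Lockless reader, analogous to the fastpath lookup. */
static struct entry *lookup(const char *comm)
{
	struct entry *e = atomic_load_explicit(&head, memory_order_acquire);

	while (e) {
		if (strcmp(e->comm, comm) == 0)
			return e;
		e = atomic_load_explicit(&e->next, memory_order_acquire);
	}
	return NULL;
}
```

In this sketch the release store does the job of the barrier placed before the insert. smp_mb() is a full barrier, so it orders at least as much as the release store used here.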