Message-ID: <20150213185328.GA19746@redhat.com>
Date: Fri, 13 Feb 2015 19:53:28 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Nicholas Mc Guire <der.herr@...r.at>
Cc: Davidlohr Bueso <dave@...olabs.net>, paulmck@...ux.vnet.ibm.com,
linux-kernel@...r.kernel.org, waiman.long@...com,
peterz@...radead.org, raghavendra.kt@...ux.vnet.ibm.com
Subject: Re: BUG: spinlock bad magic on CPU#0, migration/0/9
On 02/13, Nicholas Mc Guire wrote:
>
> On Thu, 12 Feb 2015, Oleg Nesterov wrote:
>
> > Nicholas, sorry, I sent the patch but forgot to CC you.
> > See https://lkml.org/lkml/2015/2/12/587
> >
> > And please note that "completion" was specially designed to guarantee
> > that complete() can't play with this memory after wait_for_completion/etc
> > returns.
> >
>
> hmmm.... I guess that "falling out of context" can happen in a number of cases
> with completion - any of the timeout/interruptible variants e.g:
>
> void xxx(void)
> {
> 	struct completion c;
>
> 	init_completion(&c);
>
> 	expose_this_completion(&c);
>
> 	wait_for_completion_timeout(&c, A_FEW_JIFFIES);
> }
>
> and if the other side did not call complete() within A_FEW_JIFFIES then
> it would result in the same failure - I don't think the API can prevent
> this type of bug.
Yes, sure, but in this case the user of wait_for_completion_timeout() has only
itself to blame; it is simply buggy.
> It has to be ensured by additional locking
Yes, but
> drivers/misc/tifm_7xx1.c:tifm_7xx1_resume() resolves this issue by resetting
> the completion to NULL and testing for !NULL before calling complete()
> with appropriate locking protecting access.
I don't understand this code and I can easily be wrong, but at first glance it
doesn't need a completion at all, exactly because it relies on the additional
fm->lock. ->finish_me could be "task_struct *", tifm_7xx1_resume() could
simply do schedule_timeout(), and tifm_7xx1_isr() could do wake_up_process().
Never mind, this is off-topic and most probably I misread this code.
> Nevertheless, of course, the proposed change in completion_done() was a bug -
> many thanks for catching that so quickly!
OK, perhaps you can ack the fix I sent?
Oleg.