[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150828160637.GA4393@redhat.com>
Date: Fri, 28 Aug 2015 18:06:37 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Michal Hocko <mhocko@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
David Howells <dhowells@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: wake_up_process implied memory barrier clarification
On 08/28, Michal Hocko wrote:
>
> On Thu 27-08-15 20:26:54, Oleg Nesterov wrote:
> > On 08/27, Michal Hocko wrote:
> > >
> > > --- a/Documentation/memory-barriers.txt
> > > +++ b/Documentation/memory-barriers.txt
> > > @@ -2031,6 +2031,9 @@ something up. The barrier occurs before the task state is cleared, and so sits
> > > <general barrier> STORE current->state
> > > LOAD event_indicated
> > >
> > > +Please note that wake_up_process is an exception here because it implies
> > > +the write memory barrier unconditionally.
> > > +
> >
> > I simply can't understand (can't even parse) this part of memory-barriers.txt.
>
> Do you mean the added text or the example above it?
Both ;)
but note that "load from X might see 0" is true of course, and in this
sense wake_up_process() is not exception:
X = 1;
wmb(); // unless I am totally confused this just adds more confusion
Y = 1;
wake_up_process(TASK);
vs TASK doing
for (;;) {
set_current_state(...);
if (Y)
break;
schedule();
}
BUG_ON(X == 0)
is not correct, it can actually can hit the BUG_ON() above. However, if
wake_up_process() actually wakes a sleeping TASK up, then it should also
see X = 1. Even without wmb(), even if we do
Y = 1;
X = 1;
wake_up_process(TASK);
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -1967,8 +1967,7 @@ static void try_to_wake_up_local(struct task_struct *p)
> > > *
> > > * Return: 1 if the process was woken up, 0 if it was already running.
> > > *
> > > - * It may be assumed that this function implies a write memory barrier before
> > > - * changing the task state if and only if any tasks are woken up.
> > > + * It may be assumed that this function implies a write memory barrier.
> > > */
> >
> > I won't argue, technically this is correct of course. And I agree that
> > the old comment is misleading.
>
> Well the reason I've noticed is the following race in the scsi code
> CPU0 CPU1
> scsi_error_handler scsi_host_dev_release
> kthread_stop()
> while (!kthread_should_stop()) {
> set_bit(KTHREAD_SHOULD_STOP)
> wake_up_process()
> wait_for_completion()
>
> set_current_state(TASK_INTERRUPTIBLE)
> schedule()
Heh. This looks like a common mistake. See fecdf8be2d91e04b0a9a4f79ff06499 ;)
But I believe this is another thing.
> I have read the comment for wake_up_process and was wondering that
> moving set_current_state before kthread_should_stop wouldn't be enough
> because the the task at CPU0 might be TASK_RUNNIG and so wake_up_process
> wouldn't wake up it and the missing write barrier could lead to a missed
> KTHREAD_SHOULD_STOP.
And that is why try_to_wake_up()->smp_mb__before_spinlock() needs to
serialize STORE(CONDITION) and the subsequent LOAD(p->state). The fact
that it actually does wmb() is just implementation detail, that is what
I tried to say.
> > To me, this comment should just explain that this function implies a barrier
> > but only in a sense that you do not need another one after CONDITION = T and
> > before wake_up_process().
>
> I have no objection against more precise wording here but what we have is just
> misleading.
Yes, yes, I agree. Just I do not know what exactly it should document.
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists