[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070806153655.GA245@tv-sign.ru>
Date: Mon, 6 Aug 2007 19:36:55 +0400
From: Oleg Nesterov <oleg@...sign.ru>
To: Gregory Haskins <ghaskins@...ell.com>
Cc: Daniel Walker <dwalker@...sta.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...e.hu>, linux-rt-users@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] RT: Add priority-queuing and priority-inheritance to workqueue infrastructure
On 08/06, Gregory Haskins wrote:
>
> On Mon, 2007-08-06 at 18:26 +0400, Oleg Nesterov wrote:
>
> > Immediately, A inserts the work on CPU 1.
>
> Well, if you didn't care about which CPU, that's true. But suppose we
> want to direct this specifically at CWQ for cpu 0.
please see below...
> > It can livelock if other higher-priority threads add works to this wq
> > repeteadly.
>
> That's really starvation, though. I agree with you that it is
> technically possible to happen if the bandwidth of the higher priority
> producer task is greater than the bandwidth of the consumer. But note
> that the RT priorities already forgo starvation avoidance, so IMO the
> queue can here as well. E.g. if the system designer runs a greedy app
> at a high RT priority, it can starve as many lower priority items as it
> wants anyway. This is no different.
OK.
> > > > Actually a niced (so that its priority is lower than cwq->thread's one)
> > > > can deadlock. Suppose it does
> > > >
> > > > lock(LOCK);
> > > > flush_workueue(wq);
> > > >
> > > > and we have a pending work_struct which does:
> > > >
> > > > void work_handler(struct work_struct *self)
> > > > {
> > > > if (!try_lock(LOCK)) {
> > > > // try again later...
> > > > queue_work(wq, self);
> > > > return;
> > > > }
> > > >
> > > > do_something();
> > > > }
> > > >
> > > > Deadlock.
> > >
> > > That code is completely broken, so I don't think it matters much. But
> > > regardless, the new API changes will address that.
> >
> > Sorry. This code is not very nice, but it is correct currently.
>
> Well, the "trylock+requeue" avoids the obvious recursive deadlock, but
> it introduces a more subtle error: the reschedule effectively bypasses
> the flush.
this is OK, flush_workqueue() should only care about work_struct's that are
already queued.
> E.g. whatever work was being flushed was allowed to escape
> out from behind the barrier. If you don't care about the flush working,
> why do it at all?
The caller of flush_workueue() doesn't necessary know we have such a work
on list. It just wants to flush its own works.
> But like I said, its moot...we will fix that condition at the API level
> (or move away from overloading the workqueues)
Oh, good :)
> > Well, if we can use smp_call_() we don't need these complications?
>
> With an RT99 or smp_call() solution, all invocations effectively preempt
> whatever is running on the target CPU. That is why I said it was the
> opposite problem. A low priority client can preempt a high-priority
> task, which is just as bad as the current situation.
Aha, now I see what another problem you are trying to solve. I had a false
impression that might_sleep() is the issue.
After reading the couple of Peter's emails, I guess I am starting to
understand another issue. RT has irq threads, and they have different
priorities. So, in that case I agree, it is natural to consider the work
which was queued from the higher-priority irq thread as "more important".
Yes?
Oleg.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists