Message-ID: <87ve2gc1bn.fsf@denkblock.local>
Date: Thu, 17 Apr 2008 10:50:20 +0200
From: Elias Oltmanns <eo@...ensachen.de>
To: Tejun Heo <htejun@...il.com>,
James Bottomley <James.Bottomley@...senPartnership.com>,
Jens Axboe <jens.axboe@...cle.com>
Cc: linux-ide@...r.kernel.org, linux-scsi@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: Prevent busy looping

Jens Axboe <jens.axboe@...cle.com> wrote:
> On Thu, Apr 17 2008, Elias Oltmanns wrote:
>> Jens Axboe <jens.axboe@...cle.com> wrote:
>> > On Wed, Apr 16 2008, Elias Oltmanns wrote:
>> >> blk_run_queue() as well as blk_start_queue() plug the device on reentry
>> >> and schedule blk_unplug_work() right afterwards. However,
>> >> blk_plug_device() takes care of that already and makes sure that there is
>> >> a short delay before blk_unplug_work() is scheduled. This is important
>> >> to prevent busy looping and possibly system lockups as observed here:
>> >> <http://permalink.gmane.org/gmane.linux.ide/28351>.
>> >
>> > If you call blk_start_queue() and blk_run_queue(), you better mean it.
>> > There should be no delay. The only reason it does blk_plug_device() is
>> > so that the work queue function will actually do some work.
>>
>> Well, I'm mainly concerned with blk_run_queue(). In a comment it says
>> that it should recurse only once so as not to overrun the stack. On my
>> machine, however, immediate rescheduling may have exactly as disastrous
>> consequences as an overrunning stack would have since the system locks
>> up completely.
>>
>> Just to get this straight: Are low level drivers allowed to rely on
>> blk_run_queue() that there will be no loops or do they have to make sure
>> that this function is not called from the request_fn() of the same
>> queue?
>
> It's not really designed for being called recursively. Which isn't the
> problem imo, the problem is SCSI apparently being dumb and calling
> blk_run_queue() all the time. blk_run_queue() must run the queue NOW. If
> SCSI wants something like 'run the queue in a bit', it should use
> blk_plug_device() instead.

James would probably argue that this is alright as long as
max_device_blocked and max_host_blocked are bigger than one.
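
To illustrate that argument, here is a toy user-space model, not the
actual SCSI code. struct toy_dev and the toy_* functions are made up for
this sketch, and the countdown behaviour is only an assumption about how
such a blocked counter is consumed: each attempt to run the queue
decrements it, and only a non-zero result leads to a delayed (plugged)
retry.

```c
#include <assert.h>

/* Hypothetical stand-in for a per-device "blocked" countdown. */
struct toy_dev {
    int blocked;    /* remaining countdown; 0 means not blocked */
};

enum toy_action { TOY_DISPATCH, TOY_PLUG };

/* Called once per attempt to run the queue (assumed behaviour). */
static enum toy_action toy_queue_ready(struct toy_dev *d)
{
    if (d->blocked) {
        if (--d->blocked == 0)
            return TOY_DISPATCH;    /* countdown expired: retry right away */
        return TOY_PLUG;            /* still blocked: retry after a delay */
    }
    return TOY_DISPATCH;
}

/* Count the delayed retries seen after the counter was set to max_blocked. */
static int toy_deferrals(int max_blocked)
{
    struct toy_dev d = { .blocked = max_blocked };
    int plugs = 0;

    assert(max_blocked >= 0);
    while (toy_queue_ready(&d) == TOY_PLUG)
        plugs++;
    return plugs;
}
```

In this model a maximum of one yields no delayed retry at all, so the
queue is rerun immediately, whereas any larger value inserts at least
one delay.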
>
>> > In the newer kernels we just do:
>> >
>> > set_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags);
>> > kblockd_schedule_work(q, &q->unplug_work);
>> >
>> > instead, which is much better.
>>
>> Only as long as it doesn't get called from the request_fn() of the same
>> queue. Otherwise, there may be no chance for other threads to clear the
>> condition that caused blk_run_queue() to be called in the first place.
>
> Broken usage.

Right. Tejun, would it be possible to apply the patch below (2.6.25) or
do you see any alternative?
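
For the record, the busy loop discussed above can be pictured with the
following toy model. It is a single-threaded user-space simulation, not
kernel code. toy_simulate() and its parameters are hypothetical, and it
assumes that the queue work always wins the CPU whenever it is due, so
that the thread which has to clear the blocking condition only makes
progress on ticks where no queue work is pending.

```c
/*
 * resched_delay:     ticks between a failed run and the next queue run
 *                    (0 models immediate rescheduling from request_fn()).
 * idle_ticks_needed: idle ticks the other thread needs to clear the
 *                    condition that keeps the queue from making progress.
 * max_ticks:         simulation budget.
 *
 * Returns the number of queue runs until a request can be dispatched,
 * or -1 if the condition never clears within the budget.
 */
static int toy_simulate(int resched_delay, int idle_ticks_needed, int max_ticks)
{
    int next_run = 0;     /* tick at which the queue work is next due */
    int idle_done = 0;    /* progress made by the other thread */
    int runs = 0;

    for (int tick = 0; tick < max_ticks; tick++) {
        if (tick >= next_run) {
            runs++;                              /* queue work runs */
            if (idle_done >= idle_ticks_needed)
                return runs;                     /* condition cleared */
            next_run = tick + 1 + resched_delay; /* still busy: run again */
        } else {
            idle_done++;                         /* other thread gets the CPU */
        }
    }
    return -1;
}
```

With resched_delay set to 0 the other thread never gets a tick and the
function returns -1 for any budget; with a delay of 3 ticks and 2 idle
ticks needed, the second queue run succeeds.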
Regards,
Elias
[Attachment: "adjust-blocked-counters.patch", type text/x-patch, 822 bytes]