linux-kernel - Re: Perfromance drop on SCSI hard disk


Date:	Fri, 13 May 2011 08:11:43 +0800
From:	"Alex,Shi" <alex.shi@...el.com>
To:	Jens Axboe <jaxboe@...ionio.com>
Cc:	"James.Bottomley@...senpartnership.com" 
	<James.Bottomley@...senpartnership.com>,
	"Li, Shaohua" <shaohua.li@...el.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Perfromance drop on SCSI hard disk

On Fri, 2011-05-13 at 04:29 +0800, Jens Axboe wrote:
> On 2011-05-10 08:40, Alex,Shi wrote:
> > commit c21e6beba8835d09bb80e34961 removed the REENTER flag and changed
> > scsi_run_queue() to punt all requests on starved_list devices to
> > kblockd. Yes, as Jens mentioned, performance on slow SCSI disks was
> > hurt here.  :) (Intel SSDs aren't affected.)
> > 
> > In our testing on a 12-disk SAS JBD, fio write with the sync ioengine
> > dropped about 30~40% in throughput, and fio randread/randwrite with the
> > aio ioengine dropped about 20%/50%. fio mmap testing was hurt as well.
> > 
> > With the following debug patch, the performance can be totally recovered
> > in our testing. But without the REENTER flag, in some corner cases, such
> > as a device that is repeatedly blocked and then unblocked,
> > __blk_run_queue() may recursively call scsi_run_queue() and then cause
> > a kernel stack overflow.
> > I don't know the details of the block device driver; I'm just wondering
> > why SCSI needs the REENTER flag here. :)
> 
> This is a problem and we should do something about it for 2.6.39. I knew
> that there would be cases where the async offload would cause a
> performance degradation, but not to the extent that you are reporting.
> Must be hitting the pathological case.
> 
> I can think of two scenarios where it could potentially recurse:
> 
> - request_fn enter, end up requeuing IO. Run queue at the end. Rinse,
>   repeat.
> - Running starved list from request_fn, two (or more) devices could
>   alternately recurse.
> 
> The first case should be fairly easy to handle. The second one is
> already handled by the local list splice.
> 
> Looking at the code, is this a real scenario? The only potential
> recursion I see is:
> 
> scsi_request_fn()
>         scsi_dispatch_cmd()
>                 scsi_queue_insert()
>                         __scsi_queue_insert()
>                                 scsi_run_queue()
> 
> Why are we even re-running the queue immediately on a BUSY condition?
> Should only be needed if we have zero pending commands from this
> particular queue, and for that particular case async run is just fine
> since it's a rare condition (or performance would suck already).

Yeah, this is the correct way to fix it. Let me try the patch on our
machine.
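
To make the recursion concern concrete, here is a small userspace toy
model, not kernel code. Every name in it (toy_queue, toy_request_fn,
toy_queue_insert, toy_simulate) is hypothetical and only mirrors the
shape of the scsi_request_fn() -> ... -> scsi_run_queue() chain quoted
above. It assumes a device that reports BUSY a fixed number of times and
compares re-running the queue inline with deferring the run, where the
deferred run is only scheduled when nothing is in flight.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: every name here is hypothetical, not a kernel API. */
struct toy_queue {
	int busy_left;     /* remaining dispatches that will report BUSY */
	int in_flight;     /* commands already outstanding on the device */
	int depth;         /* current nesting of toy_request_fn() */
	int max_depth;     /* deepest nesting observed */
	int async_runs;    /* queue runs handed to the simulated worker */
	bool run_pending;  /* an async queue run is waiting */
	bool inline_rerun; /* true: re-run the queue directly on BUSY */
};

static void toy_request_fn(struct toy_queue *q);

/* Stand-in for the requeue path taken after a BUSY dispatch. */
static void toy_queue_insert(struct toy_queue *q)
{
	if (q->inline_rerun) {
		/* Direct re-run: nests one level deeper per BUSY. */
		toy_request_fn(q);
	} else if (q->in_flight == 0) {
		/* No completion will restart the queue, so punt to a worker. */
		q->run_pending = true;
		q->async_runs++;
	}
	/* Otherwise a later completion restarts the queue. */
}

/* Stand-in for the request function: dispatch once, requeue on BUSY. */
static void toy_request_fn(struct toy_queue *q)
{
	q->depth++;
	if (q->depth > q->max_depth)
		q->max_depth = q->depth;
	if (q->busy_left > 0) {
		q->busy_left--;
		toy_queue_insert(q);
	}
	q->depth--;
}

/* Run one scenario and return the final counters. */
static struct toy_queue toy_simulate(int busy_count, int in_flight,
				     bool inline_rerun)
{
	struct toy_queue q = {
		.busy_left = busy_count,
		.in_flight = in_flight,
		.inline_rerun = inline_rerun,
	};

	toy_request_fn(&q);
	/* Simulated worker and completions, both run from the top level. */
	while (q.run_pending || (q.busy_left > 0 && q.in_flight > 0)) {
		if (q.run_pending)
			q.run_pending = false;
		else
			q.in_flight--; /* a completion restarts the queue */
		toy_request_fn(&q);
	}
	assert(q.depth == 0);
	return q;
}
```

In this model, toy_simulate(1000, 0, true) nests once per BUSY result,
while toy_simulate(1000, 0, false) stays at depth 1 and pays for it with
one deferred run per BUSY. With commands in flight, the deferred variant
schedules no async run until the device is idle.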

