linux-kernel - Re: blk-mq timeout handling fixes


Message-ID: <541A03A1.9070908@fb.com>
Date:	Wed, 17 Sep 2014 15:56:49 -0600
From:	Jens Axboe <axboe@...com>
To:	"Elliott, Robert (Server Storage)" <Elliott@...com>,
	Christoph Hellwig <hch@....de>
CC:	"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: blk-mq timeout handling fixes

On 09/17/2014 03:53 PM, Elliott, Robert (Server Storage) wrote:
> 
> 
>> -----Original Message-----
>> From: Christoph Hellwig [mailto:hch@....de]
>> Sent: Saturday, 13 September, 2014 6:40 PM
>> To: Jens Axboe
>> Cc: Elliott, Robert (Server Storage); linux-scsi@...r.kernel.org;
>> linux-kernel@...r.kernel.org
>> Subject: blk-mq timeout handling fixes
>>
>> This series fixes various issues with timeout handling that Robert
>> ran into when testing scsi-mq heavily.  He tested an earlier version,
>> and couldn't reproduce the issues anymore, although the series changed
>> quite significantly since and should probably be retested.
>>
>> In summary we now only start the blk-mq timer inside the drivers'
>> ->queue_rq method after the request has been fully set up, and we
>> also tell the drivers if we're timing out a reserved (internal)
>> request or a real one.  Many drivers will need to handle
>> those internal ones differently, e.g. for scsi-mq we don't even
>> have a scsi command structure allocated for the reserved commands.
> 
> I have rerun a variety of tests on:
> * Jens' for-next tree that went into 3.17-rc5
> * plus this series
> * plus two patches for infinite recursion on flushes from 
>   Ming and then Christoph

This is pretty much what is queued up for 3.17 as well. It's bigger than
I'd like at this point, but these are real fixes.

> and have not been able to trigger the scsi_times_out req->special
> NULL pointer dereference that prompted this series.

Great!!
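
For readers following along, the two changes described in the series
summary can be modelled in a few lines of userspace C. This is only a
rough sketch: every name in it (toy_request, toy_cmd, toy_queue_rq,
toy_timeout, the timer_action values) is made up for illustration and
is not the kernel's actual structure or function. It assumes that a
reserved request carries no command pointer, as described above for
scsi-mq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not kernel definitions. */
enum timer_action { TIMER_RESET, TIMER_HANDLED };

struct toy_cmd {
    int timeouts_seen;
};

struct toy_request {
    struct toy_cmd *special;  /* stays NULL for reserved (internal) requests */
    bool timer_armed;
};

/* Model of a ->queue_rq path: finish setup first, arm the timer last. */
static void toy_queue_rq(struct toy_request *rq, struct toy_cmd *cmd)
{
    rq->special = cmd;        /* request is fully set up ... */
    rq->timer_armed = true;   /* ... before any timeout can fire */
}

/* Model of a timeout handler that is told whether the request is reserved. */
static enum timer_action toy_timeout(struct toy_request *rq, bool reserved)
{
    if (reserved)
        return TIMER_RESET;   /* no command attached, so never touch rq->special */

    assert(rq->special != NULL);  /* holds because setup precedes arming */
    rq->special->timeouts_seen++;
    return TIMER_HANDLED;
}
```

Calling toy_timeout() with reserved set to true on a request that never
went through toy_queue_rq() returns without dereferencing the NULL
special pointer, which is the kind of access behind the scsi_times_out
crash mentioned above.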

> Testing includes:
> * concurrent heavy workload generators:
>   * fio high iodepth direct 512 byte random reads (> 1M IOPS)
>   * programs generating large bursts of paged writes
>     * mkfs.ext4 (followed by e2fsck)
>     * mkfs.xfs (followed by xfs_check)
>     * ddpt
>   * watch -n 0 sync to generate flushes
> * scsi_logging_level MLCOMPLETE set to 0 or 1
>   * scsi_lib.c patched to put all the ACTION_FAIL messages
>     under level 1 so they can be squelched (massive error 
>     prints cause more timeouts themselves)
> * 4 hpsa and 16 mpt3sas devices (all made from SAS SSDs)
>   * lockless hpsa driver
> * injecting errors
>   * device removal
>   * device generating infinite errors
>   * device generating a brief number of errors
> 
> The filesystems don't always recover properly, but nothing in 
> the block or scsi midlayers crashed.
> 
> So, you may add this to the series:
> Tested-by: Robert Elliott <elliott@...com>

Thanks a lot for your (continued) testing, Robert. It's a great help.


-- 
Jens Axboe
