Message-ID: <5eb1eeaa-b148-7b90-eee4-e365ead49a5b@oracle.com>
Date: Wed, 23 May 2018 11:55:29 +0800
From: "jianchao.wang" <jianchao.w.wang@...cle.com>
To: qla2xxx-upstream@...gic.com, himanshu.madhani@...ium.com,
jthumshirn@...e.de
Cc: linux-scsi@...r.kernel.org,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"jejb@...ux.vnet.ibm.com" <jejb@...ux.vnet.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Junxiao Bi <junxiao.bi@...cle.com>
Subject: BUG: scsi/qla2xxx: BUG_ON(blk_queued_rq(req) is triggered in
blk_finish_request
Hi all
Our customer hit a panic triggered by a BUG_ON in blk_finish_request.
From the dmesg log, the BUG_ON was triggered after command aborts had occurred many times.
There is a race condition in the following scenario.
cpu A                                        cpu B
kworker                                      interrupt

scmd_eh_abort_handler()
  -> scsi_try_to_abort_cmd()
    -> qla2xxx_eh_abort()
      -> qla2x00_eh_wait_on_command()        qla2x00_status_entry()
                                               -> qla2x00_sp_compl()
                                                 -> qla2x00_sp_free_dma()
  -> scsi_queue_insert()
    -> __scsi_queue_insert()
      -> blk_requeue_request()
        -> blk_clear_rq_complete()
                                                 -> scsi_done
                                                   -> blk_complete_request
                                                     -> blk_mark_rq_complete
        -> elv_requeue_request()                     -> __blk_complete_request()
          -> __elv_add_request()
          // req will be queued here
                                             BLK_SOFTIRQ
                                             scsi_softirq_done()
                                               -> scsi_finish_command()
                                                 -> scsi_io_completion()
                                                   -> scsi_end_request()
                                                     -> blk_finish_request() // BUG_ON(blk_queued_rq(req)) !!!
In most cases the issue is not triggered, because the timeout path has already marked the request as complete,
so the scsi_done call from qla2x00_sp_compl does nothing.
But in the scenario above, if blk_requeue_request has already cleared the complete state, the request ends up
both requeued and completed, and BUG_ON(blk_queued_rq(req)) in blk_finish_request fires.
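
To make the interleaving concrete, here is a small user-space model of the scenario. It is only a sketch and not kernel code: struct model_rq and every model_* / simulate_* name are hypothetical stand-ins, and the two CPUs are replayed sequentially in one thread in the order shown in the trace, so the outcome is deterministic. It assumes that marking a request complete behaves like a test-and-set on a per-request flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the relevant request state. */
struct model_rq {
    atomic_bool complete;   /* "complete" mark, set by test-and-set */
    bool queued;            /* stands in for blk_queued_rq(req) */
    bool softirq_pending;   /* completion was handed to BLK_SOFTIRQ */
};

static void model_init(struct model_rq *rq)
{
    atomic_init(&rq->complete, false);
    rq->queued = false;
    rq->softirq_pending = false;
}

/* Models blk_mark_rq_complete(): returns the previous value of the mark. */
static bool model_mark_complete(struct model_rq *rq)
{
    return atomic_exchange(&rq->complete, true);
}

/* Timeout path: marks the request complete before the abort handler runs. */
static void model_timeout(struct model_rq *rq)
{
    model_mark_complete(rq);
}

/* cpu A: blk_requeue_request() -> blk_clear_rq_complete() */
static void model_requeue_clear(struct model_rq *rq)
{
    atomic_store(&rq->complete, false);
}

/* cpu A: elv_requeue_request() -> __elv_add_request() */
static void model_requeue_add(struct model_rq *rq)
{
    assert(!rq->queued);
    rq->queued = true;      /* req is queued here */
}

/* cpu B: scsi_done -> blk_complete_request() */
static void model_scsi_done(struct model_rq *rq)
{
    /* Only the caller that sets the mark goes on to complete the request. */
    if (!model_mark_complete(rq))
        rq->softirq_pending = true;
}

/* BLK_SOFTIRQ: true where BUG_ON(blk_queued_rq(req)) would fire. */
static bool model_softirq_hits_bug(const struct model_rq *rq)
{
    return rq->softirq_pending && rq->queued;
}

/* Usual order: scsi_done arrives while the mark is still set. */
bool simulate_usual_case(void)
{
    struct model_rq rq;
    model_init(&rq);
    model_timeout(&rq);
    model_scsi_done(&rq);       /* does nothing, mark already set */
    model_requeue_clear(&rq);
    model_requeue_add(&rq);
    return model_softirq_hits_bug(&rq);
}

/* Racy order: scsi_done lands between the clear and the re-add. */
bool simulate_race(void)
{
    struct model_rq rq;
    model_init(&rq);
    model_timeout(&rq);
    model_requeue_clear(&rq);   /* cpu A clears the mark */
    model_scsi_done(&rq);       /* cpu B now succeeds in marking it */
    model_requeue_add(&rq);     /* cpu A queues the request */
    return model_softirq_hits_bug(&rq);
}
```

In this model simulate_usual_case() returns false and simulate_race() returns true, which corresponds to the request being both requeued and completed.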
Is there a way to fix this on the qla2xxx driver side?
Thanks
Jianchao