linux-kernel - Re: [PATCH 1/1] blk-mq: get ctx in order to handle BLK_MQ_S_INACTIVE in blk_mq_get

Message-ID: <39dd930b-839d-5d74-afb6-5a8bb1d2f0da@oracle.com>
Date:   Wed, 3 Jun 2020 09:23:45 -0700
From:   Dongli Zhang <dongli.zhang@...cle.com>
To:     John Garry <john.garry@...wei.com>, linux-block@...r.kernel.org
Cc:     axboe@...nel.dk, hare@...e.de, dwagner@...e.de,
        ming.lei@...hat.com, linux-kernel@...r.kernel.org,
        Christoph Hellwig <hch@....de>
Subject: Re: [PATCH 1/1] blk-mq: get ctx in order to handle BLK_MQ_S_INACTIVE
 in blk_mq_get_tag()

Hi John,

On 6/3/20 4:59 AM, John Garry wrote:
> On 02/06/2020 07:17, Dongli Zhang wrote:
>> When scheduler is set, we hit below page fault when we offline cpu.
>>
>> [ 1061.007725] BUG: kernel NULL pointer dereference, address: 0000000000000040
>> [ 1061.008710] #PF: supervisor read access in kernel mode
>> [ 1061.009492] #PF: error_code(0x0000) - not-present page
>> [ 1061.010241] PGD 0 P4D 0
>> [ 1061.010614] Oops: 0000 [#1] SMP PTI
>> [ 1061.011130] CPU: 0 PID: 122 Comm: kworker/0:1H Not tainted 5.7.0-rc7+ #2'
>> ... ...
>> [ 1061.013760] Workqueue: kblockd blk_mq_run_work_fn
>> [ 1061.014446] RIP: 0010:blk_mq_put_tag+0xf/0x30
>> ... ...
>> [ 1061.017726] RSP: 0018:ffffa5c18037fc70 EFLAGS: 00010287
>> [ 1061.018475] RAX: 0000000000000000 RBX: ffffa5c18037fcf0 RCX: 0000000000000004
>> [ 1061.019507] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff911535dc1180
>> ... ...
>> [ 1061.028454] Call Trace:
>> [ 1061.029307]  blk_mq_get_tag+0x26e/0x280
>> [ 1061.029866]  ? wait_woken+0x80/0x80
>> [ 1061.030378]  blk_mq_get_driver_tag+0x99/0x110
>> [ 1061.031009]  blk_mq_dispatch_rq_list+0x107/0x5e0
>> [ 1061.031672]  ? elv_rb_del+0x1a/0x30
>> [ 1061.032178]  blk_mq_do_dispatch_sched+0xe2/0x130
>> [ 1061.032844]  __blk_mq_sched_dispatch_requests+0xcc/0x150
>> [ 1061.033638]  blk_mq_sched_dispatch_requests+0x2b/0x50
>> [ 1061.034239]  __blk_mq_run_hw_queue+0x75/0x110
>> [ 1061.034867]  process_one_work+0x15c/0x370
>> [ 1061.035450]  worker_thread+0x44/0x3d0
>> [ 1061.035980]  kthread+0xf3/0x130
>> [ 1061.036440]  ? max_active_store+0x80/0x80
>> [ 1061.037018]  ? kthread_bind+0x10/0x10
>> [ 1061.037554]  ret_from_fork+0x35/0x40
>> [ 1061.038073] Modules linked in:
>> [ 1061.038543] CR2: 0000000000000040
>> [ 1061.038962] ---[ end trace d20e1df7d028e69f ]---
>>
>> This is because blk_mq_get_driver_tag() would be used to allocate tag once
>> scheduler (e.g., mq-deadline) is set. 
> 
> I tried mq-deadline and I did not see this. Anything else special or specific
> about your test?
> 

I think you just hit the issue as mentioned in another thread.

To reproduce the issue, you need to hit the following condition:

1. blk_mq_hctx_notify_offline() sets BLK_MQ_S_INACTIVE with the barrier ...

... while ...

2. blk_mq_get_tag() gets the tag but BLK_MQ_S_INACTIVE is already set.
Therefore, it puts the tag back to release it, which is the
blk_mq_put_tag() call at the top of the trace above.
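
The two steps above can be modeled in user space as a rough sketch. This is
not kernel code: struct model_hctx, struct model_ctx, model_get_tag() and
model_put_tag() are hypothetical names made up for illustration. It also
assumes, as the patch subject suggests, that no ctx is available on the
scheduler dispatch path, so the put side is handed a NULL ctx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; these are NOT the kernel structures. */
#define MODEL_NO_TAG (-1)

struct model_ctx  { int cpu; };
struct model_hctx { bool inactive; int free_tags; };

/*
 * Models putting a tag back. The real put path would read from ctx
 * (assumption based on the oops); here a NULL ctx is reported instead
 * of being dereferenced.
 */
static bool model_put_tag(struct model_hctx *hctx, const struct model_ctx *ctx)
{
    if (ctx == NULL)
        return false;           /* stands in for the NULL dereference */
    hctx->free_tags++;
    return true;
}

/*
 * Models step 2: take a tag, then recheck the inactive flag that step 1
 * may have set in the meantime. If it is set, give the tag back.
 * *put_ok is false only when the put was attempted with a NULL ctx.
 */
static int model_get_tag(struct model_hctx *hctx, const struct model_ctx *ctx,
                         bool *put_ok)
{
    assert(hctx != NULL);
    *put_ok = true;

    if (hctx->free_tags == 0)
        return MODEL_NO_TAG;
    hctx->free_tags--;

    if (hctx->inactive) {
        *put_ok = model_put_tag(hctx, ctx);
        return MODEL_NO_TAG;
    }
    return 0;                   /* tag allocated */
}
```

With inactive set and a NULL ctx, model_get_tag() returns MODEL_NO_TAG and
reports the failed put; with a valid ctx the tag is returned cleanly. That
is the difference the patch is after.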

Dongli Zhang