Date:   Sat, 13 Apr 2019 08:36:54 +0800
From:   Bob Liu <bob.liu@...cle.com>
To:     Jinpu Wang <jinpuwang@...il.com>, r.peniaev@...il.com
Cc:     linux-block@...r.kernel.org, shirley.ma@...cle.com,
        "Martin K. Petersen" <martin.petersen@...cle.com>,
        Akinobu Mita <akinobu.mita@...il.com>,
        Tejun Heo <tj@...nel.org>, Jens Axboe <axboe@...nel.dk>,
        Christoph Hellwig <hch@....de>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [RESEND PATCH] blk-mq: fix hang caused by freeze/unfreeze
 sequence

On 4/9/19 5:29 PM, Jinpu Wang wrote:
> On Tue, Apr 9, 2019 at 11:11 AM, Bob Liu <bob.liu@...cle.com> wrote:
>>
>> This patch was proposed by Roman Pen[3] years ago.
>> Recently we hit a bug that is likely caused by the same race, so I rebased
>> his fix onto v5.1 and am resending it.
>> The description below is copied almost verbatim from that patch[3].
>>
>> ------
>> A long time ago a similar fix was proposed by Akinobu Mita[1],
>> but at that time everyone seemed to decide to fix this subtle race in
>> percpu-refcount instead, and Tejun Heo[2] made an attempt (as far as I can
>> see, that patchset was not applied).
>>
>> The following is a description of a hang in blk_mq_freeze_queue_wait():
>> the same fix, but the bug is seen from another angle.
>>
>> The hang happens on an attempt to freeze a queue while another task is
>> unfreezing it.
>>
>> The root cause is that the order of percpu_ref_reinit() and
>> percpu_ref_kill() is not enforced, so the two calls can be swapped:
>>
>>  CPU#0               CPU#1
>>  ----------------    -----------------
>>  percpu_ref_kill()
>>
>>                      percpu_ref_kill() << atomic reference does
>>  percpu_ref_reinit()                   << not guarantee the order
>>
>>                      blk_mq_freeze_queue_wait() << HANG HERE
>>
>>                      percpu_ref_reinit()
>>
>> First, this wrong sequence raises two kernel warnings:
>>
>>   1st. WARNING at lib/percpu-refcount.c:309
>>        percpu_ref_kill_and_confirm called more than once
>>
>>   2nd. WARNING at lib/percpu-refcount.c:331
>>
>> But the most unpleasant effect is a hang in blk_mq_freeze_queue_wait(),
>> which waits for q_usage_counter to reach zero. That never happens,
>> because the percpu-ref was reinited (instead of being killed) and stays in
>> PERCPU state forever.
>>
>> The simplified sequence above can be reproduced with shared tags, when
>> queue A is about to die while another queue B is being initialized and
>> is trying to freeze queue A, which shares the same tag set:
>>
>>  CPU#0                           CPU#1
>>  ------------------------------- ------------------------------------
>>  q1 = blk_mq_init_queue(shared_tags)
>>
>>                                 q2 = blk_mq_init_queue(shared_tags):
>>                                   blk_mq_add_queue_tag_set(shared_tags):
>>                                     blk_mq_update_tag_set_depth(shared_tags):
>>                                       blk_mq_freeze_queue(q1)
>>  blk_cleanup_queue(q1)                 ...
>>    blk_mq_freeze_queue(q1)   <<<->>>   blk_mq_unfreeze_queue(q1)
>>
>> [1] Message id: 1443287365-4244-7-git-send-email-akinobu.mita@...il.com
>> [2] Message id: 1443563240-29306-6-git-send-email-tj@...nel.org
>> [3] https://patchwork.kernel.org/patch/9268199/
>>
>> Signed-off-by: Roman Pen <roman.penyaev@...fitbricks.com>
>> Signed-off-by: Bob Liu <bob.liu@...cle.com>
>> Cc: Akinobu Mita <akinobu.mita@...il.com>
>> Cc: Tejun Heo <tj@...nel.org>
>> Cc: Jens Axboe <axboe@...nel.dk>
>> Cc: Christoph Hellwig <hch@....de>
>> Cc: linux-block@...r.kernel.org
>> Cc: linux-kernel@...r.kernel.org
>>
> 
> Replaced Roman's email address.
> 
> We at 1 & 1 IONOS (formerly ProfitBricks) have been carrying this patch
> for some years,
> and it has been running in production for some years too,

Nice to hear that!

> would be good to see it in upstream :)

Yes.
Could anyone review it? Thanks!

> 
> Thanks,
> 
> Jack Wang
> Linux Kernel Developer @ 1 & 1 IONOS
> 
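
For reviewers who want to step through the interleaving from the quoted
description, here is a minimal userspace model in Python. It is only a sketch
and not kernel code. It assumes that freeze and unfreeze each update a depth
counter first and only then call percpu_ref_kill() or percpu_ref_reinit() as a
separate step, which is what leaves the window. ToyQueue, replay() and the
other helper names are hypothetical, and the model records only the "called
more than once" warning, not the second one.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQueue:
    """Hypothetical stand-in for a request queue; not a kernel structure."""
    freeze_depth: int = 0       # models the atomic freeze depth counter
    ref_dead: bool = False      # True after a kill, False after a reinit
    warnings: list = field(default_factory=list)


def ref_kill(q):
    """Model of percpu_ref_kill(): warn if the ref is already dead."""
    if q.ref_dead:
        q.warnings.append("kill called more than once")
    q.ref_dead = True


def ref_reinit(q):
    """Model of percpu_ref_reinit(): put the ref back into live (percpu) mode."""
    q.ref_dead = False


def freeze_begin(q):
    """Counter half of freeze: True if this caller has to kill the ref."""
    q.freeze_depth += 1
    return q.freeze_depth == 1


def unfreeze_begin(q):
    """Counter half of unfreeze: True if this caller has to reinit the ref."""
    q.freeze_depth -= 1
    return q.freeze_depth == 0


def freeze_wait_hangs(q):
    """A waiter can only see the usage counter reach zero while the ref is dead."""
    return not q.ref_dead


def replay(racy):
    """Replay one unfreeze on CPU#0 against one freeze on CPU#1."""
    q = ToyQueue()
    if freeze_begin(q):                 # CPU#0: earlier freeze
        ref_kill(q)
    cpu0_reinit = unfreeze_begin(q)     # CPU#0: unfreeze, counter half only
    if not racy and cpu0_reinit:
        ref_reinit(q)                   # ordered case: reinit lands first
    if freeze_begin(q):                 # CPU#1: freeze sees depth 0 -> 1
        ref_kill(q)
    if racy and cpu0_reinit:
        ref_reinit(q)                   # racy case: reinit lands after the kill
    return q


if __name__ == "__main__":
    for racy in (False, True):
        q = replay(racy)
        print(racy, q.freeze_depth, q.ref_dead, q.warnings, freeze_wait_hangs(q))
```

In the racy replay the depth counter says the queue is frozen while the ref is
back in live mode, so a waiter in the model would never be released. In the
ordered replay the ref ends up dead and no warning is recorded.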
