lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b3616662-7181-f968-97e6-248c361edd85@red-soft.ru>
Date:   Thu, 12 Jan 2023 14:40:56 +0300
From:   Anton Fadeev <anton.fadeev@...-soft.ru>
To:     Yu Kuai <yukuai1@...weicloud.com>,
        Artem Chernyshev <artem.chernyshev@...-soft.ru>,
        Jan Kara <jack@...e.cz>
Cc:     Paolo Valente <paolo.valente@...aro.org>,
        Jens Axboe <axboe@...nel.dk>, Tejun Heo <tj@...nel.org>,
        cgroups@...r.kernel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, lvc-project@...uxtesting.org,
        "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH] block: bfq fix null pointer dereference of bfqg in
 bfq_bio_bfqg()



12.01.2023 14:30, Yu Kuai пишет:
> CC Jan.
> 
> 在 2023/01/12 19:24, Artem Chernyshev 写道:
>> Hi,
>> On Thu, Jan 12, 2023 at 07:09:10PM +0800, Yu Kuai wrote:
>>> Hi,
>>>
>>> 在 2023/01/12 17:43, Artem Chernyshev 写道:
>>>> It is possible for bfqg to be NULL after being initialized as result of
>>>> blkg_to_bfqg() function.
>>>>
>>>> That was achieved on kernel 5.15.78, but should exist in mainline as
>>>> well
>>>
>>> The problem is already fixed in mainline by following patch:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f02be9002c480cd3ec0fcf184ad27cf531bd6ece
>>>
>>> Thanks,
>>> Kuai
>>>>
>>>> host1 login: [ 460.855794] watchdog: watchdog0: watchdog did not stop!
>>>> [  898.944512] BUG: kernel NULL pointer dereference, address: 
>>>> 0000000000000094
>>>> [  899.285776] #PF: supervisor read access in kernel mode
>>>> [  899.536511] #PF: error_code(0x0000) - not-present page
>>>> [  899.647305]  connection4:0: detected conn error (1020)
>>>> [  899.786794] PGD 0 P4D 0
>>>> [  899.786799] Oops: 0000 [#1] SMP PTI
>>>> [  899.786802] CPU: 15 PID: 6073 Comm: ID iothread1 Not tainted 
>>>> 5.15.78-1.el7virt.x86_64 #1
>>>> [  899.786804] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 
>>>> Gen9, BIOS P89 10/21/2019
>>>> [  899.786806] RIP: 0010:bfq_bio_bfqg+0x26/0x80
>>>> [  901.325944] Code: 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 89 f7 
>>>> 53 48 8b 56 48 48 85 d2
>>>> 74 32 48 63 05 83 7f 35 01 48 83 c0 16 48 8b 5c c2 08 <80> bb 94 00 
>>>> 00 00 00 00
>>>> [  902.237825] RSP: 0018:ffffae2649437688 EFLAGS: 00010002
>>>> [  902.493396] RAX: 0000000000000019 RBX: 0000000000000000 RCX: 
>>>> dead000000000122
>>>> [  902.841529] RDX: ffff8b6012cb3a00 RSI: ffff8b71002bbed0 RDI: 
>>>> ffff8b71002bbed0
>>>> [  903.189374] RBP: ffff8b601c46e800 R08: ffffae26494377c8 R09: 
>>>> 0000000000000000
>>>> [  903.532985] R10: 0000000000000001 R11: 0000000000000008 R12: 
>>>> ffff8b6f844c5b30
>>>> [  903.880809] R13: ffff8b601c46e800 R14: ffffae2649437760 R15: 
>>>> ffff8b601c46e800
>>>> [  904.220054] FS:  00007fec2fc4a700(0000) GS:ffff8b7f7f640000(0000) 
>>>> kn1GS:00000000000000000
>>>> [  904.614349] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [  904.894717] CR2: 0000000000000094 CR3: 0000000111fd8002 CR4: 
>>>> 00000000003726e0
>>>> [  905.243702] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>>>> 0000000000000000
>>>> [  905.592493] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
>>>> 0000000000000400
>>>> [  905.936859] Call Trace:
>>>> [  906.055955] <TASK>
>>>> [  906.158109] bfq_bic_update_cgroup+0x2c/0x1f0
>>>> [  906.371057] bfq_insert_requests+0x2c2/0x1fb0
>>>> [  906.579207] blk_mq_sched_insert_request+0xc2/0x140
>>>> [  906.817640] __blk_mq_try_issue_directly+0xe0/0x1f0
>>>> [  907.055737] blk_mq_request_issue_directly+0x4e/0xa0
>>>> [  907.298547] dm_mq_queue_rq+0x217/0x3e0
>>>> [  907.485935] blk_mq_dispatch_rq_list+0x14b/0x860
>>>> [  907.711973] ? sbitmap_get+0x87/0x1a0
>>>> [  907.890370] blk_mq_do_dispatch_sched+0x350/0x3b0
>>>> [  908.074869] NMI watchdog: Watchdog detected hard LOCKUP on cpu 40
>>>>
>>>> Fixes: 075a53b78b81 ("bfq: Make sure bfqg for which we are queueing 
>>>> requests is online")
>>>> Co-developed-by: Anton Fadeev <anton.fadeev@...-soft.ru>
>>>> Signed-off-by: Anton Fadeev <anton.fadeev@...-soft.ru>
>>>> Signed-off-by: Artem Chernyshev <artem.chernyshev@...-soft.ru>
>>>> ---
>>>>    block/bfq-cgroup.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
>>>> index 1b2829e99dad..d4e9428cdbe5 100644
>>>> --- a/block/bfq-cgroup.c
>>>> +++ b/block/bfq-cgroup.c
>>>> @@ -616,7 +616,7 @@ struct bfq_group *bfq_bio_bfqg(struct bfq_data 
>>>> *bfqd, struct bio *bio)
>>>>                continue;
>>>>            }
>>>>            bfqg = blkg_to_bfqg(blkg);
>>>> -        if (bfqg->online) {
>>>> +        if (bfqg && bfqg->online) {
>>>>                bio_associate_blkg_from_css(bio, &blkg->blkcg->css);
>>>>                return bfqg;
>>>>            }
>>>>
>>
>> Sorry, forgot to mention, what behaviour was the same after we applied 
>> this patch. Issue
>> was resolved only when we added NULL checking for bfqg.
> 
> So, you mean that blkg is still online, while blkg_to_bfqg() return
> NULL. Can you explan how this is possible? I can't figure out how this
> is possible...
> 
> Thanks,
> Kuai
>>
>> Thanks,
>> Artem
>>
>> .
>>
> 

I'll try to describe. We have a virtualization cluster based on oVirt 
4.4 with kernel 5.15.78 with patch you mentioned, there is four nodes in 
it. Also we have an ISCSI target, that is connected to all nodes. LUNs 
are connected in mixed mode, such as Storage Domains and direct LUNs to 
VMs. When all VMs are in UP state, we simply disconnect the ISCSI target 
from network or shutdown target service,  then we got the described bug.

Thanks,
Anton.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ