linux-kernel - Re: [PATCH] block, bfq: fix null pointer dereference in bfq_bio

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9e0d8652-adda-3d39-88cd-d735194d4083@nvidia.com>
Date:   Tue, 8 Nov 2022 22:50:07 +0000
From:   Chaitanya Kulkarni <chaitanyak@...dia.com>
To:     Yu Kuai <yukuai1@...weicloud.com>, "jack@...e.cz" <jack@...e.cz>,
        "tj@...nel.org" <tj@...nel.org>,
        "josef@...icpanda.com" <josef@...icpanda.com>,
        "axboe@...nel.dk" <axboe@...nel.dk>,
        "paolo.valente@...aro.org" <paolo.valente@...aro.org>
CC:     "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "yukuai3@...wei.com" <yukuai3@...wei.com>,
        "yi.zhang@...wei.com" <yi.zhang@...wei.com>
Subject: Re: [PATCH] block, bfq: fix null pointer dereference in
 bfq_bio_bfqg()

On 11/8/22 02:34, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@...wei.com>
> 
> Out test found a following problem in kernel 5.10, and the same problem
> should exist in mainline:
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000094
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP
> CPU: 7 PID: 155 Comm: kworker/7:1 Not tainted 5.10.0-01932-g19e0ace2ca1d-dirty 4
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-b4
> Workqueue: kthrotld blk_throtl_dispatch_work_fn
> RIP: 0010:bfq_bio_bfqg+0x52/0xc0
> Code: 94 00 00 00 00 75 2e 48 8b 40 30 48 83 05 35 06 c8 0b 01 48 85 c0 74 3d 4b
> RSP: 0018:ffffc90001a1fba0 EFLAGS: 00010002
> RAX: ffff888100d60400 RBX: ffff8881132e7000 RCX: 0000000000000000
> RDX: 0000000000000017 RSI: ffff888103580a18 RDI: ffff888103580a18
> RBP: ffff8881132e7000 R08: 0000000000000000 R09: ffffc90001a1fe10
> R10: 0000000000000a20 R11: 0000000000034320 R12: 0000000000000000
> R13: ffff888103580a18 R14: ffff888114447000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff88881fdc0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000094 CR3: 0000000100cdb000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>   bfq_bic_update_cgroup+0x3c/0x350
>   ? ioc_create_icq+0x42/0x270
>   bfq_init_rq+0xfd/0x1060
>   bfq_insert_requests+0x20f/0x1cc0
>   ? ioc_create_icq+0x122/0x270
>   blk_mq_sched_insert_requests+0x86/0x1d0
>   blk_mq_flush_plug_list+0x193/0x2a0
>   blk_flush_plug_list+0x127/0x170
>   blk_finish_plug+0x31/0x50
>   blk_throtl_dispatch_work_fn+0x151/0x190
>   process_one_work+0x27c/0x5f0
>   worker_thread+0x28b/0x6b0
>   ? rescuer_thread+0x590/0x590
>   kthread+0x153/0x1b0
>   ? kthread_flush_work+0x170/0x170
>   ret_from_fork+0x1f/0x30
> Modules linked in:
> CR2: 0000000000000094
> ---[ end trace e2e59ac014314547 ]---
> RIP: 0010:bfq_bio_bfqg+0x52/0xc0
> Code: 94 00 00 00 00 75 2e 48 8b 40 30 48 83 05 35 06 c8 0b 01 48 85 c0 74 3d 4b
> RSP: 0018:ffffc90001a1fba0 EFLAGS: 00010002
> RAX: ffff888100d60400 RBX: ffff8881132e7000 RCX: 0000000000000000
> RDX: 0000000000000017 RSI: ffff888103580a18 RDI: ffff888103580a18
> RBP: ffff8881132e7000 R08: 0000000000000000 R09: ffffc90001a1fe10
> R10: 0000000000000a20 R11: 0000000000034320 R12: 0000000000000000
> R13: ffff888103580a18 R14: ffff888114447000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff88881fdc0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000094 CR3: 0000000100cdb000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> 
> Root cause is quite complex:
> 
> 1) use bfq elevator for the test device.
> 2) create a cgroup CG
> 3) config blk throtl in CG
> 
>     blkg_conf_prep
>      blkg_create
> 
> 4) create a thread T1 and issue async io in CG:
> 
>     bio_init
>      bio_associate_blkg
>     ...
>     submit_bio
>      submit_bio_noacct
>       blk_throtl_bio -> io is throttled
>       // io submit is done
> 
> 5) switch elevator:
> 
>     bfq_exit_queue
>      blkcg_deactivate_policy
>       list_for_each_entry(blkg, &q->blkg_list, q_node)
>        blkg->pd[] = NULL
>        // bfq policy is removed
> 
> 5) thread t1 exist, then remove the cgroup CG:
> 
>     blkcg_unpin_online
>      blkcg_destroy_blkgs
>       blkg_destroy
>        list_del_init(&blkg->q_node)
>        // blkg is removed from queue list
> 
> 6) switch elevator back to bfq
> 
>   bfq_init_queue
>    bfq_create_group_hierarchy
>     blkcg_activate_policy
>      list_for_each_entry_reverse(blkg, &q->blkg_list)
>       // blkg is removed from list, hence bfq policy is still NULL
> 
> 7) throttled io is dispatched to bfq:
> 
>   bfq_insert_requests
>    bfq_init_rq
>     bfq_bic_update_cgroup
>      bfq_bio_bfqg
>       bfqg = blkg_to_bfqg(blkg)
>       // bfqg is NULL because bfq policy is NULL
> 
> The problem is only possible in bfq because only bfq can be deactivated and
> activated while queue is online, while others can only be deactivated while
> the device is removed.
> 
> Fix the problem in bfq by checking if blkg is online before calling
> blkg_to_bfqg().
> 


Please submit the block tests for this as this is clearly non-trivial
issue that needs to be tested every release ..

-ck