linux-kernel - Re: [PATCH -next 0/4] blk-cgroup: synchronize del

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c2fecf04-6e9e-25a1-f14c-47862837cc54@huaweicloud.com>
Date:   Wed, 21 Dec 2022 09:10:33 +0800
From:   Yu Kuai <yukuai1@...weicloud.com>
To:     Tejun Heo <tj@...nel.org>, Yu Kuai <yukuai1@...weicloud.com>
Cc:     hch@...radead.org, josef@...icpanda.com, axboe@...nel.dk,
        cgroups@...r.kernel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, yi.zhang@...wei.com,
        "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH -next 0/4] blk-cgroup: synchronize del_gendisk() with
 configuring cgroup policy

Hi,

在 2022/12/21 0:01, Tejun Heo 写道:
> Hello,
> 
> On Tue, Dec 20, 2022 at 05:19:12PM +0800, Yu Kuai wrote:
>> Yes, that sounds good. BTW, queue_lock is also used to protect
>> pd_alloc_fn/pd_init_fn，and we found that blkcg_activate_policy() is
>> problematic:
>>
>> blkcg_activate_policy
>>   spin_lock_irq(&q->queue_lock);
>>   list_for_each_entry_reverse(blkg, &q->blkg_list
>>    pd_alloc_fn(GFP_NOWAIT | __GFP_NOWARN,...) -> failed
>>
>>    spin_unlock_irq(&q->queue_lock);
>>    // release queue_lock here is problematic, this will cause
>> pd_offline_fn called without pd_init_fn.
>>    pd_alloc_fn(__GFP_NOWARN,...)
> 
> So, if a blkg is destroyed while a policy is being activated, right?
Yes, remove cgroup can race with this, for bfq null pointer deference
will be triggered in bfq_pd_offline().

> 
>> If we are using a mutex to protect rq_qos ops, it seems the right thing
>> to do do also using the mutex to protect blkcg_policy ops, and this
>> problem can be fixed because mutex can be held to alloc memroy with
>> GFP_KERNEL. What do you think?
> 
> One worry is that switching to mutex can be more headache due to destroy
> path synchronization. Another approach would be using a per-blkg flag to
> track whether a blkg has been initialized.
I think perhaps you mean per blkg_policy_data flag? per blkg flag should
not work in this case.

Thanks,
Kuai
> 
> Thanks.
>