linux-kernel - Re: [PATCH -next 0/4] blk-cgroup: synchronize del

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <e01daffe-a3e3-8bf2-40ee-192a9e70d911@huaweicloud.com>
Date:   Tue, 20 Dec 2022 17:19:12 +0800
From:   Yu Kuai <yukuai1@...weicloud.com>
To:     Tejun Heo <tj@...nel.org>, Yu Kuai <yukuai1@...weicloud.com>
Cc:     hch@...radead.org, josef@...icpanda.com, axboe@...nel.dk,
        cgroups@...r.kernel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, yi.zhang@...wei.com,
        "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH -next 0/4] blk-cgroup: synchronize del_gendisk() with
 configuring cgroup policy

Hi,

在 2022/12/20 4:55, Tejun Heo 写道:
> Hello,
> 
> On Sat, Dec 17, 2022 at 11:09:04AM +0800, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> iocost is initialized when it's configured the first time, and iocost
>> initializing can race with del_gendisk(), which will cause null pointer
>> dereference:
>>
>> t1				t2
>> ioc_qos_write
>>   blk_iocost_init
>>    rq_qos_add
>>    				del_gendisk
>>    				 rq_qos_exit
>>    				 //iocost is removed from q->roqs
>>    blkcg_activate_policy
>>     pd_init_fn
>>      ioc_pd_init
>>       ioc = q_to_ioc(blkg->q)
>>       //can't find iocost and return null
>>
>> And iolatency is about to switch to the same lazy initialization.
>>
>> This patchset fix this problem by synchronize rq_qos_add() and
>> blkcg_activate_policy() with rq_qos_exit().
> 
> So, the patchset seems a bit overly complicated to me. What do you think
> about the following?
> 
> * These init/exit paths are super cold path, just protecting them with a
>    global mutex is probably enough. If we encounter a scalability problem,
>    it's easy to fix down the line.
> 
> * If we're synchronizing this with a mutex anyway, no need to grab the
>    queue_lock, right? rq_qos_add/del/exit() can all just hold the mutex.
> 
> * And we can keep the state tracking within rq_qos. When rq_qos_exit() is
>    called, mark it so that future adds will fail - be that a special ->next
>    value a queue flag or whatever.

Yes, that sounds good. BTW, queue_lock is also used to protect
pd_alloc_fn/pd_init_fn，and we found that blkcg_activate_policy() is
problematic:

blkcg_activate_policy
  spin_lock_irq(&q->queue_lock);
  list_for_each_entry_reverse(blkg, &q->blkg_list
   pd_alloc_fn(GFP_NOWAIT | __GFP_NOWARN,...) -> failed

   spin_unlock_irq(&q->queue_lock);
   // release queue_lock here is problematic, this will cause
pd_offline_fn called without pd_init_fn.
   pd_alloc_fn(__GFP_NOWARN,...)

If we are using a mutex to protect rq_qos ops, it seems the right thing
to do do also using the mutex to protect blkcg_policy ops, and this
problem can be fixed because mutex can be held to alloc memroy with
GFP_KERNEL. What do you think?

Thanks,
Kuai
> 
> Thanks.
>