[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <93193221-fbad-444f-c325-9f19d4c5931b@huawei.com>
Date: Tue, 12 Oct 2021 09:09:03 +0800
From: "yukuai (C)" <yukuai3@...wei.com>
To: Michal Koutný <mkoutny@...e.com>
CC: <tj@...nel.org>, <axboe@...nel.dk>, <cgroups@...r.kernel.org>,
<linux-block@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<yi.zhang@...wei.com>
Subject: Re: [PATCH] blk-cgroup: check blkcg policy is enabled in
blkg_create()
On 2021/10/11 23:23, Michal Koutný wrote:
> Hello.
>
> On Fri, Oct 08, 2021 at 03:27:20PM +0800, Yu Kuai <yukuai3@...wei.com> wrote:
>> This is because blkg_alloc() is called from blkg_conf_prep() without
>> holding 'q->queue_lock', and elevator is exited before blkg_create():
>
> IIUC the problematic interleaving is this one (I've noticed `blkg->pd[i]
> = NULL` to thread 2 call trace):
The new blkg will not add to blkg_list untill pd_init_fn() is done in
blkg_create(), thus blkcg_deactivate_policy() can't access this blkg.
>
>> thread 1 thread 2
>> blkg_conf_prep
>> spin_lock_irq(&q->queue_lock);
>> blkg_lookup_check -> return NULL
>> spin_unlock_irq(&q->queue_lock);
>>
>> blkg_alloc
>> blkcg_policy_enabled -> true
>> pd = ->pd_alloc_fn
>> blk_mq_exit_sched
>> bfq_exit_queue
>> blkcg_deactivate_policy
>> spin_lock_irq(&q->queue_lock);
>> __clear_bit(pol->plid, q->blkcg_pols);
>>
> pol->pd_free_fn(blkg->pd[i]);
> blkg->pd[i] = NULL;
>>
>> spin_unlock_irq(&q->queue_lock);
>> q->elevator = NULL;
> blkg->pd[i] = pd
>> spin_lock_irq(&q->queue_lock);
>> blkg_create
>> if (blkg->pd[i])
>> ->pd_init_fn -> q->elevator is NULL
>> spin_unlock_irq(&q->queue_lock);
>
> In high-level terms, is this a race between (blk)io controller attribute
> write and a device scheduler (elevator) switch?
> If so, I'd add it to the commit message.
>
>> Fix the problem by checking that policy is still enabled in
>> blkg_create().
>
> Is this sufficient wrt some other q->elevator users later?
>
>> @@ -252,6 +266,9 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
>> goto err_free_blkg;
>> }
>>
>
> I'd add a comment here like:
>
>> Re-check policies are still enabled, since the caller blkg_conf_prep()
>> temporarily drops q->queue_lock and we can race with
>> blk_mq_exit_sched() removing policies.
Thanks for your advice.
Best regards,
Kuai
>
>> + if (new_blkg)
>> + blkg_check_pd(q, new_blkg);
>> +
>
> Thanks,
> Michal
> .
>
Powered by blists - more mailing lists