Message-ID: <5b1e7489-df67-cbda-28f2-9d5442e48ce5@huaweicloud.com>
Date: Sat, 30 Jul 2022 10:15:06 +0800
From: "zhangwensheng (E)" <zhangwensheng@...weicloud.com>
To: Ming Lei <ming.lei@...hat.com>
Cc: axboe@...nel.dk, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
yukuai3@...wei.com
Subject: Re: [PATCH -next] [RFC] block: fix null-deref in percpu_ref_put
Hi, Ming
I don't think this is a generic issue in percpu_ref. I went through some
users of percpu_ref, such as "part->ref", "blkg->refcnt" and
"ctx->reqs/ctx->users"; they all call percpu_ref_exit only after "release"
has finished, which does not cause this problem.
So I think the API (percpu_ref_put_many) should not be changed, and the
user should guarantee this ordering.
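
To illustrate the ordering I mean, here is a minimal userspace sketch. It is
not the kernel's percpu_ref: every model_* name and the release_done flag are
hypothetical stand-ins, and the demo is single-threaded. It only shows that
the teardown side frees ref->data after the release callback has run, never
before.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; NOT the kernel's percpu_ref. */
struct model_ref;
typedef void (*model_release_t)(struct model_ref *ref);

struct model_ref_data {
	atomic_long count;
	model_release_t release;
};

struct model_ref {
	struct model_ref_data *data;	/* freed and set to NULL by model_ref_exit() */
	atomic_bool release_done;	/* set by the release callback */
};

static int model_ref_init(struct model_ref *ref, model_release_t release)
{
	ref->data = malloc(sizeof(*ref->data));
	if (!ref->data)
		return -1;
	atomic_init(&ref->data->count, 1);	/* initial reference held by the owner */
	ref->data->release = release;
	atomic_init(&ref->release_done, false);
	return 0;
}

static void model_ref_get(struct model_ref *ref)
{
	atomic_fetch_add(&ref->data->count, 1);
}

static void model_ref_put(struct model_ref *ref)
{
	/* atomic_fetch_sub() returns the old value; 1 means this was the last reference. */
	if (atomic_fetch_sub(&ref->data->count, 1) == 1)
		ref->data->release(ref);
}

/* Release callback: only records that release has finished. */
static void model_release(struct model_ref *ref)
{
	atomic_store(&ref->release_done, true);
}

/* Refuses to free anything until the release callback has run. */
static bool model_ref_exit(struct model_ref *ref)
{
	if (!atomic_load(&ref->release_done))
		return false;
	free(ref->data);
	ref->data = NULL;
	return true;
}

/* Returns 0 when the "release first, exit second" ordering is respected. */
static int run_demo(void)
{
	struct model_ref ref;

	if (model_ref_init(&ref, model_release))
		return -1;
	model_ref_get(&ref);		/* e.g. an in-flight request */
	model_ref_put(&ref);		/* owner drops its initial reference */
	if (model_ref_exit(&ref))	/* too early: one reference is still held */
		return -2;
	model_ref_put(&ref);		/* last put runs the release callback */
	if (!model_ref_exit(&ref))	/* now it is safe to free the data */
		return -3;
	assert(ref.data == NULL);
	return 0;
}
```

In a real multi-threaded user, the teardown path would sleep until the release
callback signals it rather than polling a flag as this model does.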
thanks!
Wensheng
On 2022/7/29 21:58, Ming Lei wrote:
> On Fri, Jul 29, 2022 at 06:50:36PM +0800, Zhang Wensheng wrote:
>> From: Zhang Wensheng <zhangwensheng5@...wei.com>
>>
>> A problem was found in stable 5.10; the root cause is described below.
>>
>> For the q_usage_counter of request_queue, blk_cleanup_queue uses
>> "wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter))"
>> to wait for q_usage_counter to become zero. However, if q_usage_counter
>> becomes zero quickly, percpu_ref_exit will execute and ref->data
>> will be freed, and another process may then hit a null-deref
>> like below:
>>
>> CPU0                                CPU1
>> blk_cleanup_queue
>>  blk_freeze_queue
>>   blk_mq_freeze_queue_wait
>>                                     scsi_end_request
>>                                      percpu_ref_get
>>                                      ...
>>                                      percpu_ref_put
>>                                       atomic_long_sub_and_test
>>  percpu_ref_exit
>>   ref->data -> NULL
>>                                       ref->data->release(ref) -> null-deref
>>
> Looks like this is a generic issue in percpu_ref; I think the following patch
> should address it.
>
>
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index d73a1c08c3e3..07308bd36d83 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -331,8 +331,12 @@ static inline void percpu_ref_put_many(struct percpu_ref *ref, unsigned long nr)
>
>  	if (__ref_is_percpu(ref, &percpu_count))
>  		this_cpu_sub(*percpu_count, nr);
> -	else if (unlikely(atomic_long_sub_and_test(nr, &ref->data->count)))
> -		ref->data->release(ref);
> +	else {
> +		percpu_ref_func_t *release = ref->data->release;
> +
> +		if (unlikely(atomic_long_sub_and_test(nr, &ref->data->count)))
> +			release(ref);
> +	}
>
>  	rcu_read_unlock();
>  }
>
>
> Thanks,
> Ming