Message-ID: <YuPnjI8oHx4dO3nr@T590>
Date: Fri, 29 Jul 2022 21:58:36 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Zhang Wensheng <zhangwensheng@...weicloud.com>
Cc: axboe@...nel.dk, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org,
yukuai3@...wei.com
Subject: Re: [PATCH -next] [RFC] block: fix null-deref in percpu_ref_put
On Fri, Jul 29, 2022 at 06:50:36PM +0800, Zhang Wensheng wrote:
> From: Zhang Wensheng <zhangwensheng5@...wei.com>
>
> A problem was found in stable 5.10; its root cause is described below.
>
> blk_cleanup_queue uses
> "wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter))"
> to wait for the request_queue's q_usage_counter to become zero. However,
> if q_usage_counter drops to zero quickly, percpu_ref_exit runs and
> ref->data is freed, and another process may then hit a null-deref,
> as shown below:
>
> CPU0                                CPU1
> blk_cleanup_queue
>  blk_freeze_queue
>   blk_mq_freeze_queue_wait
>                                     scsi_end_request
>                                      percpu_ref_get
>                                      ...
>                                      percpu_ref_put
>                                       atomic_long_sub_and_test
>  percpu_ref_exit
>   ref->data -> NULL
>                                       ref->data->release(ref) -> null-deref
>
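This looks like a generic issue in percpu_ref rather than something
specific to the block layer. I think the following patch should address
it: it reads ref->data->release before the final decrement, so the put
path no longer touches ref->data once the count has reached zero.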
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index d73a1c08c3e3..07308bd36d83 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -331,8 +331,12 @@ static inline void percpu_ref_put_many(struct percpu_ref *ref, unsigned long nr)
 
 	if (__ref_is_percpu(ref, &percpu_count))
 		this_cpu_sub(*percpu_count, nr);
-	else if (unlikely(atomic_long_sub_and_test(nr, &ref->data->count)))
-		ref->data->release(ref);
+	else {
+		percpu_ref_func_t *release = ref->data->release;
+
+		if (unlikely(atomic_long_sub_and_test(nr, &ref->data->count)))
+			release(ref);
+	}
 
 	rcu_read_unlock();
 }
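
For illustration only, the ordering the patch relies on can be modelled in
userspace C11. This is a rough sketch, not kernel code: struct ref,
struct ref_data, window_hook, emulate_exit and model_demo are all
hypothetical names, only the atomic-mode slow path is modelled, and the
"other CPU" is emulated by a hook that runs right after the decrement.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct ref;
typedef void (ref_func_t)(struct ref *ref);

/* Hypothetical stand-in for struct percpu_ref_data. */
struct ref_data {
	atomic_long count;
	ref_func_t *release;
};

/* Hypothetical stand-in for struct percpu_ref (atomic mode only). */
struct ref {
	struct ref_data *data;
	bool released;
};

/* Demo-only hook: emulates what another CPU may do after the decrement. */
static void (*window_hook)(struct ref *ref);

/* Emulates the exit path clearing ref->data once the count is zero. */
static void emulate_exit(struct ref *ref)
{
	if (ref->data && atomic_load(&ref->data->count) == 0)
		ref->data = NULL;
}

static void mark_released(struct ref *ref)
{
	ref->released = true;
}

/* Models the patched put: everything is read from data before the sub. */
static void ref_put_many(struct ref *ref, long nr)
{
	struct ref_data *data = ref->data;
	ref_func_t *release = data->release;
	/* atomic_fetch_sub returns the old value; old == nr means zero now. */
	long old = atomic_fetch_sub(&data->count, nr);

	if (window_hook)
		window_hook(ref);	/* ref->data may be NULL after this */

	if (old == nr)
		release(ref);		/* uses the cached pointer only */
}

/* Drops two references with the exit path racing in the window. */
static bool model_demo(void)
{
	struct ref_data d = { .release = mark_released };
	struct ref r = { .data = &d, .released = false };

	atomic_init(&d.count, 2);
	window_hook = emulate_exit;

	ref_put_many(&r, 1);	/* count 2 -> 1, no release */
	if (r.released || r.data == NULL)
		return false;

	ref_put_many(&r, 1);	/* count 1 -> 0, data cleared, release runs */
	return r.released && r.data == NULL && atomic_load(&d.count) == 0;
}
```

In this model, calling ref->data->release(ref) after the hook would
dereference NULL on the final put, which is the window shown in the
CPU0/CPU1 trace above. The model says nothing about whether the release
callback itself touches ref->data.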
Thanks,
Ming