Message-ID: <1537463217.224533.4.camel@acm.org>
Date:   Thu, 20 Sep 2018 10:06:57 -0700
From:   Bart Van Assche <bvanassche@....org>
To:     Jianchao Wang <jianchao.w.wang@...cle.com>, axboe@...nel.dk,
        tj@...nel.org, kent.overstreet@...il.com, ming.lei@...hat.com,
        bart.vanassche@....com
Cc:     linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 0/3] blk-mq: rework queue freeze and preempt-only

On Thu, 2018-09-20 at 18:18 +0800, Jianchao Wang wrote:
> The current queue freeze, which depends on percpu_ref_kill/reinit, has a limitation:
> we have to drain the requests before unfreezing the queue.
> 
> Let's rework the queue freeze feature as follows:
> 1. introduce __percpu_ref_get_many.
>    It is the same as the original percpu_ref_get_many, but requires callers to provide
>    the sched RCU critical section. We will put __percpu_ref_get_many and our own
>    condition check under rcu_read_lock_sched. With this new helper interface,
>    we can save an extra rcu_read_lock_sched.
> 2. rework blk_queue_enter as:
>    
>    rcu_read_lock_sched()
>    if condition check true
>      __percpu_ref_get_many(&q->q_usage_counter, 1)
>    else
>      goto wait
>    rcu_read_unlock_sched()
> 3. use percpu_ref_switch_to_atomic/percpu to switch mode directly.
> 
> Then we could unfreeze the queue w/o draining requests.
> In addition, the preempt-only mode code could be simplified.
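(For illustration only, a rough C sketch of the enter path described in point 2 above;
__percpu_ref_get_many() is the helper proposed in this series, and queue_is_usable() is
just a placeholder for the real condition check:)

	bool success = false;

	rcu_read_lock_sched();
	if (queue_is_usable(q)) {
		/* Take a reference without an extra RCU read-side section. */
		__percpu_ref_get_many(&q->q_usage_counter, 1);
		success = true;
	}
	rcu_read_unlock_sched();

	if (!success)
		goto wait;	/* sleep until the queue becomes usable again */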

Hello Jianchao,

Thanks for having taken a look. However, the approach of this patch series
may be more complicated than necessary. Had you considered something like
the patch below?

Thanks,

Bart.


diff --git a/block/blk-mq.c b/block/blk-mq.c
index 20fdd78b75c7..d384ab700afd 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -199,7 +199,7 @@ void blk_mq_unfreeze_queue(struct request_queue *q)
 	freeze_depth = atomic_dec_return(&q->mq_freeze_depth);
 	WARN_ON_ONCE(freeze_depth < 0);
 	if (!freeze_depth) {
-		percpu_ref_reinit(&q->q_usage_counter);
+		percpu_ref_resurrect(&q->q_usage_counter);
 		wake_up_all(&q->mq_freeze_wq);
 	}
 }
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 009cdf3d65b6..b297cd1cd4f1 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -108,6 +108,7 @@ void percpu_ref_switch_to_atomic_sync(struct percpu_ref *ref);
 void percpu_ref_switch_to_percpu(struct percpu_ref *ref);
 void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
 				 percpu_ref_func_t *confirm_kill);
+void percpu_ref_resurrect(struct percpu_ref *ref);
 void percpu_ref_reinit(struct percpu_ref *ref);
 
 /**
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index 9f96fa7bc000..054e7650cd35 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -356,11 +356,31 @@ EXPORT_SYMBOL_GPL(percpu_ref_kill_and_confirm);
  */
 void percpu_ref_reinit(struct percpu_ref *ref)
 {
+	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
+
+	percpu_ref_resurrect(ref);
+}
+EXPORT_SYMBOL_GPL(percpu_ref_reinit);
+
+/**
+ * percpu_ref_resurrect - modify a percpu refcount from dead to live
+ * @ref: percpu_ref to resurrect
+ *
+ * Modify @ref so that it's in the same state as before percpu_ref_kill() was
+ * called. @ref must be dead but not exited.
+ *
+ * Note that percpu_ref_tryget[_live]() are safe to perform on @ref while
+ * this function is in progress.
+ */
+void percpu_ref_resurrect(struct percpu_ref *ref)
+{
+	unsigned long __percpu *percpu_count;
 	unsigned long flags;
 
 	spin_lock_irqsave(&percpu_ref_switch_lock, flags);
 
-	WARN_ON_ONCE(!percpu_ref_is_zero(ref));
+	WARN_ON_ONCE(!(ref->percpu_count_ptr & __PERCPU_REF_DEAD));
+	WARN_ON_ONCE(__ref_is_percpu(ref, &percpu_count));
 
 	ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
 	percpu_ref_get(ref);
@@ -368,4 +388,4 @@ void percpu_ref_reinit(struct percpu_ref *ref)
 
 	spin_unlock_irqrestore(&percpu_ref_switch_lock, flags);
 }
-EXPORT_SYMBOL_GPL(percpu_ref_reinit);
+EXPORT_SYMBOL_GPL(percpu_ref_resurrect);
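For illustration only (not part of the patch above), the difference in what the two
interfaces allow is roughly the following; blk_mq_freeze_queue_wait() stands in here
for whatever drains the queue:

	/* percpu_ref_reinit(): the refcount must have dropped to zero first. */
	percpu_ref_kill(&q->q_usage_counter);
	blk_mq_freeze_queue_wait(q);		/* wait for the drain */
	percpu_ref_reinit(&q->q_usage_counter);

	/* percpu_ref_resurrect(): legal while references are still outstanding. */
	percpu_ref_kill(&q->q_usage_counter);
	percpu_ref_resurrect(&q->q_usage_counter);	/* no drain required */

That is what would let blk_mq_unfreeze_queue() run without draining requests first.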
