lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20170831061020.4426-2-paolo.valente@linaro.org>
Date:   Thu, 31 Aug 2017 08:10:18 +0200
From:   Paolo Valente <paolo.valente@...aro.org>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        ulf.hansson@...aro.org, broonie@...nel.org,
        mgorman@...hsingularity.net, lee.tibbert@...il.com,
        oleksandr@...alenko.name, Paolo Valente <paolo.valente@...aro.org>
Subject: [PATCH BUGFIX/IMPROVEMENT 1/3] block, bfq: make lookup_next_entity push up vtime on expirations

To provide a very smooth service, bfq starts to serve a bfq_queue
only if the queue is 'eligible', i.e., if the same queue would
have started to be served in the ideal, perfectly fair system that
bfq simulates internally. This is obtained by associating each
queue with a virtual start time, and by computing a special system
virtual time quantity: a queue is eligible only if the system
virtual time has reached the virtual start time of the
queue. Finally, bfq guarantees that, when a new queue must be set
in service, there is always at least one eligible entity for each
active parent entity in the scheduler. To provide this guarantee,
the function __bfq_lookup_next_entity pushes up, for each parent
entity on which it is invoked, the system virtual time to the
minimum among the virtual start times of the entities in the
active tree for the parent entity (more precisely, the push up
occurs if the system virtual time happens to be lower than all
such virtual start times).

There is however a circumstance in which __bfq_lookup_next_entity
cannot push up the system virtual time for a parent entity, even
if the system virtual time is lower than the virtual start times
of all the child entities in the active tree. It happens if one of
the child entities is in service. In fact, in such a case, there
is already an eligible entity, the in-service one, even if it may
not be not present in the active tree (because in-service entities
may be removed from the active tree).

Unfortunately, in the last re-design of the
hierarchical-scheduling engine, the reset of the pointer to the
in-service entity for a given parent entity--reset to be done as a
consequence of the expiration of the in-service entity--always
happens after the function __bfq_lookup_next_entity has been
invoked. This causes the function to think that there is still an
entity in service for the parent entity, and then that the system
virtual time cannot be pushed up, even if actually such a
no-more-in-service entity has already been properly reinserted
into the active tree (or in some other tree if no more
active). Yet, the system virtual time *had* to be pushed up, to be
ready to correctly choose the next queue to serve. Because of the
lack of this push up, bfq may wrongly set in service a queue that
had been speculatively pre-computed as the possible
next-in-service queue, but that would no more be the one to serve
after the expiration and the reinsertion into the active trees of
the previously in-service entities.

This commit addresses this issue by making
__bfq_lookup_next_entity properly push up the system virtual time
if an expiration is occurring.

Signed-off-by: Paolo Valente <paolo.valente@...aro.org>
Tested-by: Lee Tibbert <lee.tibbert@...il.com>
Tested-by: Oleksander Natalenko <oleksandr@...alenko.name>
---
 block/bfq-iosched.c |  4 ++--
 block/bfq-iosched.h |  4 ++--
 block/bfq-wf2q.c    | 58 +++++++++++++++++++++++++++++++++++++++--------------
 3 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 436b6ca..a10f147 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -720,7 +720,7 @@ static void bfq_updated_next_req(struct bfq_data *bfqd,
 		entity->budget = new_budget;
 		bfq_log_bfqq(bfqd, bfqq, "updated next rq: new budget %lu",
 					 new_budget);
-		bfq_requeue_bfqq(bfqd, bfqq);
+		bfq_requeue_bfqq(bfqd, bfqq, false);
 	}
 }
 
@@ -2563,7 +2563,7 @@ static void __bfq_bfqq_expire(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 
 		bfq_del_bfqq_busy(bfqd, bfqq, true);
 	} else {
-		bfq_requeue_bfqq(bfqd, bfqq);
+		bfq_requeue_bfqq(bfqd, bfqq, true);
 		/*
 		 * Resort priority tree of potential close cooperators.
 		 */
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 859f0a8..3e2659c 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -817,7 +817,6 @@ extern const int bfq_timeout;
 struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync);
 void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync);
 struct bfq_data *bic_to_bfqd(struct bfq_io_cq *bic);
-void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq);
 void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq);
 void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_entity *entity,
 			  struct rb_root *root);
@@ -917,7 +916,8 @@ void __bfq_bfqd_reset_in_service(struct bfq_data *bfqd);
 void bfq_deactivate_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 			 bool ins_into_idle_tree, bool expiration);
 void bfq_activate_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq);
-void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq);
+void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+		      bool expiration);
 void bfq_del_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 		       bool expiration);
 void bfq_add_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq);
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index 3183b39..138732e 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -44,7 +44,8 @@ static unsigned int bfq_class_idx(struct bfq_entity *entity)
 		BFQ_DEFAULT_GRP_CLASS - 1;
 }
 
-static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd);
+static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd,
+						 bool expiration);
 
 static bool bfq_update_parent_budget(struct bfq_entity *next_in_service);
 
@@ -54,6 +55,8 @@ static bool bfq_update_parent_budget(struct bfq_entity *next_in_service);
  * @new_entity: if not NULL, pointer to the entity whose activation,
  *		requeueing or repositionig triggered the invocation of
  *		this function.
+ * @expiration: id true, this function is being invoked after the
+ *             expiration of the in-service entity
  *
  * This function is called to update sd->next_in_service, which, in
  * its turn, may change as a consequence of the insertion or
@@ -72,7 +75,8 @@ static bool bfq_update_parent_budget(struct bfq_entity *next_in_service);
  * entity.
  */
 static bool bfq_update_next_in_service(struct bfq_sched_data *sd,
-				       struct bfq_entity *new_entity)
+				       struct bfq_entity *new_entity,
+				       bool expiration)
 {
 	struct bfq_entity *next_in_service = sd->next_in_service;
 	bool parent_sched_may_change = false;
@@ -130,7 +134,7 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd,
 		if (replace_next)
 			next_in_service = new_entity;
 	} else /* invoked because of a deactivation: lookup needed */
-		next_in_service = bfq_lookup_next_entity(sd);
+		next_in_service = bfq_lookup_next_entity(sd, expiration);
 
 	if (next_in_service) {
 		parent_sched_may_change = !sd->next_in_service ||
@@ -1134,10 +1138,12 @@ static void __bfq_activate_requeue_entity(struct bfq_entity *entity,
  * @requeue: true if this is a requeue, which implies that bfqq is
  *	     being expired; thus ALL its ancestors stop being served and must
  *	     therefore be requeued
+ * @expiration: true if this function is being invoked in the expiration path
+ *             of the in-service queue
  */
 static void bfq_activate_requeue_entity(struct bfq_entity *entity,
 					bool non_blocking_wait_rq,
-					bool requeue)
+					bool requeue, bool expiration)
 {
 	struct bfq_sched_data *sd;
 
@@ -1145,7 +1151,8 @@ static void bfq_activate_requeue_entity(struct bfq_entity *entity,
 		sd = entity->sched_data;
 		__bfq_activate_requeue_entity(entity, sd, non_blocking_wait_rq);
 
-		if (!bfq_update_next_in_service(sd, entity) && !requeue)
+		if (!bfq_update_next_in_service(sd, entity, expiration) &&
+		    !requeue)
 			break;
 	}
 }
@@ -1201,6 +1208,8 @@ bool __bfq_deactivate_entity(struct bfq_entity *entity, bool ins_into_idle_tree)
  * bfq_deactivate_entity - deactivate an entity representing a bfq_queue.
  * @entity: the entity to deactivate.
  * @ins_into_idle_tree: true if the entity can be put into the idle tree
+ * @expiration: true if this function is being invoked in the expiration path
+ *             of the in-service queue
  */
 static void bfq_deactivate_entity(struct bfq_entity *entity,
 				  bool ins_into_idle_tree,
@@ -1229,7 +1238,7 @@ static void bfq_deactivate_entity(struct bfq_entity *entity,
 			 * then, since entity has just been
 			 * deactivated, a new one must be found.
 			 */
-			bfq_update_next_in_service(sd, NULL);
+			bfq_update_next_in_service(sd, NULL, expiration);
 
 		if (sd->next_in_service || sd->in_service_entity) {
 			/*
@@ -1288,7 +1297,7 @@ static void bfq_deactivate_entity(struct bfq_entity *entity,
 		__bfq_requeue_entity(entity);
 
 		sd = entity->sched_data;
-		if (!bfq_update_next_in_service(sd, entity) &&
+		if (!bfq_update_next_in_service(sd, entity, expiration) &&
 		    !expiration)
 			/*
 			 * next_in_service unchanged or not causing
@@ -1423,12 +1432,14 @@ __bfq_lookup_next_entity(struct bfq_service_tree *st, bool in_service)
 /**
  * bfq_lookup_next_entity - return the first eligible entity in @sd.
  * @sd: the sched_data.
+ * @expiration: true if we are on the expiration path of the in-service queue
  *
  * This function is invoked when there has been a change in the trees
- * for sd, and we need know what is the new next entity after this
- * change.
+ * for sd, and we need to know what is the new next entity to serve
+ * after this change.
  */
-static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd)
+static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd,
+						 bool expiration)
 {
 	struct bfq_service_tree *st = sd->service_tree;
 	struct bfq_service_tree *idle_class_st = st + (BFQ_IOPRIO_CLASSES - 1);
@@ -1455,8 +1466,24 @@ static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd)
 	 * class, unless the idle class needs to be served.
 	 */
 	for (; class_idx < BFQ_IOPRIO_CLASSES; class_idx++) {
+		/*
+		 * If expiration is true, then bfq_lookup_next_entity
+		 * is being invoked as a part of the expiration path
+		 * of the in-service queue. In this case, even if
+		 * sd->in_service_entity is not NULL,
+		 * sd->in_service_entiy at this point is actually not
+		 * in service any more, and, if needed, has already
+		 * been properly queued or requeued into the right
+		 * tree. The reason why sd->in_service_entity is still
+		 * not NULL here, even if expiration is true, is that
+		 * sd->in_service_entiy is reset as a last step in the
+		 * expiration path. So, if expiration is true, tell
+		 * __bfq_lookup_next_entity that there is no
+		 * sd->in_service_entity.
+		 */
 		entity = __bfq_lookup_next_entity(st + class_idx,
-						  sd->in_service_entity);
+						  sd->in_service_entity &&
+						  !expiration);
 
 		if (entity)
 			break;
@@ -1569,7 +1596,7 @@ struct bfq_queue *bfq_get_next_queue(struct bfq_data *bfqd)
 	for_each_entity(entity) {
 		struct bfq_sched_data *sd = entity->sched_data;
 
-		if (!bfq_update_next_in_service(sd, NULL))
+		if (!bfq_update_next_in_service(sd, NULL, false))
 			break;
 	}
 
@@ -1617,16 +1644,17 @@ void bfq_activate_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 	struct bfq_entity *entity = &bfqq->entity;
 
 	bfq_activate_requeue_entity(entity, bfq_bfqq_non_blocking_wait_rq(bfqq),
-				    false);
+				    false, false);
 	bfq_clear_bfqq_non_blocking_wait_rq(bfqq);
 }
 
-void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq)
+void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+		      bool expiration)
 {
 	struct bfq_entity *entity = &bfqq->entity;
 
 	bfq_activate_requeue_entity(entity, false,
-				    bfqq == bfqd->in_service_queue);
+				    bfqq == bfqd->in_service_queue, expiration);
 }
 
 /*
-- 
2.10.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ