lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 27 Jul 2009 18:41:38 -0400
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Jerome Marchand <jmarchan@...hat.com>
Cc:	linux-kernel@...r.kernel.org,
	containers@...ts.linux-foundation.org, dm-devel@...hat.com,
	jens.axboe@...cle.com, nauman@...gle.com, dpshah@...gle.com,
	ryov@...inux.co.jp, guijianfeng@...fujitsu.com,
	balbir@...ux.vnet.ibm.com, righi.andrea@...il.com,
	lizf@...fujitsu.com, mikew@...gle.com, fchecconi@...il.com,
	paolo.valente@...more.it, fernando@....ntt.co.jp,
	s-uchida@...jp.nec.com, taka@...inux.co.jp, jmoyer@...hat.com,
	dhaval@...ux.vnet.ibm.com, m-ikeda@...jp.nec.com, agk@...hat.com,
	akpm@...ux-foundation.org, peterz@...radead.org
Subject: Re: [PATCH 03/24] io-controller: bfq support of in-class preemption

On Mon, Jul 27, 2009 at 06:54:54PM +0200, Jerome Marchand wrote:
> Vivek Goyal wrote:
> > o Generally preemption is associated with cross class where if an request
> >   from RT class is pending it will preempt the ongoing BE or IDLE class
> >   request.
> > 
> > o CFQ also does in-class preemtions like a sync request queue preempting the
> >   async request queue. In that case it looks like preempting queue gains
> >   share and it is not fair.
> > 
> > o Implement the similar functionality in bfq so that we can retain the
> >   existing CFQ behavior.
> > 
> > o This patch creates a bypass path so that a queue can be put at the
> >   front of the service tree (add_front, similar to CFQ), so that it will
> >   be selected next to run. That's a different thing that in the process
> >   this queue gains share.
> > 
> > Signed-off-by: Vivek Goyal <vgoyal@...hat.com>
> > ---
> >  block/elevator-fq.c |   46 +++++++++++++++++++++++++++++++++++++++++-----
> >  1 files changed, 41 insertions(+), 5 deletions(-)
> > 
> > diff --git a/block/elevator-fq.c b/block/elevator-fq.c
> > index e5f39cf..f1ab0dc 100644
> > --- a/block/elevator-fq.c
> > +++ b/block/elevator-fq.c
> > @@ -267,7 +267,8 @@ static void bfq_get_entity(struct io_entity *entity)
> >  		elv_get_ioq(ioq);
> >  }
> >  
> > -static void bfq_init_entity(struct io_entity *entity, struct io_group *iog)
> > +static inline void
> > +bfq_init_entity(struct io_entity *entity, struct io_group *iog)
> >  {
> >  	entity->sched_data = &iog->sched_data;
> >  }
> > @@ -580,7 +581,7 @@ static struct io_entity *bfq_lookup_next_entity(struct io_sched_data *sd,
> >   * service received if @entity is active) of the queue to calculate its
> >   * timestamps.
> >   */
> > -static void __bfq_activate_entity(struct io_entity *entity)
> > +static void __bfq_activate_entity(struct io_entity *entity, int add_front)
> >  {
> >  	struct io_sched_data *sd = entity->sched_data;
> >  	struct io_service_tree *st = io_entity_service_tree(entity);
> > @@ -625,7 +626,42 @@ static void __bfq_activate_entity(struct io_entity *entity)
> >  	}
> >  
> >  	st = __bfq_entity_update_prio(st, entity);
> > -	bfq_calc_finish(entity, entity->budget);
> > +	/*
> > +	 * This is to emulate cfq like functionality where preemption can
> > +	 * happen with-in same class, like sync queue preempting async queue
> > +	 * May be this is not a very good idea from fairness point of view
> > +	 * as preempting queue gains share. Keeping it for now.
> > +	 */
> > +	if (add_front) {
> > +		struct io_entity *next_entity;
> > +
> > +		/*
> > +		 * Determine the entity which will be dispatched next
> > +		 * Use sd->next_active once hierarchical patch is applied
> > +		 */
> > +		next_entity = bfq_lookup_next_entity(sd, 0);
> > +
> > +		if (next_entity && next_entity != entity) {
> > +			struct io_service_tree *new_st;
> > +			u64 delta;
> > +
> > +			new_st = io_entity_service_tree(next_entity);
> > +
> > +			/*
> > +			 * At this point, both entities should belong to
> > +			 * same service tree as cross service tree preemption
> > +			 * is automatically taken care by algorithm
> > +			 */
> > +			BUG_ON(new_st != st);
> 
> Hi Vivek,
> 
> I don't quite understand how cross service tree preemption is taken care
> by algorithm, but I've hit this bug while doing some RT I/O and then
> killing it.
> 
> $ ionice -c 1 dd if=/dev/zero of=/tmp/foo bs=1M count=1000 &
> $ killall dd
> 

Hi Jerome,

Thanks for testing it out. I could also reproduce the issue.

I had assumed that RT queue will always preempt non-RT queue and hence if
there is an RT ioq/request pending, the sd->next_entity will point to
itself and any queue which is preempting it has to be on same service
tree.

But in your test case it looks like that RT async queue is pending and 
there is some sync BE class IO going on. It looks like that CFQ allows
sync queue preempting async queue irrespective of class, so in this case
sync BE class reader will preempt async RT queue and that's where my
assumption is broken and we see BUG_ON() hitting.

Can you please tryout following patch. It is a quick patch and requires
more testing. It solves the crash but still does not solve the issue of
sync queue always preempting async queues irrespective of class. In
current scheduler we always schedule the RT queue first (whether it be
sync or async). This problem requires little more thought.

In this patch, we just check next entity on same class service tree and
if one is present, stick new queue in front of it. We don't rely on
sd->next_active, which could be pointing to a queue of different class
in same group (scheduling_data).
 
---
 block/elevator-fq.c |   16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

Index: linux8/block/elevator-fq.c
===================================================================
--- linux8.orig/block/elevator-fq.c	2009-07-27 18:13:34.000000000 -0400
+++ linux8/block/elevator-fq.c	2009-07-27 18:18:49.000000000 -0400
@@ -946,21 +946,15 @@ static void __bfq_activate_entity(struct
 	if (add_front) {
 		struct io_entity *next_entity;
 
-		/* Determine the entity which will be dispatched next */
-		next_entity = sd->next_active;
+		/*
+		 * Determine the entity which will be dispatched next on
+		 * same service tree.
+		 */
+		next_entity = __bfq_lookup_next_entity(st);
 
 		if (next_entity && next_entity != entity) {
-			struct io_service_tree *new_st;
 			u64 delta;
 
-			new_st = io_entity_service_tree(next_entity);
-
-			/*
-			 * At this point, both entities should belong to
-			 * same service tree as cross service tree preemption
-			 * is automatically taken care by algorithm
-			 */
-			BUG_ON(new_st != st);
 			entity->finish = next_entity->finish - 1;
 			delta = bfq_delta(entity->budget, entity->weight);
 			entity->start = entity->finish - delta;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ