linux-kernel - Re: [PATCH] maximize dispatching in block throttle

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20101201144128.GC4672@redhat.com>
Date:	Wed, 1 Dec 2010 09:41:28 -0500
From:	Vivek Goyal <vgoyal@...hat.com>
To:	Hillf Danton <dhillf@...il.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] maximize dispatching in block throttle

On Wed, Dec 01, 2010 at 09:30:00PM +0800, Hillf Danton wrote:
> On Tue, Nov 30, 2010 at 10:57 PM, Vivek Goyal <vgoyal@...hat.com> wrote:
> > On Fri, Nov 26, 2010 at 10:46:01PM +0800, Hillf Danton wrote:
> >> When dispatching bio, the quantum is divided into read/write budgets,
> >> and dispatching for write could not exceed the write budget even if
> >> the read budget is not exhausted, either dispatching for read.
> >>
> >> It is changed to exhaust the quantum, if possible, in this work for
> >> dispatching bio.
> >>
> >> Though it is hard to understand that 50/50 division is not selected,
> >> the difference between divisions could impact little on dispatching as
> >> much as quantum allows then.
> >>
> >> Signed-off-by: Hillf Danton <dhillf@...il.com>
> >> ---
> >
> > Hi Hillf,
> >
> > Even if there are not enough READS/WRITES to consume the quantum, I don't
> > think that it changes anyting much. The next dispatch round will be
> 
> Why not exhaust quantum this round directly if not throttled?
> 

Yes we can do that. I am wondering what do we gain by putting extra code
in?

> > scheduled almost immediately (If there are bios which are ready to
> > be dispatched). Look at throtl_schedule_next_dispatch().
> >
> > Have you noticed some issues/improvements with this patch?
> 
> Originally I wanted to get 75/25 division parameterized through sysfs or proc,
> but I changed mind since exhausting quantum is simpler and much applicable.

But these are two different things isn't it?

We can introduce some sysfs tunables (if need be) so that a user can decide
the division of READS/WRITES.

> 
> As you see, the application environments change from one user to another,
> even though are more latency sensitive.
> 
> It looks nicer, I think, to provide both the default division and the
> methods to
> change for users to play their games.

Yep, tunables can help here and introducing tunables is easy. At the same
time I prefer to intoroduce tunables only on need basis otherwise it
runs the risk of too many tunables which are not being used.

So, conceptually this patch looks fine to me. Jeff Moyer also raised the
same question of why not fill up the quantum if READ/WRITE are not using
their quota. So I think it does not hurt to take this patch in.

Acked-by: Vivek Goyal <vgoyal@...hat.com> 

Thanks
Vivek

> 
> Thanks
> Hillf
> 
> >
> > Generally READS are more latency sensitive as compared to WRITE hence
> > I thought of dispatching more READS per quantum.
> >
> > Thanks
> > Vivek
> >
> >>
> >> --- a/block/blk-throttle.c    2010-11-01 19:54:12.000000000 +0800
> >> +++ b/block/blk-throttle.c    2010-11-26 21:49:00.000000000 +0800
> >> @@ -647,11 +647,16 @@ static int throtl_dispatch_tg(struct thr
> >>       unsigned int max_nr_reads = throtl_grp_quantum*3/4;
> >>       unsigned int max_nr_writes = throtl_grp_quantum - nr_reads;
> >>       struct bio *bio;
> >> +     int read_throttled = 0, write_throttled = 0;
> >>
> >>       /* Try to dispatch 75% READS and 25% WRITES */
> >> -
> >> + try_read:
> >>       while ((bio = bio_list_peek(&tg->bio_lists[READ]))
> >> -             && tg_may_dispatch(td, tg, bio, NULL)) {
> >> +             && ! read_throttled) {
> >> +             if (! tg_may_dispatch(td, tg, bio, NULL)) {
> >> +                     read_throttled = 1;
> >> +                     break;
> >> +             }
> >>
> >>               tg_dispatch_one_bio(td, tg, bio_data_dir(bio), bl);
> >>               nr_reads++;
> >> @@ -659,9 +664,15 @@ static int throtl_dispatch_tg(struct thr
> >>               if (nr_reads >= max_nr_reads)
> >>                       break;
> >>       }
> >> -
> >> +     if (! bio)
> >> +             read_throttled = 1;
> >> + try_write:
> >>       while ((bio = bio_list_peek(&tg->bio_lists[WRITE]))
> >> -             && tg_may_dispatch(td, tg, bio, NULL)) {
> >> +             && ! write_throttled) {
> >> +             if (! tg_may_dispatch(td, tg, bio, NULL)) {
> >> +                     write_throttled = 1;
> >> +                     break;
> >> +             }
> >>
> >>               tg_dispatch_one_bio(td, tg, bio_data_dir(bio), bl);
> >>               nr_writes++;
> >> @@ -669,7 +680,23 @@ static int throtl_dispatch_tg(struct thr
> >>               if (nr_writes >= max_nr_writes)
> >>                       break;
> >>       }
> >> +     if (! bio)
> >> +             write_throttled = 1;
> >> +
> >> +     if (write_throttled && read_throttled)
> >> +             goto out;
> >>
> >> +     if (! (throtl_grp_quantum > nr_writes + nr_reads))
> >> +             goto out;
> >> +
> >> +     if (read_throttled) {
> >> +             max_nr_writes = throtl_grp_quantum - nr_reads;
> >> +             goto try_write;
> >> +     } else {
> >> +             max_nr_reads = throtl_grp_quantum - nr_writes;
> >> +             goto try_read;
> >> +     }
> >> + out:
> >>       return nr_reads + nr_writes;
> >>  }
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> >> the body of a message to majordomo@...r.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> Please read the FAQ at  http://www.tux.org/lkml/
> >
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/