Message-ID: <20140605023334.GB22826@kernel.org>
Date: Thu, 5 Jun 2014 10:33:34 +0800
From: Shaohua Li <shli@...nel.org>
To: Jens Axboe <axboe@...nel.dk>
Cc: Matias Bjørling <m@...rling.me>,
"Sam Bradshaw (sbradshaw)" <sbradshaw@...ron.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] block: per-cpu counters for in-flight IO accounting
On Wed, Jun 04, 2014 at 08:16:32PM -0600, Jens Axboe wrote:
> On 2014-06-04 20:09, Shaohua Li wrote:
> >On Wed, Jun 04, 2014 at 02:08:46PM -0600, Jens Axboe wrote:
> >>On 06/04/2014 05:29 AM, Matias Bjørling wrote:
> >>>It's in
> >>>
> >>>blk_io_account_start
> >>> part_round_stats
> >>> part_round_stats_single
> >>> part_in_flight
> >>>
> >>>I like the granularity idea.
> >>
> >>And similarly from blk_io_account_done() - which makes it even worse,
> >>since it is at both ends of the IO chain.
> >
> >But part_round_stats_single is supposed to only call part_in_flight every
> >jiffy. Maybe we need something below:
> >1. set part->stamp immediately
> >2. fixed granularity
> >Untested though.
> >
> >
> >diff --git a/block/blk-core.c b/block/blk-core.c
> >index 40d6548..5f0acaa 100644
> >--- a/block/blk-core.c
> >+++ b/block/blk-core.c
> >@@ -1270,17 +1270,19 @@ static void part_round_stats_single(int cpu, struct hd_struct *part,
> > 				    unsigned long now)
> > {
> > 	int inflight;
> >+	unsigned long old_stamp;
> >
> >-	if (now == part->stamp)
> >+	if (time_before(now, part->stamp + msecs_to_jiffies(10)))
> > 		return;
> >+	old_stamp = part->stamp;
> >+	part->stamp = now;
> >
> > 	inflight = part_in_flight(part);
> > 	if (inflight) {
> > 		__part_stat_add(cpu, part, time_in_queue,
> >-				inflight * (now - part->stamp));
> >-		__part_stat_add(cpu, part, io_ticks, (now - part->stamp));
> >+				inflight * (now - old_stamp));
> >+		__part_stat_add(cpu, part, io_ticks, (now - old_stamp));
> > 	}
> >-	part->stamp = now;
> > }
> >
> > /**
>
> It'd be a good improvement, and one we should be able to do without
> screwing anything up. It'd be identical to anyone running at HZ==100
> right now.
>
> So the above we can easily do, and arguably should just do. We won't
> see real scaling in the IO stats path before we fixup the hd_struct
> referencing as well, however.
That's true. Maybe a percpu_ref works here.
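
For illustration, the fixed-granularity rounding from the quoted patch can be
modelled in userspace roughly as below. This is only a sketch under stated
assumptions, not kernel code: struct part_model, MODEL_HZ, GRANULARITY_MS and
the model_* helpers are made-up names, the in_flight field stands in for
whatever part_in_flight() would return, and the ms-to-ticks conversion is a
simplified round-up rather than the kernel's msecs_to_jiffies().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MODEL_HZ	1000UL	/* assumed tick rate */
#define GRANULARITY_MS	10UL	/* same 10 ms window as in the patch */

struct part_model {
	unsigned long stamp;		/* tick of the last accounting update */
	unsigned long io_ticks;		/* ticks during which IO was in flight */
	unsigned long time_in_queue;	/* in-flight count weighted by ticks */
	unsigned long in_flight;	/* stands in for part_in_flight() */
	unsigned long expensive_reads;	/* how often in_flight was consulted */
};

/* Simplified ms-to-ticks conversion, rounding up. */
static unsigned long model_msecs_to_ticks(unsigned long ms, unsigned long hz)
{
	return (ms * hz + 999UL) / 1000UL;
}

/* Wraparound-tolerant "a is before b", the same idea as time_before(). */
static bool model_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

static void model_round_stats(struct part_model *p, unsigned long now)
{
	unsigned long old_stamp;

	assert(p);

	/* Skip the update until a full granularity window has passed. */
	if (model_time_before(now, p->stamp +
			      model_msecs_to_ticks(GRANULARITY_MS, MODEL_HZ)))
		return;

	/* Publish the new stamp first, then account against the old one. */
	old_stamp = p->stamp;
	p->stamp = now;

	p->expensive_reads++;
	if (p->in_flight) {
		p->time_in_queue += p->in_flight * (now - old_stamp);
		p->io_ticks += now - old_stamp;
	}
}
```

With MODEL_HZ at 1000, calling model_round_stats() once per tick only reaches
the in-flight read every tenth call. With a tick rate of 100 the window is a
single tick, so the check behaves like the old "now == part->stamp" test.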
Thanks,
Shaohua