Message-ID: <abaa2003-4ddf-5ef9-d62c-1708a214609d@kernel.dk>
Date: Wed, 7 Dec 2022 16:08:33 -0700
From: Jens Axboe <axboe@...nel.dk>
To: Gulam Mohamed <gulam.mohamed@...cle.com>,
linux-block@...r.kernel.org
Cc: philipp.reisner@...bit.com, lars.ellenberg@...bit.com,
christoph.boehmwalder@...bit.com, minchan@...nel.org,
ngupta@...are.org, senozhatsky@...omium.org, colyli@...e.de,
kent.overstreet@...il.com, agk@...hat.com, snitzer@...nel.org,
dm-devel@...hat.com, song@...nel.org, dan.j.williams@...el.com,
vishal.l.verma@...el.com, dave.jiang@...el.com,
ira.weiny@...el.com, junxiao.bi@...cle.com,
martin.petersen@...cle.com, kch@...dia.com,
drbd-dev@...ts.linbit.com, linux-kernel@...r.kernel.org,
linux-bcache@...r.kernel.org, linux-raid@...r.kernel.org,
nvdimm@...ts.linux.dev, konrad.wilk@...cle.com, joe.jin@...cle.com
Subject: Re: [RFC for-6.2/block V2] block: Change the granularity of io ticks
from ms to ns
On 12/7/22 3:32 PM, Gulam Mohamed wrote:
> As per the review comment from Jens Axboe, I am re-sending this patch
> against "for-6.2/block".
>
>
> Use ktime to change the granularity of IO accounting in block layer from
> milli-seconds to nano-seconds to get the proper latency values for the
> devices whose latency is in micro-seconds. After changing the granularity
> to nano-seconds the iostat command, which was showing incorrect values for
> %util, is now showing correct values.
>
> We did not work on the patch to drop the logic for
> STAT_PRECISE_TIMESTAMPS yet. Will do it if this patch is ok.
>
> The iostat command was run after starting the fio with following command
> on an NVME disk. For the same fio command, the iostat %util was showing
> ~100% for the disks whose latencies are in the range of microseconds.
> With the kernel changes (granularity to nano-seconds), the %util was
> showing correct values. Following are the details of the test and their
> output:

My default peak testing runs at 122M IOPS. That's also the peak IOPS of
the devices combined, and with iostats disabled. If I enable iostats,
then the performance drops to 112M IOPS. It's no longer device limited;
that's a drop of about 8.2%.

Adding this patch, and with iostats enabled, performance is at 91M IOPS.
That's a ~25% drop from no iostats, and a ~19% drop from the iostats we
have now...
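
For reference, the percentages follow directly from the three throughput
figures quoted here. A quick check in Python (the constant and function
names are illustrative only and do not come from the patch or the kernel):

```python
# Throughput figures quoted in the mail, in millions of IOPS.
NO_IOSTATS = 122       # iostats disabled (device-limited peak)
IOSTATS_CURRENT = 112  # iostats enabled, current accounting
IOSTATS_PATCHED = 91   # iostats enabled, with the ns-granularity patch


def drop_pct(before, after):
    """Relative throughput drop from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # Cost of the current iostats relative to no iostats.
    print(f"current vs. none:    {drop_pct(NO_IOSTATS, IOSTATS_CURRENT):.2f}%")
    # Cost of the patched iostats relative to no iostats.
    print(f"patched vs. none:    {drop_pct(NO_IOSTATS, IOSTATS_PATCHED):.2f}%")
    # Additional cost of the patch on top of the current iostats.
    print(f"patched vs. current: {drop_pct(IOSTATS_CURRENT, IOSTATS_PATCHED):.2f}%")
```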

Here's what I'd like to see changed:

- Split the patch up. First change all the types from unsigned long to
  u64, that can be done while retaining jiffies.
- Add an iostats == 2 setting, which enables this higher resolution
  mode. We'd still default to 1, lower granularity iostats enabled.

I think that's cleaner than one big patch, and means that patch 1 should
not really have any noticeable changes. That's generally how I like to
get things split. With that, then I think there could be a way to get
this included.

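
One way to picture the three-level iostats setting proposed above is the
toy userspace model below. It is only a sketch under stated assumptions,
not kernel code: the mode names, the `io_timestamp` helper, and the tick
rate `HZ = 1000` are all hypothetical and chosen for illustration.

```python
import time

# Hypothetical names for the three settings discussed in the mail.
IOSTATS_OFF = 0      # no accounting at all
IOSTATS_COARSE = 1   # default: tick (jiffies-like) granularity
IOSTATS_PRECISE = 2  # opt-in: nanosecond granularity

HZ = 1000  # assumed tick rate; real kernels may be configured differently
TICK_NS = 1_000_000_000 // HZ


def io_timestamp(mode, now_ns=None):
    """Return a timestamp in ns at the granularity selected by `mode`.

    Returns None when accounting is disabled, so callers can skip the
    bookkeeping entirely.
    """
    if mode == IOSTATS_OFF:
        return None
    if now_ns is None:
        now_ns = time.monotonic_ns()
    if mode == IOSTATS_COARSE:
        # Round down to the start of the current tick.
        return now_ns - now_ns % TICK_NS
    return now_ns
```

With this model, a 1,234,567 ns reading is reported as 1,000,000 ns in
mode 1 and unchanged in mode 2, which is the difference in resolution the
higher setting would opt into.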
--
Jens Axboe