linux-kernel - Re: [RFC] block: Change the granularity of io ticks from ms to ns

Message-ID: <b8deb6fa-8a09-c1af-278f-24e66afe367d@kernel.dk>
Date:   Wed, 7 Dec 2022 10:22:09 -0700
From:   Jens Axboe <axboe@...nel.dk>
To:     Yu Kuai <yukuai1@...weicloud.com>, Ming Lei <ming.lei@...hat.com>
Cc:     Gulam Mohamed <gulam.mohamed@...cle.com>,
        linux-block@...r.kernel.org, philipp.reisner@...bit.com,
        lars.ellenberg@...bit.com, christoph.boehmwalder@...bit.com,
        minchan@...nel.org, ngupta@...are.org, senozhatsky@...omium.org,
        colyli@...e.de, kent.overstreet@...il.com, agk@...hat.com,
        snitzer@...nel.org, dm-devel@...hat.com, song@...nel.org,
        dan.j.williams@...el.com, vishal.l.verma@...el.com,
        dave.jiang@...el.com, ira.weiny@...el.com, junxiao.bi@...cle.com,
        martin.petersen@...cle.com, kch@...dia.com,
        drbd-dev@...ts.linbit.com, linux-kernel@...r.kernel.org,
        linux-bcache@...r.kernel.org, linux-raid@...r.kernel.org,
        nvdimm@...ts.linux.dev, konrad.wilk@...cle.com,
        "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [RFC] block: Change the granularity of io ticks from ms to ns

On 12/7/22 6:09 AM, Yu Kuai wrote:
> Hi,
> 
> On 2022/12/07 11:15, Ming Lei wrote:
>> On Wed, Dec 07, 2022 at 10:19:08AM +0800, Yu Kuai wrote:
>>> Hi,
>>>
>>> On 2022/12/07 2:15, Gulam Mohamed wrote:
>>>> Use ktime to change the granularity of IO accounting in the block layer
>>>> from milliseconds to nanoseconds, to get proper latency values for
>>>> devices whose latency is in microseconds. After changing the granularity
>>>> to nanoseconds, the iostat command, which was showing incorrect values
>>>> for %util, now shows correct values.
>>>
>>> This patch doesn't correct the counting of io_ticks; it just shrinks the
>>> accounting error from jiffies (ms) to ns. The problem that util can be
>>> smaller or larger still exists.
>>
>> Agree.
>>
>>>
>>> However, I think this change makes sense, considering that the error
>>> margin is much smaller, and the performance overhead should be minimal.
>>>
>>> Hi Ming, what do you think?
>>
>> I remember that ktime_get() has non-negligible overhead. Is there any
>> test data (IOPS/CPU utilization) from running fio or t/io_uring on
>> null_blk with this patch?
> 
> Yes, testing with null_blk is necessary; we don't want any performance
> regression.

null_blk is fine as a substitute, but I'd much rather run this on my
test bench with actual IO and devices.

> BTW, I thought it was fine because it's already used for tracking IO
> latency.

Reading a nsec timestamp is a LOT more expensive than reading jiffies,
which is essentially free. If you look at the amount of work that's
gone into minimizing ktime_get() for the fast path in the IO stack,
then that's a testament to that.

So that's a very bad assumption, and definitely wrong.
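
The accounting point quoted above (finer granularity shrinks the error
margin without fixing the counting) can be illustrated with a small
simulation. This is only a sketch under stated assumptions, not kernel
code: the function name, the charging rule (one tick charged when an IO
starts on a new tick, the elapsed ticks charged on completion) and the
1 ms jiffy (HZ=1000) are a simplified model chosen for illustration.

```python
def simulate_io_ticks(tick_ns, io_starts_ns, io_duration_ns):
    """Return the busy time (in ns) accumulated by a simplified io_ticks model.

    Assumed model, for illustration only:
      - the clock is read in whole units of tick_ns;
      - on IO start, one tick is charged if the clock moved past the stamp;
      - on IO completion, the ticks elapsed since the stamp are charged.
    IOs are assumed not to overlap and to be given in time order.
    """
    stamp = 0      # last clock value at which accounting was updated
    io_ticks = 0   # accumulated busy time, in ticks
    for start in io_starts_ns:
        events = ((start, False), (start + io_duration_ns, True))
        for t, is_end in events:
            now = t // tick_ns
            if now > stamp:
                io_ticks += (now - stamp) if is_end else 1
                stamp = now
    return io_ticks * tick_ns


if __name__ == "__main__":
    window_ns = 1_000_000_000                        # 1 s observation window
    starts = [k * 1_000_000 for k in range(1000)]    # one IO every 1 ms
    duration_ns = 10_000                             # each IO takes 10 us
    true_busy = len(starts) * duration_ns
    print(f"true util:     {100 * true_busy / window_ns:.4f}%")
    for label, tick in (("1 ms ticks", 1_000_000), ("1 ns ticks", 1)):
        busy = simulate_io_ticks(tick, starts, duration_ns)
        print(f"{label}: {100 * busy / window_ns:.4f}%")
```

Under this model, a device that is busy 1% of the time is reported as
almost fully utilized with 1 ms ticks, while with 1 ns ticks the reported
value is within a fraction of a percent of the true one. The charge on IO
start is still there in both cases, only much smaller, and the sketch says
nothing about the cost of reading the finer clock.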

-- 
Jens Axboe