linux-kernel - Re: testing io.low limit for blk-throttle

Message-Id: <6D0FF312-8A23-425F-B2D7-9F220887FB31@linaro.org>
Date:   Fri, 27 Apr 2018 07:14:16 +0200
From:   Paolo Valente <paolo.valente@...aro.org>
To:     Joseph Qi <jiangqi903@...il.com>
Cc:     linux-block <linux-block@...r.kernel.org>,
        Jens Axboe <axboe@...nel.dk>, Shaohua Li <shli@...com>,
        Mark Brown <broonie@...nel.org>,
        Linus Walleij <linus.walleij@...aro.org>,
        Ulf Hansson <ulf.hansson@...aro.org>,
        LKML <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>,
        'Paolo Valente' via bfq-iosched 
        <bfq-iosched@...glegroups.com>
Subject: Re: testing io.low limit for blk-throttle



> On 27 Apr 2018, at 05:27, Joseph Qi <jiangqi903@...il.com> wrote:
> 
> Hi Paolo,
> 
> On 18/4/27 01:27, Paolo Valente wrote:
>> 
>> 
>>> On 25 Apr 2018, at 14:13, Joseph Qi <jiangqi903@...il.com> wrote:
>>> 
>>> Hi Paolo,
>>> 
>> 
>> Hi Joseph
>> 
>>> ...
>>> Could you run blktrace as well when testing your case? There are several
>>> throtl traces that help analyze whether it is caused by frequent
>>> upgrade/downgrade.
>> 
>> Certainly.  You can find a trace attached.  Unfortunately, I'm not
>> familiar with the internals of blk-throttle and low limit, so, if you
>> want me to analyze the trace, give me some hints on what I have to
>> look for.  Otherwise, I'll be happy to learn from your analysis.
>> 
> 
> I've taken a glance at the blktrace you attached. At first it only upgrades, and
> then it downgrades frequently (just adjusting the limit, not going to LIMIT_LOW).
> But I don't know why it always thinks the throttle group is not idle.
> 
> For example:
> fio-2336  [004] d...   428.458249:   8,16   m   N throtl avg_idle=90, idle_threshold=1000, bad_bio=10, total_bio=84, is_idle=0, scale=9
> fio-2336  [004] d...   428.458251:   8,16   m   N throtl downgrade, scale 4
> 
> In throtl_tg_is_idle():
> is_idle = ... ||
> 	(tg->latency_target && tg->bio_cnt &&
> 	 tg->bad_bio_cnt * 5 < tg->bio_cnt);
> 
> It should be idle and be allowed to use more bandwidth. But here the result shows
> not idle (is_idle=0). I have to investigate further to figure out why.
> 
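As a quick check of the arithmetic in the clause quoted above, here is a
minimal Python sketch that evaluates only the visible part of the
condition on the counters from the sample trace line.  It rests on
assumptions: that bad_bio and total_bio in the trace correspond to
tg->bad_bio_cnt and tg->bio_cnt, and that latency_target is the 100usec
target (the trace line does not print it).  The elided part of the
condition ("...") is not modeled, and the function name is made up for
illustration.

```python
def latency_clause_is_idle(latency_target, bio_cnt, bad_bio_cnt):
    """Evaluate only the visible clause of the quoted condition:

        tg->latency_target && tg->bio_cnt &&
        tg->bad_bio_cnt * 5 < tg->bio_cnt

    The elided leading part of the kernel expression is NOT modeled here.
    """
    return bool(latency_target and bio_cnt and bad_bio_cnt * 5 < bio_cnt)


if __name__ == "__main__":
    # Counters from the sample trace line: bad_bio=10, total_bio=84.
    # latency_target=100 is an assumption (the 100usec target latency).
    print(latency_clause_is_idle(latency_target=100, bio_cnt=84, bad_bio_cnt=10))
```

With these inputs, 10 * 5 = 50 is less than 84, so this clause alone
evaluates to true, which matches the expectation stated in the quoted
text.  It does not explain why the trace reports is_idle=0.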

Hi Joseph,
actually this doesn't surprise me much.  For this scenario, I expected
exactly that blk-throttle would consider the random-I/O group to be,
most of the time,
1) non idle,
2) below the low limit, and
3) above the 100usec target latency.

In fact:
1) The group can evidently issue I/O at a much higher rate than the
rate at which it is served, so, immediately after its last pending I/O
has been served, the group issues new I/O; as a result, it is non idle
most of the time.
2) To try to enforce the 10MB/s limit, blk-throttle necessarily makes
the group oscillate around 10MB/s, which means that the group is
frequently below the limit (this would fail to hold only if the group
actually received much more than 10MB/s, which is not the case).
3) For each of the 4k random I/Os of the group, the time needed by the
drive to serve that I/O is already around 40-50usec.  So, since the
group is of course not constantly in service, it easily happens that,
because of throttling, the latency of most I/Os of the group goes
beyond 100usec.
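
To put rough numbers on the three points above, here is a
back-of-the-envelope sketch in Python.  It is only an estimate under
stated assumptions: 1MB is taken as 1024 * 1024 bytes, the service time
is taken as 45usec (the midpoint of the 40-50usec range), and the
function names are made up for illustration.  It says nothing about how
blk-throttle actually measures latency.

```python
# Assumptions (not measured): 1 MB = 1024 * 1024 bytes, and a 45 usec
# per-I/O service time, the midpoint of the 40-50 usec range.
IO_SIZE_BYTES = 4 * 1024
LIMIT_BYTES_PER_SEC = 10 * 1024 * 1024
TARGET_LATENCY_USEC = 100
SERVICE_TIME_USEC = 45


def ios_per_second(limit_bytes_per_sec, io_size_bytes):
    """Number of I/Os per second that fit in the bandwidth limit."""
    return limit_bytes_per_sec / io_size_bytes


def mean_spacing_usec(limit_bytes_per_sec, io_size_bytes):
    """Average time between I/Os when running exactly at the limit."""
    return 1_000_000 / ios_per_second(limit_bytes_per_sec, io_size_bytes)


def latency_headroom_usec(target_usec, service_usec):
    """Extra delay an I/O can absorb before exceeding the target."""
    return target_usec - service_usec


if __name__ == "__main__":
    print(ios_per_second(LIMIT_BYTES_PER_SEC, IO_SIZE_BYTES))
    print(mean_spacing_usec(LIMIT_BYTES_PER_SEC, IO_SIZE_BYTES))
    print(latency_headroom_usec(TARGET_LATENCY_USEC, SERVICE_TIME_USEC))
```

Under these assumptions, 10MB/s of 4k I/O is 2560 I/Os per second, and
the headroom before crossing the 100usec target is 55usec, i.e. little
more than the service time of one other request.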

But, as is often the case for me, I might have simply misunderstood
the blk-throttle parameters, and I might just be wrong here.

Thanks,
Paolo

> You can also filter these logs using:
> grep throtl trace | grep -E 'upgrade|downgrade|is_idle'
> 
> Thanks,
> Joseph
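
For anyone who prefers to post-process the trace in a script, here is a
minimal Python sketch that mirrors the grep pipeline quoted above and
additionally extracts the key=value counters from each matching line.
The function names are made up for illustration, and the line format is
assumed to be the one shown in the two sample throtl lines in this
thread; other throtl messages may look different.

```python
import re

# Same alternation as the quoted grep -E pattern.
EVENT_RE = re.compile(r"upgrade|downgrade|is_idle")
# key=value pairs with integer values, e.g. "bad_bio=10".
FIELD_RE = re.compile(r"(\w+)=(\d+)")


def filter_throtl(lines):
    """Keep lines that mention throtl and one of the three keywords."""
    return [line for line in lines if "throtl" in line and EVENT_RE.search(line)]


def parse_fields(line):
    """Return the integer key=value fields of a line (empty dict if none)."""
    return {key: int(value) for key, value in FIELD_RE.findall(line)}


# The first two lines are the samples quoted in this thread; the third is
# a placeholder standing in for any non-throtl trace line.
SAMPLE = [
    "fio-2336  [004] d...   428.458249:   8,16   m   N throtl avg_idle=90, "
    "idle_threshold=1000, bad_bio=10, total_bio=84, is_idle=0, scale=9",
    "fio-2336  [004] d...   428.458251:   8,16   m   N throtl downgrade, scale 4",
    "placeholder for an unrelated trace line",
]

if __name__ == "__main__":
    for line in filter_throtl(SAMPLE):
        print(parse_fields(line), "|", line)
```

To run it on a real trace, the SAMPLE list can be replaced by the lines
read from the trace file.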