Message-ID: <e6a29a6a-f5ec-7953-14e9-2550a549f955@huaweicloud.com>
Date: Mon, 24 Feb 2025 15:03:18 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Ming Lei <ming.lei@...hat.com>, Yu Kuai <yukuai1@...weicloud.com>
Cc: tj@...nel.org, josef@...icpanda.com, axboe@...nel.dk,
cgroups@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, yi.zhang@...wei.com, yangerkun@...wei.com,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH 2/2] blk-throttle: fix off-by-one jiffies wait_time
Hi, Ming!
On 2025/02/24 11:28, Ming Lei wrote:
> throtl_trim_slice() returns immediately if throtl_slice_used()
> is true.
>
> And throtl_slice_used() checks whether jiffies is in [start, end] via
> time_in_range(), so if `start <= jiffies <= end', it still returns false.
Yes, I misread the code, thinking that throtl_slice_used() would return
true if the slice is still in use. :(
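
For reference, the helper in block/blk-throttle.c looks roughly like this
in current kernels (the inline comment is my annotation):

    static bool throtl_slice_used(struct throtl_grp *tg, bool rw)
    {
            /*
             * time_in_range() is inclusive on both ends, so this returns
             * false (slice still in use) while
             * slice_start[rw] <= jiffies <= slice_end[rw].
             */
            if (time_in_range(jiffies, tg->slice_start[rw], tg->slice_end[rw]))
                    return false;

            return true;
    }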
>> BTW, throtl_trim_slice() looks problematic:
>>
>> - if (bytes_trim <= 0 && io_trim <= 0)
>> + if (bytes_trim <= 0 || io_trim <= 0 ||
>> + tg->bytes_disp[rw] < bytes_trim || tg->io_disp[rw] < io_trim)
>> return;
> That is exactly what my patch is doing, just taking deviation and
> timeout into account; also, the U64_MAX limit has to be excluded.
Yes, perhaps you can add some comments for the last two conditions of
your patch. I think maybe you're concerned about the case where IO is
throttled by iops, so the trim should really erase the extra bytes in
bytes_disp that do not reach bps_limit.
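
For readers following along, the relevant part of throtl_trim_slice() is
roughly the following (a simplified sketch of current mainline; details
vary by kernel version, and the last comment is my annotation):

    time_elapsed = rounddown(jiffies - tg->slice_start[rw],
                             tg->td->throtl_slice);
    if (!time_elapsed)
            return;

    bytes_trim = calculate_bytes_allowed(tg_bps_limit(tg, rw), time_elapsed) +
                 tg->carryover_bytes[rw];
    io_trim = calculate_io_allowed(tg_iops_limit(tg, rw), time_elapsed) +
              tg->carryover_ios[rw];
    if (bytes_trim <= 0 && io_trim <= 0)
            return;

    tg->carryover_bytes[rw] = 0;
    if ((long long)tg->bytes_disp[rw] >= bytes_trim)
            tg->bytes_disp[rw] -= bytes_trim;
    else
            /* bytes_disp < bytes_trim: the difference is discarded here */
            tg->bytes_disp[rw] = 0;

    tg->slice_start[rw] += time_elapsed;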
>
> If you test the patch, it works just fine. My patch controls bytes_exp
> <= 1.5 * disp, then throtl/001 can be completed in 1.5 sec, and if it
> is changed to 1.25 * disp, the test can be completed in 1.25 sec.
>
> With this fix, why do we have to play the complicated carryover
> trick?
>
I understand your code now. carryover_bytes in this case is wrong: as
long as a new slice is not started, and the trim does not erase extra
bytes by mistake, the throttle time should stay accurate over time
because bytes_disp is accurate.
And the root cause really is throtl_trim_slice().
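(Put differently: as long as no dispatched bytes are forgotten, over a
long run total_wait is roughly total_bytes * HZ / bps_limit, so the
per-bio jiffy rounding cancels out over time instead of accumulating.)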
However, from code review, I think throtl_start_new_slice() should be
handled as well, the same as throtl_trim_slice(): if the old bio is
dispatched with one extra jiffy of wait time, issuing a new bio in the
same jiffy will hit the same problem. For example: the caller does sync
IO with io size (bps_limit / (HZ / throtl_slice) + 1), and issues a new
IO when the old IO is done. Then in this case, each IO will start a new
slice and wait for throtl_slice + 1 jiffies. I believe this can be fixed
as well so that BIOs after the first one will only wait for throtl_slice
jiffies.
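
To make the arithmetic concrete with made-up numbers: say bps_limit =
1000000 B/s, HZ = 100 and throtl_slice = 10 jiffies, so one slice allows
100000 bytes. With io size 100001 bytes, each sync IO starts on a fresh
slice, so following the tg_within_bps_limit() math: bytes_allowed =
100000, extra_bytes = 1, and the wait is throtl_slice + max(1,
extra_bytes * HZ / bps_limit) = 10 + 1 = 11 jiffies, i.e. every IO runs
about 10% slower than the configured limit.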
> Thanks,
> Ming