linux-kernel - Re: [PATCHv2 2/2] block: adjust CFS request expire time

Message-ID: <CAGWkznGW4xUyhxySajAHginW9wz3GNB_iV5FUEkGD5h__YVUTw@mail.gmail.com>
Date: Tue, 20 Feb 2024 19:56:17 +0800
From: Zhaoyang Huang <huangzhaoyang@...il.com>
To: "zhaoyang.huang" <zhaoyang.huang@...soc.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, 
	Jens Axboe <axboe@...nel.dk>, linux-block@...r.kernel.org, linux-kernel@...r.kernel.org, 
	steve.kang@...soc.com
Subject: Re: [PATCHv2 2/2] block: adjust CFS request expire time

Patch v2 makes the adjustment work as a guard against CFS tasks being
over-preempted; it only takes effect for READ requests.

On Tue, Feb 20, 2024 at 7:46 PM zhaoyang.huang
<zhaoyang.huang@...soc.com> wrote:
>
> From: Zhaoyang Huang <zhaoyang.huang@...soc.com>
>
> Under the current policy, CFS tasks may suffer involuntary IO latency when
> they are preempted by RT/DL tasks or IRQs, since those are privileged in
> both the CPU scheduler and the IO scheduler. This commit introduces an
> approximate, lightweight method to reduce this effect by scaling the expire
> time by the CFS proportion of the whole cpu active time.
> The average utilization of the cpu's run queue reflects the historical
> active proportion of the different types of task, which makes it suitable
> for this goal from the following three perspectives:
>
> 1. The load (util) of every sched class is tracked and calculated in the
> same way (using a geometric series known as PELT).
> 2. The legacy policy is kept by NOT adjusting the rq's position in
> fifo_list; only expire_time is changed.
> 3. The fixed expire time (hundreds of ms) is in the same range as the cpu
> avg_load accounting series (the utilization decays to 0.5 in 32ms).
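
To make point 3 concrete, here is a minimal userspace sketch of the PELT-style
geometric decay. It assumes one decay step per ~1ms period; PELT_Y and
pelt_decay() are hypothetical names for illustration and are not part of the
patch or the kernel API.

```c
/*
 * ASSUMPTION: approximate per-period decay factor y, chosen so that
 * y^32 = 0.5, i.e. a contribution halves after 32 periods (~32ms).
 */
#define PELT_Y 0.97857206

/* Decay a utilization contribution that is `ms` periods old. */
static double pelt_decay(double util, unsigned int ms)
{
	while (ms--)
		util *= PELT_Y;
	return util;
}
```

With this factor, a contribution of 1024 is worth roughly 512 after 32ms and
roughly 256 after 64ms, so the average mostly reflects the last few dozen
milliseconds.
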
>
> TaskA
> sched in
> |
> |
> |
> submit_bio
> |
> |
> |
> fifo_time = jiffies + expire
> (insert_request)
>
> TaskB
> sched in
> |
> |
> vfs_xxx
> |
> |preempted by RT,DL,IRQ
> |\
> | This period is unfair to TaskB's IO request and should be adjusted
> |/
> |
> submit_bio
> |
> |
> |
> fifo_time = jiffies + expire * CFS_PROPORTION(rq)
> (insert_request)
>
> Signed-off-by: Zhaoyang Huang <zhaoyang.huang@...soc.com>
> ---
> changes in v2: introduce a direction check and a threshold so that the
> hack works as a guard against CFS over-preemption.
> ---
> ---
>  block/mq-deadline.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index f958e79277b8..b5aa544d69a3 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -54,6 +54,7 @@ enum dd_prio {
>
>  enum { DD_PRIO_COUNT = 3 };
>
> +#define CFS_PROP_THRESHOLD 60
>  /*
>   * I/O statistics per I/O priority. It is fine if these counters overflow.
>   * What matters is that these counters are at least as wide as
> @@ -802,6 +803,7 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>         u8 ioprio_class = IOPRIO_PRIO_CLASS(ioprio);
>         struct dd_per_prio *per_prio;
>         enum dd_prio prio;
> +       int fifo_expire;
>
>         lockdep_assert_held(&dd->lock);
>
> @@ -839,8 +841,20 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>
>                 /*
>                  * set expire time and add to fifo list
> +                * The expire time is adjusted when the current CFS task is
> +                * over-preempted by RT/DL/IRQ, as measured by the proportion
> +                * of CFS activity within the whole cpu time during the last
> +                * several dozen ms. However, this does NOT affect the rq's
> +                * position in fifo_list; it only takes effect when this rq
> +                * is checked for its expire time at the head of the list.
>                  */
> -               rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
> +               fifo_expire = dd->fifo_expire[data_dir];
> +               if (data_dir == DD_READ &&
> +                       (cfs_prop_by_util(current, 100) < CFS_PROP_THRESHOLD))
> +                       fifo_expire = cfs_prop_by_util(current, dd->fifo_expire[data_dir]);
> +
> +               rq->fifo_time = jiffies + fifo_expire;
> +
>                 insert_before = &per_prio->fifo_list[data_dir];
>  #ifdef CONFIG_BLK_DEV_ZONED
>                 /*
> --
> 2.25.1
>
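
For reference, the v2 guard can be sketched in userspace as below. This is only
an illustration: cfs_prop_by_util() comes from patch 1/2, which is not quoted
here, so cfs_prop() models it under the ASSUMPTION that it scales a value by
the CFS share of the summed CFS/RT/DL/IRQ utilization. struct cpu_util,
cfs_prop() and adjusted_expire() are hypothetical names.

```c
#include <stdbool.h>

#define CFS_PROP_THRESHOLD 60	/* percent, same value as in the patch */

/* Hypothetical stand-in for the per-cpu utilization averages. */
struct cpu_util {
	unsigned long cfs;
	unsigned long rt;
	unsigned long dl;
	unsigned long irq;
};

/*
 * ASSUMPTION: models cfs_prop_by_util(task, val) as val scaled by the
 * CFS share of total utilization. The real helper may differ.
 */
static unsigned long cfs_prop(const struct cpu_util *u, unsigned long val)
{
	unsigned long total = u->cfs + u->rt + u->dl + u->irq;

	if (total == 0)
		return val;	/* idle cpu: nothing to scale */
	return val * u->cfs / total;
}

/* Shrink the expire time only for READs and only below the threshold. */
static unsigned long adjusted_expire(const struct cpu_util *u, bool is_read,
				     unsigned long base_expire)
{
	if (is_read && cfs_prop(u, 100) < CFS_PROP_THRESHOLD)
		return cfs_prop(u, base_expire);
	return base_expire;
}
```

For example, with utilizations cfs=300, rt=500, dl=0, irq=200 the CFS share is
30%, so a READ with a base expire of 500 gets 150, while a WRITE on the same
cpu, or a READ on a cpu whose CFS share is 60% or more, keeps 500.
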