Message-ID: <CAB8ipk8KL=1dT36OhyJfzVBTkOMn8CxiwO0J_ocdHAUm2aqe4g@mail.gmail.com>
Date: Thu, 18 Apr 2024 10:50:47 +0800
From: Xuewen Yan <xuewen.yan94@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Xuewen Yan <xuewen.yan@...soc.com>, mingo@...hat.com, juri.lelli@...hat.com, 
	vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org, 
	bsegall@...gle.com, mgorman@...e.de, bristot@...hat.com, vschneid@...hat.com, 
	wuyun.abel@...edance.com, ke.wang@...soc.com, linux-kernel@...r.kernel.org, 
	Chen Yu <yu.c.chen@...el.com>
Subject: Re: [PATCH] sched/eevdf: Prevent vlag from exceeding the limit value

Hi Peter,

Sorry for the late reply.

On Mon, Apr 8, 2024 at 8:06 PM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Tue, Jan 30, 2024 at 04:06:43PM +0800, Xuewen Yan wrote:
> > There are scenarios that can cause vlag to exceed
> > eevdf's limit.
> >
> > 1. set_next_entity() stashes a copy of the deadline in
> > vlag at the point of pick. If the task then forks a child,
> > the child inherits the parent's vlag, so the child's vlag
> > ends up equal to the deadline. Later, in place_entity(),
> > this bogus vlag leads to incorrect vruntime and deadline
> > updates.
>
> But __sched_fork() clears it? Or am I missing something subtle?

Yes, you are right, I missed the __sched_fork().

>
> > 2. In reweight_entity(), vlag is recalculated; because the
> > new weight is uncertain, the rescaled vlag may exceed
> > eevdf's limit.
> >
> > To ensure vlag stays within the limit, factor the limit
> > calculation out of update_entity_lag() into a new helper,
> > limit_entity_lag(), and use it to clamp vlag whenever it
> > is updated.
> >
> > Also, check in place_entity() whether the se was just
> > forked, and compute curr's vlag to update the se's vlag.
> >
> > Signed-off-by: Xuewen Yan <xuewen.yan@...soc.com>
> > ---
> >  kernel/sched/fair.c | 28 +++++++++++++++++++++++++---
> >  1 file changed, 25 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 533547e3c90a..3fc99b4b8b41 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -696,15 +696,22 @@ u64 avg_vruntime(struct cfs_rq *cfs_rq)
> >   *
> >   * XXX could add max_slice to the augmented data to track this.
> >   */
> > +static s64 limit_entity_lag(struct sched_entity *se, s64 lag)
> > +{
> > +     s64 limit;
> > +
> > +     limit = calc_delta_fair(max_t(u64, 2*se->slice, TICK_NSEC), se);
> > +     return clamp(lag, -limit, limit);
> > +}
>
> I used to run with a trace_printk() in there to check how often this
> trips, can you readily trigger these sites, or do you have to work hard
> at it?

I can trigger this easily. I added a dump_stack() there, and it fires often:

[   11.372967] Call trace:
[   11.372971] dump_backtrace+0xec/0x138
[   11.372990] show_stack+0x18/0x24
[   11.372995] dump_stack_lvl+0x60/0x84
[   11.373002] dump_stack+0x18/0x24
[   11.373007] place_entity+0x210/0x25c
[   11.373015] enqueue_task_fair+0x1e0/0x658
[   11.373021] enqueue_task+0x94/0x188
[   11.373028] ttwu_do_activate+0xb4/0x27c
[   11.373034] try_to_wake_up+0x2fc/0x800
[   11.373040] wake_up_q+0x70/0xd8
[   11.373046] futex_wake+0x1e4/0x32c
[   11.373052] do_futex+0x144/0x27c
[   11.373056] __arm64_sys_futex_time32+0xa8/0x1b8
[   11.373061] invoke_syscall+0x58/0x114
[   11.373068] el0_svc_common+0xac/0xe0
[   11.373074] do_el0_svc_compat+0x1c/0x28
[   11.373079] el0_svc_compat+0x38/0x6c
[   11.373085] el0t_32_sync_handler+0x60/0x8c
[   11.373091] el0t_32_sync+0x1ac/0x1b0
[   11.373097] the lag=4562177811 vruntime=18058609737 limit=3071999998

[   38.778444] Call trace:
[   38.778447] dump_backtrace+0xec/0x138
[   38.778465] show_stack+0x18/0x24
[   38.778470] dump_stack_lvl+0x60/0x84
[   38.778479] dump_stack+0x18/0x24
[   38.778484] place_entity+0x210/0x25c
[   38.778492] enqueue_task_fair+0x1e0/0x658
[   38.778498] enqueue_task+0x94/0x188
[   38.778505] ttwu_do_activate+0xb4/0x27c
[   38.778512] try_to_wake_up+0x2fc/0x800
[   38.778518] wake_up_process+0x18/0x28
[   38.778523] hrtimer_wakeup+0x20/0x94
[   38.778530] __hrtimer_run_queues+0x19c/0x39c
[   38.778536] hrtimer_interrupt+0xdc/0x39c
[   38.778541] arch_timer_handler_phys+0x50/0x64
[   38.778549] handle_percpu_devid_irq+0xb4/0x258
[   38.778556] generic_handle_domain_irq+0x44/0x60
[   38.778563] gic_handle_irq+0x4c/0x114
[   38.778568] call_on_irq_stack+0x3c/0x74
[   38.778573] do_interrupt_handler+0x4c/0x84
[   38.778580] el0_interrupt+0x54/0xe4
[   38.778586] __el0_irq_handler_common+0x18/0x28
[   38.778591] el0t_64_irq_handler+0x10/0x1c
[   38.778596] el0t_64_irq+0x1a8/0x1ac
[   38.778602] the lag=-993821948 vruntime=162909998203 limit=877714284

[   41.205507] Call trace:
[   41.205508] dump_backtrace+0xec/0x138
[   41.205516] show_stack+0x18/0x24
[   41.205518] dump_stack_lvl+0x60/0x84
[   41.205523] dump_stack+0x18/0x24
[   41.205526] place_entity+0x210/0x25c
[   41.205530] enqueue_task_fair+0x1e0/0x658
[   41.205533] enqueue_task+0x94/0x188
[   41.205537] ttwu_do_activate+0xb4/0x27c
[   41.205540] try_to_wake_up+0x2fc/0x800
[   41.205543] default_wake_function+0x20/0x38
[   41.205547] autoremove_wake_function+0x18/0x7c
[   41.205551] receiver_wake_function+0x28/0x38
[   41.205555] __wake_up_common+0xe8/0x188
[   41.205558] __wake_up_sync_key+0x88/0x144
[   41.205561] sock_def_readable+0xb4/0x1ac
[   41.205565] unix_dgram_sendmsg+0x5b4/0x764
[   41.205569] sock_write_iter+0xe0/0x160
[   41.205571] do_iter_write+0x210/0x338
[   41.205575] do_writev+0xd8/0x198
[   41.205577] __arm64_sys_writev+0x20/0x30
[   41.205581] invoke_syscall+0x58/0x114
[   41.205584] el0_svc_common+0x80/0xe0
[   41.205588] do_el0_svc+0x1c/0x28
[   41.205591] el0_svc+0x3c/0x70
[   41.205594] el0t_64_sync_handler+0x68/0xbc
[   41.205597] el0t_64_sync+0x1a8/0x1ac
[   41.205599] the lag=-7374172458 vruntime=209576555829 limit=3071999998

[   41.416780] Call trace:
[   41.416781] dump_backtrace+0xec/0x138
[   41.416788] show_stack+0x18/0x24
[   41.416791] dump_stack_lvl+0x60/0x84
[   41.416797] dump_stack+0x18/0x24
[   41.416799] place_entity+0x210/0x25c
[   41.416804] enqueue_task_fair+0x1e0/0x658
[   41.416807] enqueue_task+0x94/0x188
[   41.416811] ttwu_do_activate+0xb4/0x27c
[   41.416814] try_to_wake_up+0x2fc/0x800
[   41.416818] default_wake_function+0x20/0x38
[   41.416821] autoremove_wake_function+0x18/0x7c
[   41.416825] __wake_up_common+0xe8/0x188
[   41.416828] __wake_up_sync_key+0x88/0x144
[   41.416831] __wake_up_sync+0x14/0x24
[   41.416834] binder_transaction+0x1794/0x1e98
[   41.416838] binder_ioctl_write_read+0x590/0x3950
[   41.416841] binder_ioctl+0x1ec/0xa40
[   41.416844] __arm64_sys_ioctl+0xa8/0xe4
[   41.416849] invoke_syscall+0x58/0x114
[   41.416853] el0_svc_common+0x80/0xe0
[   41.416857] do_el0_svc+0x1c/0x28
[   41.416860] el0_svc+0x3c/0x70
[   41.416863] el0t_64_sync_handler+0x68/0xbc
[   41.416866] el0t_64_sync+0x1a8/0x1ac
[   41.416868] the lag=3625583753 vruntime=254849351272 limit=3071999998

Thanks!
BR

---
xuewen

>
>
> >  static void update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >  {
> > -     s64 lag, limit;
> > +     s64 lag;
> >
> >       SCHED_WARN_ON(!se->on_rq);
> >       lag = avg_vruntime(cfs_rq) - se->vruntime;
> >
> > -     limit = calc_delta_fair(max_t(u64, 2*se->slice, TICK_NSEC), se);
> > -     se->vlag = clamp(lag, -limit, limit);
> > +     se->vlag = limit_entity_lag(se, lag);
> >  }
> >
> >  /*
> > @@ -3757,6 +3764,7 @@ static void reweight_eevdf(struct cfs_rq *cfs_rq, struct sched_entity *se,
> >       if (avruntime != se->vruntime) {
> >               vlag = (s64)(avruntime - se->vruntime);
> >               vlag = div_s64(vlag * old_weight, weight);
> > +             vlag = limit_entity_lag(se, vlag);
> >               se->vruntime = avruntime - vlag;
> >       }
> >
> > @@ -3804,6 +3812,9 @@ static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
> >
> >       update_load_set(&se->load, weight);
> >
> > +     if (!se->on_rq)
> > +             se->vlag = limit_entity_lag(se, se->vlag);
> > +
> >  #ifdef CONFIG_SMP
> >       do {
> >               u32 divider = get_pelt_divider(&se->avg);
>
> > @@ -5237,6 +5258,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> >               if (WARN_ON_ONCE(!load))
> >                       load = 1;
> >               lag = div_s64(lag, load);
> > +             lag = limit_entity_lag(se, lag);
> >       }
> >
> >       se->vruntime = vruntime - lag;
> > --
> > 2.25.1
> >
