linux-kernel - Re: [PATCH v1 5/5] PM: sleep: Spread out async kworker threads during dpm

Message-ID: <CAGETcx9d0DuxHyuvH4e3znHUdmxMCjih8NWSabjqDqJ+TXmduQ@mail.gmail.com>
Date: Fri, 15 Nov 2024 10:33:32 -0800
From: Saravana Kannan <saravanak@...gle.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Pavel Machek <pavel@....cz>, Len Brown <len.brown@...el.com>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, 
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>, Geert Uytterhoeven <geert@...ux-m68k.org>, Marek Vasut <marex@...x.de>, 
	"Bird, Tim" <Tim.Bird@...y.com>, kernel-team@...roid.com, 
	linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 5/5] PM: sleep: Spread out async kworker threads during
 dpm_resume*() phases

On Fri, Nov 15, 2024 at 8:13 AM Vincent Guittot
<vincent.guittot@...aro.org> wrote:
>
> On Thu, 14 Nov 2024 at 23:09, Saravana Kannan <saravanak@...gle.com> wrote:
> >
> > As of today, the scheduler doesn't spread out all the kworker threads
> > across all the available CPUs during suspend/resume. This causes
> > significant resume latency during the dpm_resume*() phases.
> >
> > System resume latency is a very user-visible event. Reducing the
> > latency is more important than trying to be energy aware during that
> > period.
> >
> > Since there are no userspace processes running during this time and
> > this is a very short time window, we can simply disable EAS during
> > resume so that the parallel resume of the devices is spread across all
> > the CPUs.
> >
> > On a Pixel 6, averaging over 100 suspend/resume cycles, the new logic
> > plus disabling EAS for resume yields significant improvements:
> > +---------------------------+---------------+----------------+----------+
> > | Phase                     | Old full sync | New full async | % change |
> > |                           |               | + EAS disabled |          |
> > +---------------------------+---------------+----------------+----------+
> > | Total dpm_suspend*() time |        107 ms |          62 ms |     -42% |
> > +---------------------------+---------------+----------------+----------+
> > | Total dpm_resume*() time  |         75 ms |          61 ms |     -19% |
> > +---------------------------+---------------+----------------+----------+
> > | Sum                       |        182 ms |         123 ms |     -32% |
> > +---------------------------+---------------+----------------+----------+
>
> In the cover letter you have figures for
>  - Old full sync
>  - New full async
>  - New full async + EAS disabled
>
> It would be better to compare the figures for New full async against
> New full async + EAS disabled to show the impact of disabling EAS.

I do give those numbers in the commit text of each patch making the changes.

Patch 4 commit text shows how it improves things compared to the old
full sync logic (this is the baseline): resume is 1% faster.
Patch 5 commit text shows how disabling EAS improves the numbers
compared to the baseline: resume is 19% faster.

So, yeah, all the numbers are there in one of these emails. Patch 5
(which is the only one touching EAS) is the one that has the
comparison you are asking for.
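
For reference, the "% change" column in the quoted table can be reproduced
from the raw millisecond figures. This is only a sketch of the arithmetic,
assuming the percentages were rounded to the nearest integer; the function
names are made up for illustration and are not part of the series.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Rounded percentage change from old_ms to new_ms.
 * A negative result means the new logic is faster.
 */
int percent_change(int old_ms, int new_ms)
{
	double pct;

	assert(old_ms > 0);
	pct = 100.0 * (new_ms - old_ms) / old_ms;
	/* Round half away from zero without needing libm. */
	return (int)(pct < 0 ? pct - 0.5 : pct + 0.5);
}

/* Print one row using the same column widths as the quoted table. */
void print_row(const char *phase, int old_ms, int new_ms)
{
	printf("| %-25s | %10d ms | %11d ms | %7d%% |\n",
	       phase, old_ms, new_ms, percent_change(old_ms, new_ms));
}
```

With the Pixel 6 numbers, percent_change(107, 62) is -42, percent_change(75, 61)
is -19 and percent_change(182, 123) is -32, which matches the table.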

> I would be interested to get figures about the impact of disabling it
> during the full suspend sequence, as I'm not convinced that it's worth the
> complexity, especially with a fixed OPP during suspend.

1. Device suspend actually got worse by 5ms or so. I already provided that.

2. As I said in Patch 5, suspend is more about reducing the energy
going into suspend. It's a balance of how quick you can be to how much
power you use to be quick. So, disabling EAS across all of
suspend/resume will have a huge impact on power because userspace is
still running, there are a ton of threads and userspace could get
preempted between disabling EAS and kicking off suspend. Lots of
obvious power concerns overall.

Thanks,
Saravana

>
> >
> > Signed-off-by: Saravana Kannan <saravanak@...gle.com>
> > ---
> >  kernel/power/suspend.c  | 16 ++++++++++++++++
> >  kernel/sched/topology.c | 13 +++++++++++++
> >  2 files changed, 29 insertions(+)
> >
> > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > index 09f8397bae15..7304dc39958f 100644
> > --- a/kernel/power/suspend.c
> > +++ b/kernel/power/suspend.c
> > @@ -393,6 +393,12 @@ void __weak arch_suspend_enable_irqs(void)
> >         local_irq_enable();
> >  }
> >
> > +/*
> > + * Intentionally not part of a header file to avoid risk of abuse by other
> > + * drivers.
> > + */
> > +void sched_set_energy_aware(unsigned int enable);
> > +
> >  /**
> >   * suspend_enter - Make the system enter the given sleep state.
> >   * @state: System sleep state to enter.
> > @@ -468,6 +474,15 @@ static int suspend_enter(suspend_state_t state, bool *wakeup)
> >
> >   Platform_wake:
> >         platform_resume_noirq(state);
> > +       /*
> > +        * We do this only for resume instead of suspend and resume for these
> > +        * reasons:
> > +        * - Performance is more important than power for resume.
> > +        * - Power spent entering suspend is more important for suspend. Also,
> > +        *   strangely, disabling EAS was making suspend a few milliseconds
> > +        *   slower in my testing.
> > +        */
> > +       sched_set_energy_aware(0);
> >         dpm_resume_noirq(PMSG_RESUME);
> >
> >   Platform_early_resume:
> > @@ -520,6 +535,7 @@ int suspend_devices_and_enter(suspend_state_t state)
> >   Resume_devices:
> >         suspend_test_start();
> >         dpm_resume_end(PMSG_RESUME);
> > +       sched_set_energy_aware(1);
>
> If we end up having a special scheduling mode during suspend, we
> should make the function more generic and not only EAS/smartphone
> specific
>
> Like a sched_suspend and sched_resume
>
> >         suspend_test_finish("resume devices");
> >         trace_suspend_resume(TPS("resume_console"), state, true);
> >         resume_console();
> > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> > index 9748a4c8d668..c069c0b17cbf 100644
> > --- a/kernel/sched/topology.c
> > +++ b/kernel/sched/topology.c
> > @@ -284,6 +284,19 @@ void rebuild_sched_domains_energy(void)
> >         mutex_unlock(&sched_energy_mutex);
> >  }
> >
> > +void sched_set_energy_aware(unsigned int enable)
>
> This is a copy/paste of sched_energy_aware_handler() below, we should
> have 1 helper for both
>
> > +{
> > +       int state;
> > +
> > +       if (!sched_is_eas_possible(cpu_active_mask))
> > +               return;
> > +
> > +       sysctl_sched_energy_aware = enable;
> > +       state = static_branch_unlikely(&sched_energy_present);
> > +       if (state != sysctl_sched_energy_aware)
> > +               rebuild_sched_domains_energy();
> > +}
> > +
> >  #ifdef CONFIG_PROC_SYSCTL
> >  static int sched_energy_aware_handler(const struct ctl_table *table, int write,
> >                 void *buffer, size_t *lenp, loff_t *ppos)
> > --
> > 2.47.0.338.g60cca15819-goog
> >
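
The two review comments above (one shared helper instead of a copy of
sched_energy_aware_handler(), and generic suspend/resume hooks instead of an
EAS-specific call) could be combined roughly as follows. This is a userspace
model, not kernel code: the variables and mock_rebuild_domains() only stand in
for sched_is_eas_possible(), the sched_energy_present static key,
sysctl_sched_energy_aware and rebuild_sched_domains_energy(). The names
set_energy_aware(), sched_resume_begin() and sched_resume_end() are
hypothetical. Restoring the saved value instead of forcing 1 is a choice made
in this sketch; the posted patch calls sched_set_energy_aware(1).

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel state; every name here is a mock. */
static bool eas_possible = true;             /* sched_is_eas_possible() */
static bool energy_present = true;           /* sched_energy_present key */
static unsigned int sysctl_energy_aware = 1; /* sysctl_sched_energy_aware */
static unsigned int saved_energy_aware;      /* value to restore after resume */
static bool resume_in_progress;
static int rebuild_count;                    /* how often domains were rebuilt */

/* Models rebuild_sched_domains_energy(): sync the key with the sysctl. */
static void mock_rebuild_domains(void)
{
	energy_present = sysctl_energy_aware != 0;
	rebuild_count++;
}

/* One helper that both the sysctl handler and the PM path could call. */
void set_energy_aware(unsigned int enable)
{
	if (!eas_possible)
		return;

	sysctl_energy_aware = enable;
	/* Only rebuild when the requested state differs from the current one. */
	if (energy_present != (enable != 0))
		mock_rebuild_domains();
}

/* Hypothetical generic hook called before dpm_resume_noirq(). */
void sched_resume_begin(void)
{
	saved_energy_aware = sysctl_energy_aware;
	resume_in_progress = true;
	set_energy_aware(0);
}

/* Hypothetical generic hook called after dpm_resume_end(). */
void sched_resume_end(void)
{
	assert(resume_in_progress);
	resume_in_progress = false;
	set_energy_aware(saved_energy_aware);
}
```

In this model the PM core would only know about the begin/end hooks, and
whatever the scheduler chooses to do during resume (disabling EAS or something
else) stays inside kernel/sched/.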