lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 13 Sep 2023 09:18:28 -0400
From:   "Liam R. Howlett" <Liam.Howlett@...cle.com>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        maple-tree@...ts.infradead.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org,
        Geert Uytterhoeven <geert@...ux-m68k.org>,
        Christophe Leroy <christophe.leroy@...roup.eu>,
        Andreas Schwab <schwab@...ux-m68k.org>,
        Matthew Wilcox <willy@...radead.org>,
        Peng Zhang <zhangpeng.00@...edance.com>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        "Mike Rapoport (IBM)" <rppt@...nel.org>,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH] init/main: Clear boot task idle flag

* Paul E. McKenney <paulmck@...nel.org> [230913 07:28]:
> On Wed, Sep 13, 2023 at 01:01:39PM +0200, Peter Zijlstra wrote:
> > On Tue, Sep 12, 2023 at 08:56:47PM -0400, Liam R. Howlett wrote:
> > > Initial booting is setting the task flag to idle (PF_IDLE) by the call
> > > path sched_init() -> init_idle().  Having the task idle and calling
> > > call_rcu() in kernel/rcu/tiny.c means that TIF_NEED_RESCHED will be
> > > set.  Subsequent calls to any cond_resched() will enable IRQs,
> > > potentially earlier than the IRQ setup has completed.  Recent changes
> > > have caused just this scenario and IRQs have been enabled early.
> > > 
> > > This causes a warning later in start_kernel() as interrupts are enabled
> > > before they are fully set up.
> > > 
> > > Fix this issue by clearing the PF_IDLE flag on return from sched_init()
> > > and restore the flag in rest_init().  Although the boot task was marked
> > > as idle since (at least) d80e4fda576d, I am not sure that it is wrong to
> > > do so.  The forced context-switch on idle task was introduced in the
> > > tiny_rcu update, so I'm going to claim this fixes 5f6130fa52ee.
> > > 
> > > Link: https://lore.kernel.org/linux-mm/87v8cv22jh.fsf@mail.lhotse/
> > > Link: https://lore.kernel.org/linux-mm/CAMuHMdWpvpWoDa=Ox-do92czYRvkok6_x6pYUH+ZouMcJbXy+Q@mail.gmail.com/
> > > Fixes: 5f6130fa52ee ("tiny_rcu: Directly force QS when call_rcu_[bh|sched]() on idle_task")
> > > Cc: stable@...r.kernel.org
> > > Cc: Geert Uytterhoeven <geert@...ux-m68k.org>
> > > Cc: "Paul E. McKenney" <paulmck@...nel.org>
> > > Cc: Christophe Leroy <christophe.leroy@...roup.eu>
> > > Cc: Andreas Schwab <schwab@...ux-m68k.org>
> > > Cc: Matthew Wilcox <willy@...radead.org>
> > > Cc: Peng Zhang <zhangpeng.00@...edance.com>
> > > Cc: Peter Zijlstra <peterz@...radead.org>
> > > Cc: Ingo Molnar <mingo@...hat.com>
> > > Cc: Juri Lelli <juri.lelli@...hat.com>
> > > Cc: Vincent Guittot <vincent.guittot@...aro.org>
> > > Cc: Andrew Morton <akpm@...ux-foundation.org>
> > > Cc: "Mike Rapoport (IBM)" <rppt@...nel.org>
> > > Cc: Vlastimil Babka <vbabka@...e.cz>
> > > Signed-off-by: Liam R. Howlett <Liam.Howlett@...cle.com>
> > > ---
> > >  init/main.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/init/main.c b/init/main.c
> > > index ad920fac325c..f74772acf612 100644
> > > --- a/init/main.c
> > > +++ b/init/main.c
> > > @@ -696,7 +696,7 @@ noinline void __ref __noreturn rest_init(void)
> > >  	 */
> > >  	rcu_read_lock();
> > >  	tsk = find_task_by_pid_ns(pid, &init_pid_ns);
> > > -	tsk->flags |= PF_NO_SETAFFINITY;
> > > +	tsk->flags |= PF_NO_SETAFFINITY | PF_IDLE;
> > >  	set_cpus_allowed_ptr(tsk, cpumask_of(smp_processor_id()));
> > >  	rcu_read_unlock();
> > >  
> > > @@ -938,6 +938,8 @@ void start_kernel(void)
> > >  	 * time - but meanwhile we still have a functioning scheduler.
> > >  	 */
> > >  	sched_init();
> > > +	/* Avoid early context switch, rest_init() restores PF_IDLE */
> > > +	current->flags &= ~PF_IDLE;
> > >  
> > >  	if (WARN(!irqs_disabled(),
> > >  		 "Interrupts were enabled *very* early, fixing it\n"))
> > 
> > Hurmph... so since this is about IRQs, would it not make sense to have
> > the | PF_IDLE near 'early_boot_irqs_disabled = false' ?

I was thinking that this isn't an idle thread until the end of boot, so
leave it set as not idle until the end.

> > 
> > Or, alternatively, make the tinyrcu thing check that variable?
> 
> We could do that, but do we really the decidedly non-idle early boot
> sequence designated as idle?

call_rcu() tiny is called more than this code, so unless there is
another reason we want to check the IRQ status it seemed best to change
the boot task flag.  I mean, I think the is_idle_task() check is valid
in every other context, right?

> 
> What surprises me is that this is just now showing up.  The ingredients
> for this one have been in place for almost 10 years.
> 

Am I just that lucky?

Cheers,
Liam

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ