Message-Id: <20230913005647.1534747-1-Liam.Howlett@oracle.com>
Date: Tue, 12 Sep 2023 20:56:47 -0400
From: "Liam R. Howlett" <Liam.Howlett@...cle.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: maple-tree@...ts.infradead.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
stable@...r.kernel.org, Geert Uytterhoeven <geert@...ux-m68k.org>,
"Paul E. McKenney" <paulmck@...nel.org>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Andreas Schwab <schwab@...ux-m68k.org>,
Matthew Wilcox <willy@...radead.org>,
Peng Zhang <zhangpeng.00@...edance.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
"Mike Rapoport (IBM)" <rppt@...nel.org>,
Vlastimil Babka <vbabka@...e.cz>
Subject: [PATCH] init/main: Clear boot task idle flag
During early boot, the call path sched_init() -> init_idle() sets the
idle flag (PF_IDLE) on the boot task. With the task marked idle, a call
to call_rcu() in kernel/rcu/tiny.c sets TIF_NEED_RESCHED. Any subsequent
call to cond_resched() then enables IRQs, potentially before the IRQ
setup has completed. Recent changes have caused exactly this scenario,
and IRQs have been enabled early.

This causes a warning later in start_kernel() because interrupts are
enabled before they are fully set up.

Fix this issue by clearing the PF_IDLE flag on return from sched_init()
and restoring the flag in rest_init(). Although the boot task has been
marked as idle since (at least) d80e4fda576d, I am not sure that it is
wrong to do so. The forced context switch on the idle task was
introduced in the tiny_rcu update, so I'm going to claim this fixes
5f6130fa52ee.

Link: https://lore.kernel.org/linux-mm/87v8cv22jh.fsf@mail.lhotse/
Link: https://lore.kernel.org/linux-mm/CAMuHMdWpvpWoDa=Ox-do92czYRvkok6_x6pYUH+ZouMcJbXy+Q@mail.gmail.com/
Fixes: 5f6130fa52ee ("tiny_rcu: Directly force QS when call_rcu_[bh|sched]() on idle_task")
Cc: stable@...r.kernel.org
Cc: Geert Uytterhoeven <geert@...ux-m68k.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>
Cc: Andreas Schwab <schwab@...ux-m68k.org>
Cc: Matthew Wilcox <willy@...radead.org>
Cc: Peng Zhang <zhangpeng.00@...edance.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Juri Lelli <juri.lelli@...hat.com>
Cc: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: "Mike Rapoport (IBM)" <rppt@...nel.org>
Cc: Vlastimil Babka <vbabka@...e.cz>
Signed-off-by: Liam R. Howlett <Liam.Howlett@...cle.com>
---
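For readers who want to trace the sequence described in the commit
message, here is a minimal userspace sketch of it. This is not kernel
code: every model_* name, the MODEL_PF_IDLE value, and the two structs
are hypothetical stand-ins invented for illustration, and the model
assumes, as the commit message states, that a pending reschedule leaves
IRQs enabled after cond_resched().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PF_IDLE; the value is illustrative only. */
#define MODEL_PF_IDLE 0x2u

struct model_task {
	unsigned int flags;	/* models task_struct::flags */
	bool need_resched;	/* models TIF_NEED_RESCHED */
};

struct model_cpu {
	bool irqs_enabled;	/* models the local IRQ state */
};

/* Models sched_init() -> init_idle(): the boot task is marked idle. */
static void model_sched_init(struct model_task *t)
{
	t->flags |= MODEL_PF_IDLE;
}

/* Models tiny RCU's call_rcu(): an idle caller forces a reschedule. */
static void model_call_rcu(struct model_task *t)
{
	if (t->flags & MODEL_PF_IDLE)
		t->need_resched = true;
}

/* Models cond_resched(): a pending reschedule ends with IRQs enabled. */
static void model_cond_resched(struct model_task *t, struct model_cpu *c)
{
	if (t->need_resched) {
		t->need_resched = false;
		c->irqs_enabled = true;
	}
}

/* Runs the early-boot sequence; returns true if IRQs were enabled early. */
static bool model_early_boot(bool clear_idle_after_sched_init)
{
	struct model_task boot = { 0, false };
	struct model_cpu cpu = { false };

	model_sched_init(&boot);
	if (clear_idle_after_sched_init)
		boot.flags &= ~MODEL_PF_IDLE;	/* the fix in start_kernel() */

	model_call_rcu(&boot);
	model_cond_resched(&boot, &cpu);
	return cpu.irqs_enabled;
}

/* Self-check: the problem appears without the fix and not with it. */
static void model_self_check(void)
{
	assert(model_early_boot(false));
	assert(!model_early_boot(true));
}
```

Calling model_early_boot(false) follows the unpatched path and reports
that IRQs were enabled early; model_early_boot(true) applies the
equivalent of the start_kernel() hunk and reports that they were not.
The model deliberately omits rest_init(), which only restores the flag
once the early window has passed.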
init/main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/init/main.c b/init/main.c
index ad920fac325c..f74772acf612 100644
--- a/init/main.c
+++ b/init/main.c
@@ -696,7 +696,7 @@ noinline void __ref __noreturn rest_init(void)
 	 */
 	rcu_read_lock();
 	tsk = find_task_by_pid_ns(pid, &init_pid_ns);
-	tsk->flags |= PF_NO_SETAFFINITY;
+	tsk->flags |= PF_NO_SETAFFINITY | PF_IDLE;
 	set_cpus_allowed_ptr(tsk, cpumask_of(smp_processor_id()));
 	rcu_read_unlock();
 
@@ -938,6 +938,8 @@ void start_kernel(void)
 	 * time - but meanwhile we still have a functioning scheduler.
 	 */
 	sched_init();
+	/* Avoid early context switch, rest_init() restores PF_IDLE */
+	current->flags &= ~PF_IDLE;
 
 	if (WARN(!irqs_disabled(),
 		 "Interrupts were enabled *very* early, fixing it\n"))
--
2.39.2