Message-ID: <20220510184213.l3gjweeleyg7obca@treble>
Date: Tue, 10 May 2022 11:42:13 -0700
From: Josh Poimboeuf <jpoimboe@...nel.org>
To: Rik van Riel <riel@...com>
Cc: "song@...nel.org" <song@...nel.org>,
"joe.lawrence@...hat.com" <joe.lawrence@...hat.com>,
Song Liu <songliubraving@...com>,
"peterz@...radead.org" <peterz@...radead.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
"pmladek@...e.com" <pmladek@...e.com>,
"live-patching@...r.kernel.org" <live-patching@...r.kernel.org>,
Kernel Team <Kernel-team@...com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"jpoimboe@...hat.com" <jpoimboe@...hat.com>
Subject: Re: [RFC] sched,livepatch: call klp_try_switch_task in __cond_resched

On Tue, May 10, 2022 at 06:07:00PM +0000, Rik van Riel wrote:
> On Tue, 2022-05-10 at 09:52 -0700, Josh Poimboeuf wrote:
> > On Tue, May 10, 2022 at 04:07:42PM +0000, Rik van Riel wrote:
> > > >
> > > Now I wonder if we could just hook up a preempt notifier
> > > for kernel live patches. All the distro kernels already
> > > need the preempt notifier for KVM, anyway...
> > >
> >
> > I wouldn't be opposed to that, but how does it solve this problem?
> > If, as Peter said, cond_resched() can be a NOP, then preemption would
> > have to be from an interrupt, in which case frame pointers aren't
> > reliable.
> >
> The systems where we are seeing problems do not, as far
> as I know, throw softlockup errors, so the kworker
> threads that fail to transition to the new KLP version
> are sleeping and getting scheduled out at times.

Are they sleeping due to an explicit call to cond_resched()?

> A KLP transition preempt notifier would help those
> kernel threads transition to the new KLP version at
> any time they reschedule.

... unless cond_resched() is a no-op due to CONFIG_PREEMPT?

> How much it will help is hard to predict, but I should
> be able to get results from a fairly large sample size
> of systems within a few weeks :)

As Peter said, keep in mind that we will need to fix other cases beyond
Facebook, i.e., CONFIG_PREEMPT combined with non-x86 arches which don't
have ORC so they can't reliably unwind from an IRQ.

--
Josh