Message-ID: <20191217153128.GB7258@xz-x1>
Date: Tue, 17 Dec 2019 10:31:28 -0500
From: Peter Xu <peterx@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org,
Marcelo Tosatti <mtosatti@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Nadav Amit <namit@...are.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH] smp: Allow smp_call_function_single_async() to insert
locked csd
On Tue, Dec 17, 2019 at 10:51:56AM +0100, Peter Zijlstra wrote:
> On Mon, Dec 16, 2019 at 03:58:33PM -0500, Peter Xu wrote:
> > On Mon, Dec 16, 2019 at 09:37:05PM +0100, Peter Zijlstra wrote:
> > > On Wed, Dec 11, 2019 at 11:29:25AM -0500, Peter Xu wrote:
>
> > > > (3) Others:
> > > >
> > > > *** arch/mips/kernel/process.c:
> > > > raise_backtrace[713] smp_call_function_single_async(cpu, csd);
> > >
> > > per-cpu csd data, seems perfectly fine usage.
> >
> > I'm not sure whether I get the point, I just feel like it could still
> > trigger as long as we do it super fast, before IPI handled,
> > disregarding whether it's per-cpu csd or not.
>
> No, I wasn't paying attention last night. I'm thinking this one might
> maybe be in 1). It does the state check using that bitmap.

Indeed. Though I'm hesitant to change this one too, since I'm not sure
whether that pr_warn is really intended:

	if (cpumask_test_and_set_cpu(cpu, &backtrace_csd_busy)) {
		pr_warn("Unable to send backtrace IPI to CPU%u - perhaps it hung?\n",
			cpu);
		continue;
	}

I mean, that depends on whether the CPU can really hang somehow (or
whether it's the same issue as the one we're trying to fix)... If it
can't hang, then it should be safe, I think, and this pr_warn would be
useless after all.

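For reference, the guard quoted above is a claim/release pattern: the
sender sets a per-CPU busy bit before queuing the csd, and the handler
clears it once the callback has run, so the same csd is never queued
twice. Below is a minimal userspace sketch of that pattern, assuming
C11 atomics in place of the kernel cpumask helpers. The names
busy_mask, try_claim_cpu() and release_cpu() are hypothetical and are
not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for backtrace_csd_busy: one bit per CPU (0..63). */
static _Atomic unsigned long long busy_mask;

/*
 * Try to claim the per-CPU slot before queuing a request.
 * Returns true if the caller now owns the slot, false if a previous
 * request is still in flight. Note the sense is inverted compared to
 * cpumask_test_and_set_cpu(), which returns the old bit value.
 */
static bool try_claim_cpu(unsigned int cpu)
{
	unsigned long long bit = 1ULL << cpu;
	unsigned long long old = atomic_fetch_or(&busy_mask, bit);

	return !(old & bit);
}

/* Called by the (simulated) handler once the callback has finished. */
static void release_cpu(unsigned int cpu)
{
	atomic_fetch_and(&busy_mask, ~(1ULL << cpu));
}
```

A second try_claim_cpu() for the same CPU fails until release_cpu()
has run, which is the case where the MIPS code prints the warning.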
>
> > > > *** arch/x86/kernel/cpuid.c:
> > > > cpuid_read[85] err = smp_call_function_single_async(cpu, &csd);
> > > > *** arch/x86/lib/msr-smp.c:
> > > > rdmsr_safe_on_cpu[182] err = smp_call_function_single_async(cpu, &csd);
> > >
> > > These two have csd on stack and wait with a completion. seems fine.
> >
> > Yeh this is true, then I'm confused why they don't use the sync()
> > helpers..
>
> I suspect to be nice for virt. Both CPUID and MSR accesses can trap. but
> now I'm confused, because it is mostly WRMSR that traps.
>
> Anyway, see the commit here: 07cde313b2d2 ("x86/msr: Allow rdmsr_safe_on_cpu() to schedule")
Yes that makes sense. Thanks for the pointer.
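
To make the "csd on stack plus completion" pattern discussed above
concrete, here is a rough userspace analogue. It is only a sketch
under stated assumptions: a pthread stands in for the remote CPU, the
completion is built from a mutex and a condition variable, and every
name in it (struct request, remote_handler(), call_remote_and_sleep())
is hypothetical. It is not the code from commit 07cde313b2d2.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace analogue of a kernel completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);	/* sleeps, no spinning */
	pthread_mutex_unlock(&c->lock);
}

/* The request lives on the caller's stack, like the csd in cpuid_read(). */
struct request {
	int input;
	int result;
	struct completion done;
};

/* Stands in for the callback that runs on the remote CPU. */
static void *remote_handler(void *arg)
{
	struct request *rq = arg;

	rq->result = rq->input * 2;
	complete(&rq->done);
	return NULL;
}

/* Submit asynchronously, then sleep until the handler signals completion. */
static int call_remote_and_sleep(int input)
{
	struct request rq = { .input = input };
	pthread_t tid;

	init_completion(&rq.done);
	if (pthread_create(&tid, NULL, remote_handler, &rq))
		return -1;
	wait_for_completion(&rq.done);
	pthread_join(tid, NULL);	/* rq must outlive the handler */
	pthread_mutex_destroy(&rq.done.lock);
	pthread_cond_destroy(&rq.done.cond);
	return rq.result;
}
```

The point of the pattern is that the waiter blocks in a sleeping
primitive rather than busy-waiting on the csd lock flag, and the
on-stack request stays valid because the caller does not return until
the handler is done with it.
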
However, my next question is then why a common solution can't be
provided in the smp code itself... I feel like it could be even easier
(please see below). I'm not very familiar with the smp code yet, but if
it works it should benefit all callers, IMHO.
diff --git a/kernel/smp.c b/kernel/smp.c
index dd31e8228218..7a1b163d1e4b 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -307,11 +307,12 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 	err = generic_exec_single(cpu, csd, func, info);
+	put_cpu();
+
+	/* If wait, csd is on stack so it's safe without get_cpu() */
 	if (wait)
 		csd_lock_wait(csd);
-	put_cpu();
-
 	return err;
 }
 EXPORT_SYMBOL(smp_call_function_single);
Thanks,
--
Peter Xu