linux-kernel - Re: [PATCH] smp: Allow smp_call_function_single


Message-ID: <20191217153128.GB7258@xz-x1>
Date:   Tue, 17 Dec 2019 10:31:28 -0500
From:   Peter Xu <peterx@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org,
        Marcelo Tosatti <mtosatti@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Nadav Amit <namit@...are.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH] smp: Allow smp_call_function_single_async() to insert
 locked csd

On Tue, Dec 17, 2019 at 10:51:56AM +0100, Peter Zijlstra wrote:
> On Mon, Dec 16, 2019 at 03:58:33PM -0500, Peter Xu wrote:
> > On Mon, Dec 16, 2019 at 09:37:05PM +0100, Peter Zijlstra wrote:
> > > On Wed, Dec 11, 2019 at 11:29:25AM -0500, Peter Xu wrote:
> 
> > > > (3) Others:
> > > > 
> > > > *** arch/mips/kernel/process.c:
> > > > raise_backtrace[713]           smp_call_function_single_async(cpu, csd);
> > > 
> > > per-cpu csd data, seems perfectly fine usage.
> > 
> > I'm not sure whether I get the point, I just feel like it could still
> > trigger as long as we do it super fast, before IPI handled,
> > disregarding whether it's per-cpu csd or not.
> 
> No, I wasn't paying attention last night. I'm thinking this one might
> maybe be in 1). It does the state check using that bitmap.

Indeed.  Though I'm not sure this one should be changed too, because I
can't tell whether that pr_warn is really intended:

        if (cpumask_test_and_set_cpu(cpu, &backtrace_csd_busy)) {
                pr_warn("Unable to send backtrace IPI to CPU%u - perhaps it hung?\n",
                        cpu);
                continue;
        }

I mean, that depends on whether it can really hang somehow (or whether
it's the same issue as the one we're trying to fix)...  If it can't
hang, then it should be safe I think, and this pr_warn would be useless
after all.
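
To make the state check in the snippet above concrete, here is a rough
user-space sketch of the same "test and set a busy bit before sending"
idea.  It is only an analogy built on C11 atomics, not the MIPS code:
NR_CPUS, busy_mask, try_send() and clear_busy() are hypothetical names,
and in the kernel the bit would be cleared by the IPI handler on the
target CPU once it has finished.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 8	/* hypothetical, small enough for one unsigned int */

/* Bit n set => a request to "CPU" n is still in flight. */
static atomic_uint busy_mask;

/* Rough analogue of cpumask_test_and_set_cpu(): returns the old bit. */
static bool test_and_set_busy(unsigned int cpu)
{
	unsigned int bit = 1u << cpu;

	assert(cpu < NR_CPUS);
	return (atomic_fetch_or(&busy_mask, bit) & bit) != 0;
}

/* What the handler on the target CPU would do when it is done. */
static void clear_busy(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	atomic_fetch_and(&busy_mask, ~(1u << cpu));
}

/* Returns false if the previous request to this CPU is still pending. */
static bool try_send(unsigned int cpu)
{
	if (test_and_set_busy(cpu)) {
		fprintf(stderr, "CPU%u still busy, skipping\n", cpu);
		return false;
	}
	/* Real code would queue the csd and raise the IPI here. */
	return true;
}
```

With this pattern a second request to the same CPU is refused until
the first one has completed, so the same csd is never queued twice.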

> 
> > > > *** arch/x86/kernel/cpuid.c:
> > > > cpuid_read[85]                 err = smp_call_function_single_async(cpu, &csd);
> > > > *** arch/x86/lib/msr-smp.c:
> > > > rdmsr_safe_on_cpu[182]         err = smp_call_function_single_async(cpu, &csd);
> > > 
> > > These two have csd on stack and wait with a completion. seems fine.
> > 
> > Yeh this is true, then I'm confused why they don't use the sync()
> > helpers..
> 
> I suspect to be nice for virt. Both CPUID and MSR accesses can trap. but
> now I'm confused, because it is mostly WRMSR that traps.
> 
> Anyway, see the commit here: 07cde313b2d2 ("x86/msr: Allow rdmsr_safe_on_cpu() to schedule")

Yes that makes sense.  Thanks for the pointer.

However, my next question is then why a common solution wasn't
provided in the smp code instead... I feel like it could be even easier
(please see below).  I'm not very familiar with the smp code yet, but
if it works it should benefit all callers imho.

diff --git a/kernel/smp.c b/kernel/smp.c
index dd31e8228218..7a1b163d1e4b 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -307,11 +307,12 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 
        err = generic_exec_single(cpu, csd, func, info);
 
+       put_cpu();
+
+       /* If wait, csd is on stack so it's safe without get_cpu() */
        if (wait)
                csd_lock_wait(csd);
 
-       put_cpu();
-
        return err;
 }
 EXPORT_SYMBOL(smp_call_function_single);
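
The "csd is on stack" reasoning in the comment above can be sketched in
user space as follows.  This is only an analogy under stated
assumptions, not kernel code: a pthread stands in for the remote CPU,
and struct call_data, remote(), call_and_wait() and bump() are
hypothetical names.  The point it tries to show is that the caller does
not return (so its stack frame stays valid) until the remote side has
released the lock flag.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

struct call_data {
	void (*func)(void *);
	void *info;
	atomic_int locked;	/* 1 while the remote side may still use us */
};

/* Stand-in for the remote CPU: run the callback, then unlock. */
static void *remote(void *arg)
{
	struct call_data *csd = arg;

	csd->func(csd->info);
	atomic_store_explicit(&csd->locked, 0, memory_order_release);
	return NULL;
}

/* Loosely mirrors the wait case: the call data lives on our stack. */
static int call_and_wait(void (*func)(void *), void *info)
{
	struct call_data csd = { func, info, 1 };
	pthread_t t;

	if (pthread_create(&t, NULL, remote, &csd))
		return -1;

	/* Loosely like csd_lock_wait(): spin until the flag is released. */
	while (atomic_load_explicit(&csd.locked, memory_order_acquire))
		sched_yield();

	pthread_join(t, NULL);
	return 0;
}

/* Example callback: increment the int that info points to. */
static void bump(void *info)
{
	(*(int *)info)++;
}
```

For example, with int v = 41, call_and_wait(bump, &v) returns 0 and v
has been incremented by the time the call returns.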

Thanks,

-- 
Peter Xu