linux-kernel - Re: [PATCH] smp: Allow smp_call_function_single

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191211162925.GD48697@xz-x1>
Date:   Wed, 11 Dec 2019 11:29:25 -0500
From:   Peter Xu <peterx@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org,
        Marcelo Tosatti <mtosatti@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Nadav Amit <namit@...are.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: [PATCH] smp: Allow smp_call_function_single_async() to insert
 locked csd

On Wed, Dec 11, 2019 at 04:40:58PM +0100, Peter Zijlstra wrote:
> On Wed, Dec 04, 2019 at 03:48:23PM -0500, Peter Xu wrote:
> > Previously we will raise an warning if we want to insert a csd object
> > which is with the LOCK flag set, and if it happens we'll also wait for
> > the lock to be released.  However, this operation does not match
> > perfectly with how the function is named - the name with "_async"
> > suffix hints that this function should not block, while we will.
> > 
> > This patch changed this behavior by simply return -EBUSY instead of
> > waiting, at the meantime we allow this operation to happen without
> > warning the user to change this into a feature when the caller wants
> > to "insert a csd object, if it's there, just wait for that one".
> > 
> > This is pretty safe because in flush_smp_call_function_queue() for
> > async csd objects (where csd->flags&SYNC is zero) we'll first do the
> > unlock then we call the csd->func().  So if we see the csd->flags&LOCK
> > is true in smp_call_function_single_async(), then it's guaranteed that
> > csd->func() will be called after this smp_call_function_single_async()
> > returns -EBUSY.
> > 
> > Update the comment of the function too to refect this.
> > 
> > CC: Marcelo Tosatti <mtosatti@...hat.com>
> > CC: Thomas Gleixner <tglx@...utronix.de>
> > CC: Nadav Amit <namit@...are.com>
> > CC: Josh Poimboeuf <jpoimboe@...hat.com>
> > CC: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> > CC: Peter Zijlstra <peterz@...radead.org>
> > CC: linux-kernel@...r.kernel.org
> > Signed-off-by: Peter Xu <peterx@...hat.com>
> > ---
> > 
> > The story starts from a test where we've encountered the WARN_ON() on
> > a customized kernel and the csd_wait() took merely forever to
> > complete (so we've got a WARN_ON plusing a hang host).  The current
> > solution (which is downstream-only for now) is that from the caller's
> > side we use a boolean to store whether the csd is executed, we do:
> > 
> >   if (atomic_cmpxchg(&in_progress, 0, 1))
> >     smp_call_function_single_async(..);
> > 
> > While at the end of csd->func() we clear the bit.  However imho that's
> > mostly what csd->flags&LOCK is doing.  So I'm thinking maybe it would
> > worth it to raise this patch for upstream too so that it might help
> > other users of smp_call_function_single_async() when they need the
> > same semantic (and, I do think we shouldn't wait in _async()s...)
> 
> hrtick_start() seems to employ something similar.

True.  More "statistics" below.

> 
> > Signed-off-by: Peter Xu <peterx@...hat.com>
> 
> duplicate tag :-)

Oops, I'll remove that when I repost (of course, if at least any of
you would still like me to repost :).

> 
> > ---
> >  kernel/smp.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index 7dbcb402c2fc..dd31e8228218 100644
> > --- a/kernel/smp.c
> > +++ b/kernel/smp.c
> > @@ -329,6 +329,11 @@ EXPORT_SYMBOL(smp_call_function_single);
> >   * (ie: embedded in an object) and is responsible for synchronizing it
> >   * such that the IPIs performed on the @csd are strictly serialized.
> >   *
> > + * If the function is called with one csd which has not yet been
> > + * processed by previous call to smp_call_function_single_async(), the
> > + * function will return immediately with -EBUSY showing that the csd
> > + * object is still in progress.
> > + *
> >   * NOTE: Be careful, there is unfortunately no current debugging facility to
> >   * validate the correctness of this serialization.
> >   */
> > @@ -338,14 +343,17 @@ int smp_call_function_single_async(int cpu, call_single_data_t *csd)
> >  
> >  	preempt_disable();
> >  
> > -	/* We could deadlock if we have to wait here with interrupts disabled! */
> > -	if (WARN_ON_ONCE(csd->flags & CSD_FLAG_LOCK))
> > -		csd_lock_wait(csd);
> > +	if (csd->flags & CSD_FLAG_LOCK) {
> > +		err = -EBUSY;
> > +		goto out;
> > +	}
> >  
> >  	csd->flags = CSD_FLAG_LOCK;
> >  	smp_wmb();
> >  
> >  	err = generic_exec_single(cpu, csd, csd->func, csd->info);
> > +
> > +out:
> >  	preempt_enable();
> >  
> >  	return err;
> 
> Yes.. I think this will work.
> 
> I worry though; usage such as in __blk_mq_complete_request() /
> raise_blk_irq(), which explicitly clears csd.flags before calling it
> seems more dangerous than usual.
> 
> liquidio_napi_drv_callback() does that same.

This is also true.

Here's the statistics I mentioned:

=================================================

(1) Implemented the same counter mechanism on the caller's:

*** arch/mips/kernel/smp.c:
tick_broadcast[713]            smp_call_function_single_async(cpu, csd);
*** drivers/cpuidle/coupled.c:
cpuidle_coupled_poke[336]      smp_call_function_single_async(cpu, csd);
*** kernel/sched/core.c:
hrtick_start[298]              smp_call_function_single_async(cpu_of(rq), &rq->hrtick_csd);

(2) Cleared the csd flags before calls:

*** arch/s390/pci/pci_irq.c:
zpci_handle_fallback_irq[185]  smp_call_function_single_async(cpu, &cpu_data->csd);
*** block/blk-mq.c:
__blk_mq_complete_request[622] smp_call_function_single_async(ctx->cpu, &rq->csd);
*** block/blk-softirq.c:
raise_blk_irq[70]              smp_call_function_single_async(cpu, data);
*** drivers/net/ethernet/cavium/liquidio/lio_core.c:
liquidio_napi_drv_callback[735] smp_call_function_single_async(droq->cpu_id, csd);

(3) Others:

*** arch/mips/kernel/process.c:
raise_backtrace[713]           smp_call_function_single_async(cpu, csd);
*** arch/x86/kernel/cpuid.c:
cpuid_read[85]                 err = smp_call_function_single_async(cpu, &csd);
*** arch/x86/lib/msr-smp.c:
rdmsr_safe_on_cpu[182]         err = smp_call_function_single_async(cpu, &csd);
*** include/linux/smp.h:
bool[60]                       int smp_call_function_single_async(int cpu, call_single_data_t *csd);
*** kernel/debug/debug_core.c:
kgdb_roundup_cpus[272]         ret = smp_call_function_single_async(cpu, csd);
*** net/core/dev.c:
net_rps_send_ipi[5818]         smp_call_function_single_async(remsd->cpu, &remsd->csd);

=================================================

For (1): These probably justify more on that we might want a patch
         like this to avoid reimplementing it everywhere.

For (2): If I read it right, smp_call_function_single_async() is the
         only place where we take a call_single_data_t structure
         rather than the (smp_call_func_t, void *) tuple.  I could
         miss something important, but otherwise I think it would be
         good to use the tuple for smp_call_function_single_async() as
         well, then we move call_single_data_t out of global header
         but move into smp.c to avoid callers from toucing it (which
         could be error-prone).  In other words, IMHO it would be good
         to have all these callers fixed.

For (3): I didn't dig, but I think some of them (or future users)
         could still suffer from the same issue on retriggering the
         WARN_ON... 

Thoughts?

Thanks,

-- 
Peter Xu