Message-ID: <20200820134001.GV2674@hirez.programming.kicks-ass.net>
Date: Thu, 20 Aug 2020 15:40:01 +0200
From: peterz@...radead.org
To: Christoph Hellwig <hch@....de>
Cc: mingo@...nel.org, torvalds@...ux-foundation.org,
linux-kernel@...r.kernel.org, will@...nel.org, paulmck@...nel.org,
axboe@...nel.dk, chris@...is-wilson.co.uk, davem@...emloft.net,
kuba@...nel.org, fweisbec@...il.com, oleg@...hat.com,
vincent.guittot@...aro.org
Subject: Re: [RFC][PATCH v2 08/10] smp,irq_work: Use the new irq_work API
On Thu, Aug 20, 2020 at 08:19:27AM +0200, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 12:51:10PM +0200, Peter Zijlstra wrote:
> > if (blk_mq_complete_need_ipi(rq)) {
> > - INIT_CSD(&rq->csd, __blk_mq_complete_request_remote, rq);
> > - smp_call_function_single_async(rq->mq_ctx->cpu, &rq->csd);
> > + rq->work = IRQ_WORK_INIT_HARD(__blk_mq_complete_request_remote);
> > + irq_work_queue_remote_static(rq->mq_ctx->cpu, &rq->work);
>
> So given the caller synchronization / use once semantics does it even
> make sense to split the init vs call part here? What about:
>
> irq_work_queue_remote_static(&rq->work, rq->mq_ctx->cpu,
> __blk_mq_complete_request_remote);
>
> instead? And btw, I'm not sure what the "static" stand for. Maybe
> irq_work_queue_remote_once?
The 'static' came from static storage, but I agree that the naming is
pretty random/poor.
One argument against your proposal is that it makes it even harder to
add warnings that try to catch bad/broken usage.
Also, given Linus' email I now wonder if we still want the
irq_work_remote variant at all.
So the initial use-case was something like:
struct foo {
	struct irq_work work;
	...
};

void foo_func(struct irq_work *work)
{
	struct foo *f = container_of(work, struct foo, work);
	...
}

DEFINE_PER_CPU(struct foo, foo) = IRQ_WORK_INIT_HARD(foo_func);

void raise_foo(int cpu)
{
	irq_work_queue_remote(cpu, &per_cpu_ptr(&foo, cpu)->work);
}
Where you can, with IRQs disabled, call raise_foo(cpu) for a remote CPU
and have the guarantee that foo_func() will observe whatever you did
before calling raise_foo().
Specifically, I needed this for what is now __ttwu_queue_wakelist(),
which used to rely on smp_send_reschedule() but needed to be pulled out
of the regular scheduler IPI.
While sorting through the wreckage of me getting this horribly wrong, I
noticed that generic_smp_call_function_single_interrupt() was
unconditionally loading _2_ cachelines, one for the regular CSD llist
and one for the remote irq_work llist.
I then realized we could merge those two lists, and regain the original
intent of that IPI to only touch one line.
At that point I could build the above, but then I realized that since I
already had a mixed type list, I could put the ttwu entries on it as
well, which is cheaper than doing the above.
Anyway, tl;dr, what do we actually want? Do we favour the embedded
irq_work variant over smp_call_function_single_async()?
There are a few subtle differences: where smp_call_function_single_async()
will directly call @func when @cpu == smp_processor_id(),
irq_work_queue_remote() will give you a WARN -- since irq_work to the
local CPU is defined as a self-IPI, which isn't implemented on all
architectures and is distinctly different behaviour.
That said, most (if not all) users seem to actually only care about
running things on another CPU, so that seems to not matter (much).