linux-kernel - Re: [PATCH -v3] use per cpu data for single cpu ipi calls

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20090131084426.GU30821@kernel.dk>
Date:	Sat, 31 Jan 2009 09:44:27 +0100
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Rusty Russell <rusty@...tcorp.com.au>, npiggin@...e.de,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Arjan van de Ven <arjan@...radead.org>
Subject: Re: [PATCH -v3] use per cpu data for single cpu ipi calls

On Fri, Jan 30 2009, Peter Zijlstra wrote:
> > If another CPU hasn't even received its IPI before the same CPU sends the 
> > next one, I'm not sure we _want_ to send one, in fact.
> 
> I think the intent was to re-route IO-completion interrupts to whatever
> cpu/node issued the IO with the idea that that cpu/node has the page
> hottest etc. and transferring the completion is cheaper than bouncing
> the page.

Correct

> Since that would be relaying hardware interrupts, there's nothing much
> you can do about the rate, or something, that's up to the firmware on
> $$$ scsi thing.
> 
> But Jens already said that that path was using the __ variant and
> providing its own csds, the kmalloc isn't needed there, so it might all
> be moot.

In fact the block layer already does attempt to do what Linus describes.
We queue the events for the target cpu, and then do:

        local_irq_save(flags);
        list = &__get_cpu_var(blk_cpu_done);
        list_add_tail(&rq->csd.list, list);

        if (list->next == &rq->csd.list)
                raise_softirq_irqoff(BLOCK_SOFTIRQ);

thus only triggering a new softirq interrupt, if the preceeding one
hasn't run already. So this is done for the block layer
trigger_softirq() part, but could be provided by the lower layer as well
instead.

> > But that's a secondary issue, and isn't a correctness thing, just a "do we 
> > really need three different allocations?" musing..
> 
> Nick, Jens, I was under the presumption that the kmalloc was needed for
> something other than failing to deadlock, happen to remember what?

As far as I remember, it was just the way to allocate memory for the
non-wait case. The per-cpu single csd will limit you to a single pending
entry on the cpu queue, you could have more (like the block layer will
do) and get a nice batching effect for ipi busy workloads instead of a
1:1 mapping between work and ipi's fired.

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/