linux-kernel - Re: [RFC PATCH] poll(): add poll_wait_set

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20101007170716.GB799@Krystal>
Date:	Thu, 7 Oct 2010 13:07:16 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Christoph Hellwig <hch@....de>, Li Zefan <lizf@...fujitsu.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Johannes Berg <johannes.berg@...el.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Tom Zanussi <tzanussi@...il.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andi Kleen <andi@...stfloor.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [RFC PATCH] poll(): add poll_wait_set_exclusive()

* Steven Rostedt (rostedt@...dmis.org) wrote:
> On Wed, 2010-10-06 at 15:04 -0400, Mathieu Desnoyers wrote:
> 
> > For reference, here is the use-case: The user-space daemon runs typically one
> > thread per cpu, each with a handle on many file descriptors. Each thread waits
> > for data to be available using poll(). In order to follow the poll semantic,
> > when data becomes available on a file descriptor, the kernel wakes up all
> > threads at once, but in my case only one of them will successfully consume the
> > data (all other thread's splice or read will fail with -ENODATA). With many
> > threads, these useless wakeups add an unwanted overhead and scalability
> > limitation.
> 
> Mathieu, I'm curious to why you have multiple threads reading the same
> fd. Since the threads are per cpu, does the fd handle all CPUs?

The fd is local to a single ring buffer (which is per-cpu, transporting a group
of events). The threads consuming the file descriptors are approximately per
cpu, modulo cpu hotplug events, user preferences, etc. I would prefer not to
make that a strong 1-1 mapping (with affinity and all), because a typical
tracing scenario is that a single CPU is heavily used by the OS (thus producing
trace data), while other CPUs are idle, available to pull the data from the
buffers. Therefore, I strongly prefer not to affine reader threads to their
"local" buffers in the general case. That being said, it could be kept as an
option, since it might make sense in some other use-cases, especially with tiny
buffers, where it makes sense to keep locality of reference in the L2 cache.

> Or do you have an fd per event per CPU, in which case the threads should just
> poll off of their own fds.

I have one fd per per-cpu buffer, but there can be many per-cpu buffers, each
transporting a group of events. Therefore, I don't want to associate one thread
per event group, because this would be a resource waste.  Typically, only a few
per-cpu buffers will be very active, and others will be very quiet.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/