linux-kernel - Re: [RFC PATCH] poll(): add poll_wait_set

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1286473864.6661.16.camel@gandalf.stny.rr.com>
Date:	Thu, 07 Oct 2010 13:51:04 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...e.hu>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Christoph Hellwig <hch@....de>, Li Zefan <lizf@...fujitsu.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Johannes Berg <johannes.berg@...el.com>,
	Masami Hiramatsu <masami.hiramatsu.pt@...achi.com>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Tom Zanussi <tzanussi@...il.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andi Kleen <andi@...stfloor.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [RFC PATCH] poll(): add poll_wait_set_exclusive()

On Thu, 2010-10-07 at 13:07 -0400, Mathieu Desnoyers wrote:
> * Steven Rostedt (rostedt@...dmis.org) wrote:
> > On Wed, 2010-10-06 at 15:04 -0400, Mathieu Desnoyers wrote:
> > 
> > > For reference, here is the use-case: The user-space daemon runs typically one
> > > thread per cpu, each with a handle on many file descriptors. Each thread waits
> > > for data to be available using poll(). In order to follow the poll semantic,
> > > when data becomes available on a file descriptor, the kernel wakes up all
> > > threads at once, but in my case only one of them will successfully consume the
> > > data (all other thread's splice or read will fail with -ENODATA). With many
> > > threads, these useless wakeups add an unwanted overhead and scalability
> > > limitation.
> > 
> > Mathieu, I'm curious to why you have multiple threads reading the same
> > fd. Since the threads are per cpu, does the fd handle all CPUs?
> 
> The fd is local to a single ring buffer (which is per-cpu, transporting a group
> of events). The threads consuming the file descriptors are approximately per
> cpu, modulo cpu hotplug events, user preferences, etc. I would prefer not to
> make that a strong 1-1 mapping (with affinity and all), because a typical
> tracing scenario is that a single CPU is heavily used by the OS (thus producing
> trace data), while other CPUs are idle, available to pull the data from the
> buffers. Therefore, I strongly prefer not to affine reader threads to their
> "local" buffers in the general case. That being said, it could be kept as an
> option, since it might make sense in some other use-cases, especially with tiny
> buffers, where it makes sense to keep locality of reference in the L2 cache.

I never mention affinity. As with trace-cmd, it assigns a process per
CPU, but those processes can be on any CPU that the scheduler chooses. I
could probably do it with a single process reading all the CPU fds too.
I might add that as an option.

> 
> > Or do you have an fd per event per CPU, in which case the threads should just
> > poll off of their own fds.
> 
> I have one fd per per-cpu buffer, but there can be many per-cpu buffers, each
> transporting a group of events. Therefore, I don't want to associate one thread
> per event group, because this would be a resource waste.  Typically, only a few
> per-cpu buffers will be very active, and others will be very quiet.

Lets not talk about threads, what about fds? I'm wondering why you have
many threads on the same fd?

-- Steve



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/