linux-kernel - Re: [RFC] observe and act upon workload parallelism: PERF_TYPE_PARALLELISM (Was: [RFC][PATCH] sched_wait

Message-ID: <c76f371a0911161113v60eef516qee0a1a9cf99d2ae@mail.gmail.com>
Date:	Mon, 16 Nov 2009 20:13:20 +0100
From:	Stijn Devriendt <highguy@...il.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Mike Galbraith <efault@....de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Andrea Arcangeli <andrea@...e.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	peterz@...radead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC] observe and act upon workload parallelism: 
	PERF_TYPE_PARALLELISM (Was: [RFC][PATCH] sched_wait_block: wait for blocked 
	threads)

> It should not be limited to a single task, and it should work with
> existing syscall APIs - i.e. be fd based.
>
> Incidentally we already have a syscall and a kernel subsystem that is
> best suited to deal with such types of issues: perf events. I think we
> can create a new, special performance event type that observes
> task/workload (or CPU) parallelism:
>
>        PERF_TYPE_PARALLELISM
>
> With a 'parallelism_threshold' attribute. (which is '1' for a single
> task. See below.)

On the one hand this looks like exactly where it belongs, since you're
monitoring performance in order to keep it up. On the other hand it
makes the userspace component depend on an optional,
profiling-oriented kernel interface.

>
> And then we can use poll() in the thread manager task to observe PIDs,
> workloads or full CPUs. The poll() implementation of perf events is fast
> and scalable.
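
To make the poll()-based manager idea concrete, here is a rough sketch
of what such a loop might look like. PERF_TYPE_PARALLELISM is only a
proposal, so nothing here uses the real perf API: event_fd is assumed
to be any pollable fd that becomes readable when parallelism drops
below the threshold (a pipe can stand in for it), and the function
names and the one-byte-per-notification convention are made up for
illustration.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * ASSUMPTION: event_fd stands in for the fd a PERF_TYPE_PARALLELISM
 * event would provide. That event type is only proposed, so any
 * pollable fd (e.g. the read end of a pipe) is used here instead.
 *
 * Returns 1 if the fd became readable, 0 on timeout, -1 on error.
 */
int wait_for_low_parallelism(int event_fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = event_fd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;              /* poll() itself failed */
    if (ret == 0)
        return 0;               /* timeout: no notification arrived */
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/*
 * "Lazy" manager loop: hand out one queued work item per notification
 * and stop as soon as no notification arrives within timeout_ms.
 * Returns the number of items dispatched.
 */
int manager_loop(int event_fd, int queued_items, int timeout_ms)
{
    int dispatched = 0;

    assert(queued_items >= 0);
    while (dispatched < queued_items) {
        char token;

        if (wait_for_low_parallelism(event_fd, timeout_ms) != 1)
            break;              /* timeout or error: stay out of the way */
        /* Hypothetical convention: one byte per notification. */
        if (read(event_fd, &token, 1) != 1)
            break;
        dispatched++;           /* a real manager would wake a worker here */
    }
    return dispatched;
}
```

With a pipe as the stand-in, writing two bytes to the write end and
calling manager_loop(read_end, 3, 10) dispatches two items and then
gives up after the 10 ms timeout.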

I've had a quick peek at the perf code and how it currently hooks into
the scheduler, and at first glance it looks like two additional context
switches are required when using perf: the scheduler will first switch
to the idle thread, only to find out later that the schedule tail woke
up another process to run. My initial solution woke up the process
before making a scheduling decision. Depending on context switch times,
the original blocking operation may already have been unblocked
(especially on SMP), e.g. a user-space mutex that was held only briefly.
Feel free to correct me here, as it was merely a quick peek.

>
> This would make a very powerful task queueing framework. It basically
> allows a 'lazy' user-space scheduler, which only activates if the kernel
> scheduler has run out of work.
>
> What do you think?
>
>        Ingo

I definitely like the way this approach can also work globally.

Stijn