Message-ID: <20230510150946.GO4253@hirez.programming.kicks-ass.net>
Date: Wed, 10 May 2023 17:09:46 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Tejun Heo <tj@...nel.org>
Cc: jiangshanlai@...il.com, torvalds@...ux-foundation.org,
linux-kernel@...r.kernel.org, kernel-team@...a.com,
Hillf Danton <hdanton@...a.com>
Subject: Re: [PATCH 4/6] workqueue: Automatically mark CPU-hogging work items CPU_INTENSIVE
On Tue, May 09, 2023 at 05:07:50PM -1000, Tejun Heo wrote:
> @@ -976,24 +981,54 @@ void notrace wq_worker_stopping(struct task_struct *task, bool voluntary)
>  
>  	pool = worker->pool;
>  
> -	/* Return if preempted before wq_worker_running() was reached */
> -	if (worker->sleeping)
> -		return;
> +	if (!voluntary || task_is_running(task)) {
> +		/*
> +		 * Concurrency-managed @worker is still RUNNING. See if the
> +		 * current work is hogging CPU stalling other per-cpu work
> +		 * items. If so, mark @worker CPU_INTENSIVE to exclude it from
> +		 * concurrency management. @worker->current_* are stable as they
> +		 * can only be modified by @task which is %current.
> +		 */
> +		if (!worker->current_work ||
> +		    task->se.sum_exec_runtime - worker->current_at <
> +		    wq_cpu_intensive_thresh_us * NSEC_PER_USEC)
> +			return;
> 
> @@ -2348,6 +2382,7 @@ __acquires(&pool->lock)
>  	worker->current_work = work;
>  	worker->current_func = work->func;
>  	worker->current_pwq = pwq;
> +	worker->current_at = worker->task->se.sum_exec_runtime;
That only gets updated at scheduling events; it's not a 'running' clock.
>  	work_data = *work_data_bits(work);
>  	worker->current_color = get_work_color(work_data);
> 
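For reference, ->sum_exec_runtime is only folded forward by update_curr(),
which runs from scheduling events (enqueue/dequeue/pick/tick and friends);
anything sampling it in between sees the value from the last such event.
Roughly, in simplified form (details elided, not the exact upstream code):

static void update_curr(struct cfs_rq *cfs_rq)
{
	struct sched_entity *curr = cfs_rq->curr;
	u64 now = rq_clock_task(rq_of(cfs_rq));
	s64 delta_exec;

	if (unlikely(!curr))
		return;

	delta_exec = now - curr->exec_start;
	if (unlikely(delta_exec <= 0))
		return;

	curr->exec_start = now;
	/*
	 * sum_exec_runtime only advances here, i.e. at scheduling events
	 * and the tick, not continuously while the task is running.
	 */
	curr->sum_exec_runtime += delta_exec;
}
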
Anyway, it occurs to me that if all you want is to measure long-running
work items, then would it not be much easier to simply forward the tick?
Something like the below... Then this tick handler (which will have just
updated ->sum_exec_runtime, BTW) can do the 'work-be-long-running' check
from above.

Or am I missing something? Seems simpler than hijacking preempt-out.
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 944c3ae39861..3484cada9a4a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5632,6 +5632,9 @@ void scheduler_tick(void)
 
 	perf_event_task_tick();
 
+	if (curr->flags & PF_WQ_WORKER)
+		wq_worker_tick(curr);
+
 #ifdef CONFIG_SMP
 	rq->idle_balance = idle_cpu(cpu);
 	trigger_load_balance(rq);
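
FWIW, wq_worker_tick() itself isn't shown above; a rough sketch of what it
could look like, reusing the threshold check from your hunk above. It leans
on existing workqueue internals (kthread_data(), worker_set_flags(),
wake_up_worker()); wq_cpu_intensive_thresh_us and ->current_at come from
your patch. The body below is just a sketch, not a tested patch:

void wq_worker_tick(struct task_struct *task)
{
	struct worker *worker = kthread_data(task);
	struct worker_pool *pool = worker->pool;

	/* Nothing to do if no work item is currently running. */
	if (!worker->current_work)
		return;

	/*
	 * scheduler_tick() has just run update_curr(), so sum_exec_runtime
	 * is current here -- same check as in wq_worker_stopping() above.
	 */
	if (task->se.sum_exec_runtime - worker->current_at <
	    wq_cpu_intensive_thresh_us * NSEC_PER_USEC)
		return;

	/*
	 * Mark the worker CPU_INTENSIVE so it stops blocking other per-cpu
	 * work items; pool->lock is needed as in the original patch.
	 */
	raw_spin_lock(&pool->lock);
	worker_set_flags(worker, WORKER_CPU_INTENSIVE);
	wake_up_worker(pool);
	raw_spin_unlock(&pool->lock);
}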