lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aPY9STUzTnNwKZ9V@jlelli-thinkpadt14gen4.remote.csb>
Date: Mon, 20 Oct 2025 15:46:49 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Andrea Righi <arighi@...dia.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Joel Fernandes <joelagnelf@...dia.com>, Tejun Heo <tj@...nel.org>,
	David Vernet <void@...ifault.com>,
	Changwoo Min <changwoo@...lia.com>, Shuah Khan <shuah@...nel.org>,
	sched-ext@...ts.linux.dev, bpf@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 08/14] sched/deadline: Add support to remove DL server's
 bandwidth contribution

Hi!

On 17/10/25 11:25, Andrea Righi wrote:
> During switching from sched_ext to FAIR tasks and vice-versa, we need
> support for removing the bandwidth contribution of either DL server. Add
> support for the same.
> 
> Co-developed-by: Joel Fernandes <joelagnelf@...dia.com>
> Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
> Signed-off-by: Andrea Righi <arighi@...dia.com>
> ---
>  kernel/sched/deadline.c | 31 +++++++++++++++++++++++++++++++
>  kernel/sched/sched.h    |  1 +
>  2 files changed, 32 insertions(+)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 3c1fd2190949e..d585be4014427 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1684,6 +1684,12 @@ int dl_server_apply_params(struct sched_dl_entity *dl_se, u64 runtime, u64 perio
>  		dl_rq_change_utilization(rq, dl_se, new_bw);
>  	}
>  
> +	/* Clear these so that the dl_server is reinitialized */
> +	if (new_bw == 0) {
> +		dl_se->dl_defer = 0;
> +		dl_se->dl_server = 0;
> +	}
> +
>  	dl_se->dl_runtime = runtime;
>  	dl_se->dl_deadline = period;
>  	dl_se->dl_period = period;
> @@ -1697,6 +1703,31 @@ int dl_server_apply_params(struct sched_dl_entity *dl_se, u64 runtime, u64 perio
>  	return retval;
>  }
>  
> +/**
> + * dl_server_remove_params - Remove bandwidth reservation for a DL server
> + * @dl_se: The DL server entity to remove bandwidth for
> + *
> + * This function removes the bandwidth reservation for a DL server entity,
> + * cleaning up all bandwidth accounting and server state.
> + *
> + * Returns: 0 on success, negative error code on failure
> + */
> +int dl_server_remove_params(struct sched_dl_entity *dl_se)
> +{
> +	if (!dl_se->dl_server)
> +		return 0; /* Already disabled */
> +
> +	/*
> +	 * First dequeue if still queued. It should not be queued since
> +	 * we call this only after the last dl_server_stop().
> +	 */
> +	if (WARN_ON_ONCE(on_dl_rq(dl_se)))
> +		dequeue_dl_entity(dl_se, DEQUEUE_SLEEP);
> +
> +	/* Remove bandwidth reservation */
> +	return dl_server_apply_params(dl_se, 0, dl_se->dl_period, false);
> +}

I am not sure this is correct wrt inactive_task_timer and
task_non_contending. I fear that removing bw immediately might break
other deadline entities guarantees (especially if one then maliciously
add/removes dl-servers quickly). I kind of think (but again not sure,
please Peter and other keep me honest :) we should be waiting for
inactive_task_timer to fire (if stopping before 0-lag) and let it clean
things up at that point (like we do for simple tasks).

You seem to have additional fixes later on in the series that might be
caused by what I describe above.

Thinking more about this I actually wonder if we need this (well it's
coming up with later patches) mechanism for automatically removing
servers based on fair vs. scx state (full/partial). If we are going to
manage dl-servers bw explicitly and separately [1], maybe we can just
leave the burden to the user (or middleware) of doing that via the
configuration interface?

Thanks,
Juri

1 - https://lore.kernel.org/lkml/aPYDhjqe99F91FTW@jlelli-thinkpadt14gen4.remote.csb/ 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ