linux-kernel - Re: [PATCH v3 10/10] sched/psi: per-cgroup PSI accounting disable/re-enable interface

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YwX2jC2UQ/zeY2E8@cmpxchg.org>
Date:   Wed, 24 Aug 2022 05:59:40 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     Chengming Zhou <zhouchengming@...edance.com>
Cc:     tj@...nel.org, mkoutny@...e.com, surenb@...gle.com,
        gregkh@...uxfoundation.org, corbet@....net, mingo@...hat.com,
        peterz@...radead.org, songmuchun@...edance.com,
        cgroups@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 10/10] sched/psi: per-cgroup PSI accounting
 disable/re-enable interface

Hi Chengming,

Thanks for incorporating all the feedback. I have a few nitpicks
below, but with those considered, please add:

Acked-by: Johannes Weiner <hannes@...xchg.org>

On Wed, Aug 24, 2022 at 04:18:29PM +0800, Chengming Zhou wrote:
> @@ -5171,12 +5220,19 @@ static struct cftype cgroup_base_files[] = {
>  	{
>  		.name = "irq.pressure",
>  		.flags = CFTYPE_PRESSURE,
> +		.file_offset = offsetof(struct cgroup, psi_files[PSI_IRQ]),
>  		.seq_show = cgroup_irq_pressure_show,
>  		.write = cgroup_irq_pressure_write,
>  		.poll = cgroup_pressure_poll,
>  		.release = cgroup_pressure_release,
>  	},
>  #endif
> +	{
> +		.name = "cgroup.pressure",
> +		.flags = CFTYPE_PRESSURE,
> +		.seq_show = cgroup_psi_show,
> +		.write = cgroup_psi_write,

To match the naming convention, these should be called
cgroup_pressure_show() and cgroup_pressure_write().

> @@ -745,6 +745,14 @@ static void psi_group_change(struct psi_group *group, int cpu,
>  		if (set & (1 << t))
>  			groupc->tasks[t]++;
>  
> +	if (!group->enabled) {
> +		if (groupc->state_mask & (1 << PSI_NONIDLE))
> +			record_times(groupc, now);

Thanks for the explanation in the other thread, it made sense. But can
you please add a comment to document it? Something like:

	/*
	 * On the first group change after disabling PSI, conclude
	 * the current state and flush its time. This is unlikely
	 * to matter to the user, but aggregation (get_recent_times)
	 * may have already incorporated the live state into times_prev;
	 * avoid a delta sample underflow when PSI is later re-enabled.
	 */

An unlikely() would also make sense on that branch.

> @@ -1081,6 +1092,40 @@ void cgroup_move_task(struct task_struct *task, struct css_set *to)
>  
>  	task_rq_unlock(rq, task, &rf);
>  }
> +
> +void psi_cgroup_enabled_sync(struct psi_group *group)
> +{
> +	int cpu;
> +
> +	/*
> +	 * After we disable psi_group->enabled, we don't actually
> +	 * stop percpu tasks accounting in each psi_group_cpu,
> +	 * instead only stop test_state() loop, record_times()
> +	 * and averaging worker, see psi_group_change() for details.
> +	 *
> +	 * When disable cgroup PSI, this function has nothing to sync
> +	 * since cgroup pressure files are hidden and percpu psi_group_cpu
> +	 * would see !psi_group->enabled and only do task accounting.
> +	 *
> +	 * When re-enable cgroup PSI, this function use psi_group_change()
> +	 * to get correct state mask from test_state() loop on tasks[],
> +	 * and restart groupc->state_start from now, use .clear = .set = 0
> +	 * here since no task status really changed.
> +	 */
> +	if (!group->enabled)
> +		return;

Thanks for adding the comment, that's helpful.

I think the function would be a tad clearer and self-documenting if
you called it psi_cgroup_restart(), and only call it on enabling.

> +	for_each_possible_cpu(cpu) {
> +		struct rq *rq = cpu_rq(cpu);
> +		struct rq_flags rf;
> +		u64 now;
> +
> +		rq_lock_irq(rq, &rf);
> +		now = cpu_clock(cpu);
> +		psi_group_change(group, cpu, 0, 0, now, true);
> +		rq_unlock_irq(rq, &rf);
> +	}
> +}
>  #endif /* CONFIG_CGROUPS */

Thanks,
Johannes