Date:   Fri, 11 Nov 2022 18:38:37 +0000
From:   James Morse <james.morse@....com>
To:     Peter Newman <peternewman@...gle.com>, reinette.chatre@...el.com,
        fenghua.yu@...el.com
Cc:     bp@...en8.de, derkling@...gle.com, eranian@...gle.com,
        hpa@...or.com, jannh@...gle.com, kpsingh@...gle.com,
        linux-kernel@...r.kernel.org, mingo@...hat.com, tglx@...utronix.de,
        x86@...nel.org
Subject: Re: [PATCH v2 1/1] x86/resctrl: fix task closid/rmid update race

Hi Peter,

On 10/11/2022 13:53, Peter Newman wrote:
> When determining whether running tasks need to be interrupted due to a
> closid/rmid change, it was possible for the task in question to migrate
> or wake up concurrently without observing the updated values.
> 
> This was because stores updating the closid and rmid in the task
> structure could reorder with the loads in task_curr() and task_cpu().

Mentioning this happens in __rdtgroup_move_task() would make this easier to review.
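
For anyone else reading along: as it stands, __rdtgroup_move_task() does roughly the
following (condensed from memory, so treat it as a sketch rather than the exact code):

	WRITE_ONCE(tsk->closid, rdtgrp->closid);
	WRITE_ONCE(tsk->rmid, rdtgrp->mon.rmid);

	/*
	 * Nothing orders the stores above against the loads in task_curr()
	 * and task_cpu() below, so if tsk migrates or wakes up concurrently
	 * the IPI can target the wrong CPU (or not be sent at all), and tsk
	 * keeps running with the stale closid/rmid.
	 */
	if (IS_ENABLED(CONFIG_SMP) && task_curr(tsk))
		smp_call_function_single(task_cpu(tsk),
					 _update_task_closid_rmid, tsk, 1);
	else
		_update_task_closid_rmid(tsk);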


> Similar reordering also impacted resctrl_sched_in(), where reading the
> updated values could reorder with prior stores to t->on_cpu.

Where does resctrl_sched_in() depend on t->on_cpu?


> Instead, when moving a single task, use task_call_func() to serialize
> updates to the closid and rmid fields in the task_struct with context
> switch.
> 
> When deleting a group, just update the MSRs on all CPUs rather than
> calling task_call_func() on every task in a potentially long list while
> read-locking the tasklist_lock.

This rmdir stuff feels like something that should go in a preparatory patch with an
expanded justification (i.e. the reasoning in the comment you add below). Real-time users
may care about unconditionally IPIing all CPUs, but I suspect changing resctrl groups
while such a system is running isn't a realistic scenario.

A group of smaller patches that make independent changes is easier to review than one big
one! (especially as some of those changes are mechanical)
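
For the rmdir/teardown part, I'd expect that preparatory patch to boil down to something
like the below on the rmdir paths (only a sketch built from the existing rdtgroup.c
helpers; whether rdt_move_group_tasks() loses its mask argument or just gets passed NULL
is a detail):

	/* Give the group's tasks back to the default group ... */
	rdt_move_group_tasks(rdtgrp, &rdtgroup_default, NULL);

	/* ... then update PQR_ASSOC on all online CPUs instead of a computed mask. */
	update_closid_rmid(cpu_online_mask, NULL);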


> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index e5a48f05e787..d645f9a6c22e 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -538,12 +538,38 @@ static void _update_task_closid_rmid(void *task)
>  		resctrl_sched_in();
>  }
>  
> -static void update_task_closid_rmid(struct task_struct *t)
> +static int update_locked_task_closid_rmid(struct task_struct *t, void *arg)
>  {
> -	if (IS_ENABLED(CONFIG_SMP) && task_curr(t))
> -		smp_call_function_single(task_cpu(t), _update_task_closid_rmid, t, 1);
> -	else
> -		_update_task_closid_rmid(t);

[...]

>  static int __rdtgroup_move_task(struct task_struct *tsk,
> @@ -557,39 +583,26 @@ static int __rdtgroup_move_task(struct task_struct *tsk,

> -	update_task_closid_rmid(tsk);
> +	if (update_task_closid_rmid(tsk, rdtgrp) && IS_ENABLED(CONFIG_SMP))
> +		/*
> +		 * If the task has migrated away from the CPU indicated by
> +		 * task_cpu() below, then it has already switched in on the
> +		 * new CPU using the updated closid and rmid and the call below
> +		 * is unnecessary, but harmless.
> +		 */
> +		smp_call_function_single(task_cpu(tsk),
> +					 _update_task_closid_rmid, tsk, 1);
> +	else
> +		_update_task_closid_rmid(tsk);

I think it would result in less churn if you kept this chunk in update_task_closid_rmid().
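
i.e. something like the below (a rough sketch only; I'm assuming task_call_func() with
your update_locked_task_closid_rmid() callback is what sits behind the return value you
test in the hunk above):

static void update_task_closid_rmid(struct task_struct *t, struct rdtgroup *rdtgrp)
{
	/*
	 * Serialise the closid/rmid update with context switch. The callback
	 * reports whether @t was running at the point the fields were updated.
	 */
	if (task_call_func(t, update_locked_task_closid_rmid, rdtgrp) &&
	    IS_ENABLED(CONFIG_SMP))
		/*
		 * If @t has since migrated away from task_cpu(), it has
		 * already switched in with the new closid/rmid, so the IPI
		 * is unnecessary but harmless.
		 */
		smp_call_function_single(task_cpu(t),
					 _update_task_closid_rmid, t, 1);
	else
		_update_task_closid_rmid(t);
}

That leaves __rdtgroup_move_task() calling update_task_closid_rmid(tsk, rdtgrp), much as
it did before the patch.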


>  	return 0;
>  }
> @@ -2385,12 +2398,13 @@ static int reset_all_ctrls(struct rdt_resource *r)
>   * Move tasks from one to the other group. If @from is NULL, then all tasks
>   * in the systems are moved unconditionally (used for teardown).
>   *
> - * If @mask is not NULL the cpus on which moved tasks are running are set
> - * in that mask so the update smp function call is restricted to affected
> - * cpus.
> + * Following this operation, the caller is required to update the MSRs on all
> + * CPUs. The cost of constructing the precise mask of CPUs impacted by this
> + * operation will likely be high, during which we'll be blocking writes to the
> + * tasklist, and in non-trivial cases, the resulting mask would contain most of
> + * the CPUs anyways.

This is the argument for not building the mask. I think it would be better placed in the
commit message of a patch that removes that support. It's not really necessary for new
users to read about what the function doesn't do....


With the caveat that I don't understand memory ordering:
Reviewed-by: James Morse <james.morse@....com>


Thanks,

James
