Message-ID: <alpine.DEB.2.20.1611260947390.3602@nanos>
Date:   Sat, 26 Nov 2016 10:08:57 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Fenghua Yu <fenghua.yu@...el.com>
cc:     "H. Peter Anvin" <h.peter.anvin@...el.com>,
        Ingo Molnar <mingo@...e.hu>, Tony Luck <tony.luck@...el.com>,
        Ravi V Shankar <ravi.v.shankar@...el.com>,
        Sai Prakhya <sai.praneeth.prakhya@...el.com>,
        Vikas Shivappa <vikas.shivappa@...ux.intel.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        x86 <x86@...nel.org>, Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 2/2] x86/intel_rdt: Update task closid immediately on
 CPU in rmdir and unmount

On Fri, 25 Nov 2016, Fenghua Yu wrote:
> On Wed, Nov 23, 2016 at 03:23:50PM +0100, Thomas Gleixner wrote:
> > +#ifdef CONFIG_SMP
> > +			/*
> > +			 * This is safe on x86 w/o barriers as the ordering
> > +			 * of writing to task_cpu() and t->on_cpu is
> > +			 * reverse to the reading here. The detection is
> > +			 * inaccurate as tasks might move or schedule
> > +			 * before the smp function call takes place. In
> > +			 * such a case the function call is pointless, but
> > +			 * there is no other side effect.
> > +			 */
> 
> If process p1 is running on CPU1 before this point,
> 
> > +			if (mask && t->on_cpu)
> > +				cpumask_set_cpu(task_cpu(t), mask);
> 
> If, between the point where CPU1 is set in mask and the point where
> rdt_update_closid(tmpmask, NULL) is called, p1 is switched to CPU2 and
> process p2 with its own closid (e.g. 2) is switched to CPU1.
> 
> Then the closid in PQR_ASSOC is incorrectly set to 0 instead of 2 on CPU1.
> The 0 may stay in PQR_ASSOC until the next context switch, which may be a
> long time away in real-time or HPC cases.
> 
> Don't we need to handle this situation? Here the function call is not
> "pointless" but wrong, right?

No.

CPU0		CPU1			CPU2
		T1 (closid 0)		T2 (closid 2)

(t1->on_cpu)
  set(1, mask)
		preemption
		   T1 ->CPU2
		switch_to(T3)		preemption
		switch_to(idle)
					T2 -> CPU1
		switch_to(T2)		switch_to(T1)
		 intel_rdt_sched_in()	 intel_rdt_sched_in()
		  closid = T2->closid	 closid = T1->closid
		  closid = 2	  	 closid = CPU2->closid
rdt_update_closid(mask)
		
		rdt_update_cpu_closid()
		  intel_rdt_sched_in()
		   closid = T2->closid
		   closid = 2

IOW, whichever comes first, the sched_switch() or the function call, will
update the closid to the correct value.

If CPU2 was in the removed group then this looks the following way:

CPU0		CPU1			CPU2
		T1 (closid 0)		T2 (closid 2)

(t1->on_cpu)
  set(1, mask)
		preemption
		   T1 ->CPU2
		switch_to(T3)		preemption
		switch_to(idle)
					T2 -> CPU1
		switch_to(T2)		switch_to(T1)
		 intel_rdt_sched_in()	 intel_rdt_sched_in()
		   closid = T2->closid	 closid = T1->closid (0)
		  closid = 2	  	 closid = CPU2->closid
		  	 		 closid = 5
for_each_cpu(grp->mask)
	CPU2->closid = 0
		 
rdt_update_closid(mask)
		
		rdt_update_cpu_closid()	rdt_update_cpu_closid()
		  intel_rdt_sched_in()	 intel_rdt_sched_in()
		   closid = T2->closid	  closid = T1->closid (0)
		   closid = 2		  closid = CPU2->closid
		   	    		  closid = 0

But on CPU2 the function call might be pointless as well in the following
situation:

CPU0		CPU1			CPU2
		T1 (closid 0)		T2 (closid 2)

(t1->on_cpu)
  set(1, mask)
		preemption
		   T1 ->CPU2
		switch_to(T3)		preemption
		switch_to(idle)

for_each_cpu(grp->mask)
	CPU2->closid = 0
					T2 -> CPU1
		switch_to(T2)		switch_to(T1)
		 intel_rdt_sched_in()	 intel_rdt_sched_in()
		   closid = T2->closid	 closid = T1->closid (0)
		  closid = 2	  	 closid = CPU2->closid
		  	 		 closid = 0
		 
rdt_update_closid(mask)
		
		rdt_update_cpu_closid()	rdt_update_cpu_closid()
		  intel_rdt_sched_in()	 intel_rdt_sched_in()
		   closid = T2->closid	  closid = T1->closid (0)
		   closid = 2		  closid = CPU2->closid
		   	    		  closid = 0

The whole thing works by ordering:

1) Update the closid of each task in the group, and if a task is currently
   running on a CPU, mark that CPU in the mask for the function call.

2) Update the closid of each CPU in the group.

3) OR the cpumask collected from the tasks with the group's cpumask and
   invoke the function call on every CPU in the result.

If an affected task does a sched_switch after task->closid is updated and
before the function call is invoked then the function call is pointless.

If a sched switch happens on a CPU after cpu->closid is updated and before
the function call is invoked then the function call is pointless.

But there is no case where the function call can result in a wrong value.

Thanks,

	tglx
