Message-ID: <1358761496.4994.118.camel@marge.simpson.net>
Date: Mon, 21 Jan 2013 10:44:56 +0100
From: Mike Galbraith <bitbucket@...ine.de>
To: Michael Wang <wangyun@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com,
peterz@...radead.org, mingo@...nel.org, a.p.zijlstra@...llo.nl
Subject: Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()
On Mon, 2013-01-21 at 17:22 +0800, Michael Wang wrote:
> On 01/21/2013 05:09 PM, Mike Galbraith wrote:
> > On Mon, 2013-01-21 at 15:45 +0800, Michael Wang wrote:
> >> On 01/21/2013 03:09 PM, Mike Galbraith wrote:
> >>> On Mon, 2013-01-21 at 07:42 +0100, Mike Galbraith wrote:
> >>>> On Mon, 2013-01-21 at 13:07 +0800, Michael Wang wrote:
> >>>
> >>>>> Maybe we could try changing this back to the old way later, after
> >>>>> the aim7 test on my server.
> >>>>
> >>>> Yeah, something funny is going on.
> >>>
> >>> Never entering the balance path kills the collapse.  Asking wake_affine()
> >>> about the pull as before, but allowing us to continue should no idle cpu
> >>> be found, still collapsed.  So the source of the funny behavior is indeed
> >>> in balance_path.
> >>
> >> The patch below, based on the patch set, avoids entering the balance path
> >> if an affine_sd can be found, just like the old logic.  Would you like to
> >> give it a try and see whether it fixes the collapse?
> >
> > No, it does not.
>
> Hmm... what has changed now compared to the old logic?
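
For reference, the old logic kept the affine wakeup self-contained: once an
affine_sd was found, the task was placed and we were done, idle cpu or not.
Roughly (a paraphrased sketch from memory, not the exact mainline source):

	if (affine_sd) {
		/* Maybe pull the task to the waking cpu... */
		if (cpu != prev_cpu && wake_affine(affine_sd, p, sync))
			prev_cpu = cpu;

		/* ...pick an (ideally idle) cpu near the target... */
		new_cpu = select_idle_sibling(p, prev_cpu);

		/* ...and never fall into the slow balance path. */
		goto unlock;
	}

Your patch set instead drops into balance_path whenever no idle cpu turns
up, and that is where the funny behavior lives.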
What I did earlier to confirm that the collapse originates in balance_path
is below; I just retested to be sure.
Tasks jobs/min jti jobs/min/task real cpu
1 435.34 100 435.3448 13.92 3.76 Mon Jan 21 10:24:00 2013
1 440.09 100 440.0871 13.77 3.76 Mon Jan 21 10:24:22 2013
1 440.41 100 440.4070 13.76 3.75 Mon Jan 21 10:24:45 2013
5 2467.43 99 493.4853 12.28 10.71 Mon Jan 21 10:24:59 2013
5 2445.52 99 489.1041 12.39 10.98 Mon Jan 21 10:25:14 2013
5 2475.49 99 495.0980 12.24 10.59 Mon Jan 21 10:25:27 2013
10 4963.14 99 496.3145 12.21 20.64 Mon Jan 21 10:25:41 2013
10 4959.08 99 495.9083 12.22 21.26 Mon Jan 21 10:25:54 2013
10 5415.55 99 541.5550 11.19 11.54 Mon Jan 21 10:26:06 2013
20 9934.43 96 496.7213 12.20 33.52 Mon Jan 21 10:26:18 2013
20 9950.74 98 497.5369 12.18 36.52 Mon Jan 21 10:26:31 2013
20 9893.88 96 494.6939 12.25 34.39 Mon Jan 21 10:26:43 2013
40 18937.50 98 473.4375 12.80 84.74 Mon Jan 21 10:26:56 2013
40 18996.87 98 474.9216 12.76 88.64 Mon Jan 21 10:27:09 2013
40 19146.92 98 478.6730 12.66 89.98 Mon Jan 21 10:27:22 2013
80 37610.55 98 470.1319 12.89 112.01 Mon Jan 21 10:27:35 2013
80 37321.02 98 466.5127 12.99 114.21 Mon Jan 21 10:27:48 2013
80 37610.55 98 470.1319 12.89 111.77 Mon Jan 21 10:28:01 2013
160 69109.05 98 431.9316 14.03 156.81 Mon Jan 21 10:28:15 2013
160 69505.38 98 434.4086 13.95 155.33 Mon Jan 21 10:28:29 2013
160 69207.71 98 432.5482 14.01 155.79 Mon Jan 21 10:28:43 2013
320 108033.43 98 337.6045 17.95 314.01 Mon Jan 21 10:29:01 2013
320 108577.83 98 339.3057 17.86 311.79 Mon Jan 21 10:29:19 2013
320 108395.75 98 338.7367 17.89 312.55 Mon Jan 21 10:29:37 2013
640 151440.84 98 236.6263 25.61 620.37 Mon Jan 21 10:30:03 2013
640 151440.84 97 236.6263 25.61 621.23 Mon Jan 21 10:30:29 2013
640 151145.75 98 236.1652 25.66 622.35 Mon Jan 21 10:30:55 2013
1280 190117.65 98 148.5294 40.80 1228.40 Mon Jan 21 10:31:36 2013
1280 189977.96 98 148.4203 40.83 1229.91 Mon Jan 21 10:32:17 2013
1280 189560.12 98 148.0938 40.92 1231.71 Mon Jan 21 10:32:58 2013
2560 217857.04 98 85.1004 71.21 2441.61 Mon Jan 21 10:34:09 2013
2560 217338.19 98 84.8977 71.38 2448.76 Mon Jan 21 10:35:21 2013
2560 217795.87 97 85.0765 71.23 2443.12 Mon Jan 21 10:36:32 2013
That was with your change backed out, and the quick/dirty hack below
applied.  Note that jobs/min keeps climbing all the way out to 2560 tasks,
with no sign of the collapse.
---
kernel/sched/fair.c | 27 ++++++---------------------
1 file changed, 6 insertions(+), 21 deletions(-)
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3337,6 +3337,8 @@ select_task_rq_fair(struct task_struct *
 		goto unlock;
 
 	if (sd_flag & SD_BALANCE_WAKE) {
+		new_cpu = prev_cpu;
+
 		/*
 		 * Tasks to be waked is special, memory it relied on
 		 * may has already been cached on prev_cpu, and usually
@@ -3348,33 +3350,16 @@ select_task_rq_fair(struct task_struct *
 		 * from top to bottom, which help to reduce the chance in
 		 * some cases.
 		 */
-		new_cpu = select_idle_sibling(p, prev_cpu);
+		new_cpu = select_idle_sibling(p, new_cpu);
 		if (idle_cpu(new_cpu))
 			goto unlock;
 
-		/*
-		 * No idle cpu could be found in the topology of prev_cpu,
-		 * before jump into the slow balance_path, try search again
-		 * in the topology of current cpu if it is the affine of
-		 * prev_cpu.
-		 */
-		if (!sbm->affine_map[prev_cpu] ||
-		    !cpumask_test_cpu(cpu, tsk_cpus_allowed(p)))
-			goto balance_path;
-
-		new_cpu = select_idle_sibling(p, cpu);
-		if (!idle_cpu(new_cpu))
-			goto balance_path;
+		if (wake_affine(sbm->affine_map[cpu], p, sync))
+			new_cpu = select_idle_sibling(p, cpu);
 
-		/*
-		 * Invoke wake_affine() finally since it is no doubt a
-		 * performance killer.
-		 */
-		if (wake_affine(sbm->affine_map[prev_cpu], p, sync))
-			goto unlock;
+		goto unlock;
 	}
 
-balance_path:
 	new_cpu = (sd_flag & SD_BALANCE_WAKE) ? prev_cpu : cpu;
 	sd = sbm->sd[type][sbm->top_level[type]];
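
In words: the SD_BALANCE_WAKE case becomes self-contained again.  Condensed,
it boils down to this (same logic as the hunks above, sketch only):

	if (sd_flag & SD_BALANCE_WAKE) {
		/* Prefer an idle cpu near prev_cpu, where the task's
		 * cache footprint most likely lives. */
		new_cpu = select_idle_sibling(p, prev_cpu);
		if (idle_cpu(new_cpu))
			goto unlock;

		/* No idle cpu there, so ask wake_affine() whether a
		 * pull to the waking cpu is worthwhile, and if so,
		 * search around it instead. */
		if (wake_affine(sbm->affine_map[cpu], p, sync))
			new_cpu = select_idle_sibling(p, cpu);

		/* Done either way; wakeups never enter balance_path. */
		goto unlock;
	}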
--