Message-ID: <f75ddcd5-f9a0-30c5-94e4-c4077e17ffb0@arm.com>
Date:   Fri, 26 Oct 2018 19:04:06 +0100
From:   Valentin Schneider <valentin.schneider@....com>
To:     Steve Sistare <steven.sistare@...cle.com>, mingo@...hat.com,
        peterz@...radead.org
Cc:     subhra.mazumdar@...cle.com, dhaval.giani@...cle.com,
        rohit.k.jain@...cle.com, daniel.m.jordan@...cle.com,
        pavel.tatashin@...rosoft.com, matt@...eblueprint.co.uk,
        umgwanakikbuti@...il.com, riel@...hat.com, jbacik@...com,
        juri.lelli@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 07/10] sched/fair: Provide can_migrate_task_llc

Hi Steve,

On 22/10/2018 15:59, Steve Sistare wrote:
> Define a simpler version of can_migrate_task called can_migrate_task_llc
> which does not require a struct lb_env argument, and judges whether a
> migration from one CPU to another within the same LLC should be allowed.
> 
> Signed-off-by: Steve Sistare <steven.sistare@...cle.com>
> ---
>  kernel/sched/fair.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4acdd8d..6548bed 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7168,6 +7168,34 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>  }
>  
>  /*
> + * Return true if task @p can migrate from @rq to @dst_rq in the same LLC.
> + * No need to test for co-locality, and no need to test task_hot(), as sharing
> + * LLC provides cache warmth at that level.

I was thinking that we could have scenarios where some rqs keep stealing
tasks from each other, so that tasks end up circulating between CPUs. That
would only happen with a handful of tasks that have a very short period,
and I'm not aware of any real workloads as hyperactive as the ones
hackbench generates where that could happen.

In short, I wonder if we should have task_hot() in there. Drawing a
parallel with load_balance(): even when load-balancing happens between
rqs of the same LLC, we still check task_hot(). Have you already experimented
with adding a task_hot() check in here?
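
To make the question concrete, here is a toy Python model (not kernel code) of a
steal gate with and without a hotness check. The core of task_hot() is the
comparison of time since the task last started running against
sysctl_sched_migration_cost; the 500000 ns default, the can_steal() helper and
its affinity/running arguments are assumptions made for illustration, and the
real task_hot() has additional early exits that are not modelled here.

```python
# Toy model only: names mirror the kernel's, but this is not kernel code.
SCHED_MIGRATION_COST_NS = 500_000  # assumed default of sysctl_sched_migration_cost


def task_hot(now_ns, exec_start_ns, migration_cost_ns=SCHED_MIGRATION_COST_NS):
    """Treat a task as cache-hot if it last started running very recently."""
    return (now_ns - exec_start_ns) < migration_cost_ns


def can_steal(cpu_allowed, task_running, now_ns, exec_start_ns, check_hot):
    """Hypothetical steal gate: affinity and running checks, optional hotness check."""
    if not cpu_allowed or task_running:
        return False
    if check_hot and task_hot(now_ns, exec_start_ns):
        return False
    return True
```

With check_hot=False, a task that started running 100 us ago is stealable; with
check_hot=True it is refused until it has been off the CPU for longer than the
migration cost, which is what would damp any ping-pong between rqs.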



I've run some iterations of hackbench (hackbench 2 process 100000) to
investigate this task bouncing, but I didn't really see any of it. That was
only on a 4+4 big.LITTLE system though; I'll try to get numbers on a system
with more CPUs.

----->8-----

activations: # of task activations (task starts running)
cpu_migrations: # of activations where cpu != prev_cpu
% stats are percentiles

- STEAL:

  | stat  | cpu_migrations | activations |
  |-------+----------------+-------------|
  | count |    2005.000000 | 2005.000000 |
  | mean  |      16.244888 |  290.608479 |
  | std   |      38.963138 |  253.003528 |
  | min   |       0.000000 |    3.000000 |
  | 50%   |       3.000000 |  239.000000 |
  | 75%   |       8.000000 |  436.000000 |
  | 90%   |      45.000000 |  626.000000 |
  | 99%   |     188.960000 | 1073.000000 |
  | max   |     369.000000 | 1417.000000 |

- NO_STEAL:

  | stat  | cpu_migrations | activations |
  |-------+----------------+-------------|
  | count |    2005.000000 | 2005.000000 |
  | mean  |      15.260848 |  297.860848 |
  | std   |      46.331890 |  253.210813 |
  | min   |       0.000000 |    3.000000 |
  | 50%   |       3.000000 |  252.000000 |
  | 75%   |       7.000000 |  444.000000 |
  | 90%   |      32.600000 |  643.600000 |
  | 99%   |     214.880000 | 1127.520000 |
  | max   |     467.000000 | 1547.000000 |

----->8-----
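
For reference, a minimal sketch of how the two per-task counters defined above
can be derived from a time-ordered list of activation events. The (pid, cpu)
event format and the count_activations() name are assumptions for illustration,
as is the choice not to count a task's first activation as a migration; this is
not the actual tooling used to produce the tables.

```python
from collections import defaultdict


def count_activations(events):
    """Return {pid: (activations, cpu_migrations)}.

    events: iterable of (pid, cpu) tuples in time order, one per activation.
    An activation counts as a migration when cpu != the task's previous cpu.
    """
    activations = defaultdict(int)
    migrations = defaultdict(int)
    prev_cpu = {}
    for pid, cpu in events:
        activations[pid] += 1
        # The first activation of a task has no prev_cpu, so it is not counted.
        if pid in prev_cpu and prev_cpu[pid] != cpu:
            migrations[pid] += 1
        prev_cpu[pid] = cpu
    return {pid: (activations[pid], migrations[pid]) for pid in activations}


# Tiny hand-made trace: task 1 bounces between CPU 0 and CPU 2, task 2 stays put.
events = [(1, 0), (1, 0), (1, 2), (2, 1), (1, 0), (2, 1)]
stats = count_activations(events)
```

The tables then summarise these per-task counts across all tasks (count, mean,
std and percentiles).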

My only other concern at the moment is that, since stealing doesn't take
load into account, we could steal a task whose migration causes a big
imbalance, which wouldn't have happened with a call to load_balance().

I don't think this can be triggered with a symmetrical workload like
hackbench, so I'll go explore something else.
