linux-kernel - Re: sysbench throughput degradation in 4.13+

Message-ID: <20170928123758.robe5ggsjf4voj7h@hirez.programming.kicks-ass.net>
Date:   Thu, 28 Sep 2017 14:37:58 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Rik van Riel <riel@...hat.com>
Cc:     Eric Farman <farman@...ux.vnet.ibm.com>,
        ????????? <jinpuwang@...il.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Christian Borntraeger <borntraeger@...ibm.com>,
        "KVM-ML (kvm@...r.kernel.org)" <kvm@...r.kernel.org>,
        vcaputo@...garu.com, Matthew Rosato <mjrosato@...ux.vnet.ibm.com>
Subject: Re: sysbench throughput degradation in 4.13+

On Wed, Sep 27, 2017 at 01:58:20PM -0400, Rik van Riel wrote:
> @@ -5359,10 +5378,14 @@ wake_affine_llc(struct sched_domain *sd, struct task_struct *p,
>  		unsigned long current_load = task_h_load(current);
>  
>  		/* in this case load hits 0 and this LLC is considered 'idle' */
> -		if (current_load > this_stats.load)
> +		if (current_load > this_stats.max_load)
> +			return true;
> +
> +		/* allow if the CPU would go idle, regardless of LLC load */
> +		if (current_load >= target_load(this_cpu, sd->wake_idx))
>  			return true;
>  
> -		this_stats.load -= current_load;
> +		this_stats.max_load -= current_load;
>  	}
>  
>  	/*
> @@ -5375,10 +5398,6 @@ wake_affine_llc(struct sched_domain *sd, struct task_struct *p,
>  	if (prev_stats.has_capacity && prev_stats.nr_running < this_stats.nr_running+1)
>  		return false;
>  
> -	/* if this cache has capacity, come here */
> -	if (this_stats.has_capacity && this_stats.nr_running+1 < prev_stats.nr_running)
> -		return true;
> -
>  	/*
>  	 * Check to see if we can move the load without causing too much
>  	 * imbalance.
> @@ -5391,8 +5410,8 @@ wake_affine_llc(struct sched_domain *sd, struct task_struct *p,
>  	prev_eff_load = 100 + (sd->imbalance_pct - 100) / 2;
>  	prev_eff_load *= this_stats.capacity;
>  
> -	this_eff_load *= this_stats.load + task_load;
> -	prev_eff_load *= prev_stats.load - task_load;
> +	this_eff_load *= this_stats.max_load + task_load;
> +	prev_eff_load *= prev_stats.min_load - task_load;
>  
>  	return this_eff_load <= prev_eff_load;
>  }

So I would really like to see a workload that needs this LLC/NUMA stuff,
because I much prefer the simpler 'on which of these two CPUs can I run
soonest' approach.
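
For illustration only, the 'on which of these two CPUs can I run soonest'
idea could be sketched in user-space C roughly as below. This is not code
from the patch or from the kernel: struct cpu_snapshot, its fields, and
run_sooner_on_this() are hypothetical names, and the tie-breaking rules
(prefer an idle prev CPU, otherwise compare load scaled by capacity) are
assumptions made for the sketch.

```c
#include <stdbool.h>

/* Hypothetical per-CPU snapshot; NOT a kernel structure. */
struct cpu_snapshot {
	bool idle;			/* nothing is currently running here */
	unsigned long long load;	/* queued load, arbitrary units */
	unsigned long long capacity;	/* relative compute capacity, > 0 */
};

/*
 * Return true if the waking task would likely run sooner on this_cpu
 * (the waker's CPU) than on prev_cpu (where the task last ran).
 *
 * Assumed policy, for illustration only:
 *  - an idle prev CPU wins, since the task can run there at once and
 *    keeps whatever cache state it left behind;
 *  - otherwise an idle this CPU wins;
 *  - otherwise pick the CPU with the lower load/capacity ratio,
 *    compared by cross-multiplication to avoid division.
 */
bool run_sooner_on_this(const struct cpu_snapshot *this_cpu,
			const struct cpu_snapshot *prev_cpu)
{
	if (prev_cpu->idle)
		return false;
	if (this_cpu->idle)
		return true;

	/* this->load / this->capacity < prev->load / prev->capacity */
	return this_cpu->load * prev_cpu->capacity <
	       prev_cpu->load * this_cpu->capacity;
}
```

Unlike the quoted wake_affine_llc() hunk, this looks only at the two
candidate CPUs and ignores LLC-wide statistics such as max_load and
min_load, which is the trade-off being questioned above.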