Date:   Wed, 11 Dec 2019 22:32:49 +0800
From:   "zhangyi (F)" <yi.zhang@...wei.com>
To:     <linux-kernel@...r.kernel.org>, <akpm@...ux-foundation.org>,
        <cl@...ux.com>, <gregkh@...uxfoundation.org>, <mgalbraith@...e.de>,
        <mhocko@...e.com>, <torvalds@...ux-foundation.org>,
        <umgwanakikbuti@...il.com>, <wagi@...om.org>
CC:     <stable-commits@...r.kernel.org>,
        "Wangkefeng (Maro)" <wangkefeng.wang@...wei.com>,
        Xie XiuQi <xiexiuqi@...wei.com>
Subject: Re: Patch "mm, vmstat: make quiet_vmstat lighter" has been added to
 the 4.4-stable tree

Hi, all

We found a performance degradation in the lmbench af_unix[1] test case after
merging this patch on my x86 qemu 4.4 machine. The result is basically
stable across test runs.

Host machine: CPU: Intel(R) Xeon(R) CPU E5-2690 v3
              CPU(s): 48
              MEM: 193047 MB

Guest machine:  CPU: QEMU Virtual CPU version 2.5+
                CPU(s): 8
                MEM: 26065 MB

  Before this patch:
  [root@...alhost ~]# lmbench-3.0-a9/bin/x86_64-linux-gnu/lat_unix -P 1
  AF_UNIX sock stream latency: 133.7073 microseconds

  After this patch:
  [root@...alhost ~]# lmbench-3.0-a9/bin/x86_64-linux-gnu/lat_unix -P 1
  AF_UNIX sock stream latency: 156.4722 microseconds

If we pin the task to a single CPU, the degradation does not appear.

  Before this patch:
  [root@...alhost ~]# lmbench-3.0-a9/bin/x86_64-linux-gnu/lat_unix -P 1
  AF_UNIX sock stream latency: 17.9296 microseconds

  After this patch:
  [root@...alhost ~]# lmbench-3.0-a9/bin/x86_64-linux-gnu/lat_unix -P 1
  AF_UNIX sock stream latency: 17.7500 microseconds
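
The mail does not say how the task was pinned. As a minimal sketch, assuming
the standard Linux affinity API was used (directly or through a wrapper such
as taskset), a process can restrict itself to one CPU as shown here. The
helper name pin_self_to_cpu is hypothetical and is not part of lmbench or
of the patch.

```c
#define _GNU_SOURCE /* for sched_setaffinity() and the CPU_* macros */
#include <sched.h>

/*
 * Hypothetical helper (not from lmbench or the patch): restrict the
 * calling thread to a single CPU. Returns 0 on success, -1 on failure.
 */
static int pin_self_to_cpu(int cpu)
{
	cpu_set_t set;

	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return -1;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	/* pid 0 means "the calling thread" */
	return sched_setaffinity(0, sizeof(set), &set);
}
```

Children created with fork() after this call inherit the affinity mask, so
a benchmark that forks a peer process would keep both ends of the socket on
the same CPU.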

We also tested it on an aarch64 hi1215 machine with 8 CPU cores.

  Before this patch:
  [root@...alhost ~]# ./lat_unix -P 1
  AF_UNIX sock stream latency: 30.7 microseconds

  After this patch:
  [root@...alhost ~]# ./lat_unix -P 1
  AF_UNIX sock stream latency: 37.5 microseconds
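
To put the figures above on a common scale, the relative change can be
computed from each before/after pair. This is only a worked calculation on
the numbers already reported, and the function name is illustrative. It
gives roughly +17% for the unpinned x86 guest, about -1% for the pinned
case, and roughly +22% on aarch64.

```c
/*
 * Relative change from `before` to `after`, in percent.
 * Positive means `after` is larger (here: higher latency).
 */
static double relative_change_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}
```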

Attached is my reproduction config for the x86 qemu guest. Any thoughts?

Thanks,
Yi.

[1] http://sourceforge.mirrorservice.org/l/lm/lmbench/development/lmbench-3.0-a9/

On 2019/7/30 2:02, gregkh@...uxfoundation.org wrote:
> 
> This is a note to let you know that I've just added the patch titled
> 
>     mm, vmstat: make quiet_vmstat lighter
> 
> to the 4.4-stable tree which can be found at:
>     http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
> 
> The filename of the patch is:
>      mm-vmstat-make-quiet_vmstat-lighter.patch
> and it can be found in the queue-4.4 subdirectory.
> 
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable@...r.kernel.org> know about it.
> 
> 
>>>From f01f17d3705bb6081c9e5728078f64067982be36 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@...e.com>
> Date: Fri, 5 Feb 2016 15:36:24 -0800
> Subject: mm, vmstat: make quiet_vmstat lighter
> 
> From: Michal Hocko <mhocko@...e.com>
> 
> commit f01f17d3705bb6081c9e5728078f64067982be36 upstream.
> 
> Mike has reported a considerable overhead of refresh_cpu_vm_stats from
> the idle entry during pipe test:
> 
>     12.89%  [kernel]       [k] refresh_cpu_vm_stats.isra.12
>      4.75%  [kernel]       [k] __schedule
>      4.70%  [kernel]       [k] mutex_unlock
>      3.14%  [kernel]       [k] __switch_to
> 
> This is caused by commit 0eb77e988032 ("vmstat: make vmstat_updater
> deferrable again and shut down on idle") which has placed quiet_vmstat
> into cpu_idle_loop.  The main reason here seems to be that the idle
> entry has to get over all zones and perform atomic operations for each
> vmstat entry even though there might be no per cpu diffs.  This is a
> pointless overhead for _each_ idle entry.
> 
> Make sure that quiet_vmstat is as light as possible.
> 
> First of all it doesn't make any sense to do any local sync if the
> current cpu is already set in oncpu_stat_off because vmstat_update puts
> itself there only if there is nothing to do.
> 
> Then we can check need_update which should be a cheap way to check for
> potential per-cpu diffs and only then do refresh_cpu_vm_stats.
> 
> The original patch also did cancel_delayed_work which we are not doing
> here.  There are two reasons for that.  Firstly cancel_delayed_work from
> idle context will blow up on RT kernels (reported by Mike):
> 
>   CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rt3 #7
>   Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
>   Call Trace:
>     dump_stack+0x49/0x67
>     ___might_sleep+0xf5/0x180
>     rt_spin_lock+0x20/0x50
>     try_to_grab_pending+0x69/0x240
>     cancel_delayed_work+0x26/0xe0
>     quiet_vmstat+0x75/0xa0
>     cpu_idle_loop+0x38/0x3e0
>     cpu_startup_entry+0x13/0x20
>     start_secondary+0x114/0x140
> 
> And secondly, even on !RT kernels it might add some non trivial overhead
> which is not necessary.  Even if the vmstat worker wakes up and preempts
> idle then it will be most likely a single shot noop because the stats
> were already synced and so it would end up on the oncpu_stat_off anyway.
> We just need to teach both vmstat_shepherd and vmstat_update to stop
> scheduling the worker if there is nothing to do.
> 
> [mgalbraith@...e.de: cancel pending work of the cpu_stat_off CPU]
> Signed-off-by: Michal Hocko <mhocko@...e.com>
> Reported-by: Mike Galbraith <umgwanakikbuti@...il.com>
> Acked-by: Christoph Lameter <cl@...ux.com>
> Signed-off-by: Mike Galbraith <mgalbraith@...e.de>
> Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
> Signed-off-by: Daniel Wagner <wagi@...om.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> 
> ---
>  mm/vmstat.c |   68 ++++++++++++++++++++++++++++++++++++++++--------------------
>  1 file changed, 46 insertions(+), 22 deletions(-)
> 
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1395,10 +1395,15 @@ static void vmstat_update(struct work_st
>  		 * Counters were updated so we expect more updates
>  		 * to occur in the future. Keep on running the
>  		 * update worker thread.
> +		 * If we were marked on cpu_stat_off clear the flag
> +		 * so that vmstat_shepherd doesn't schedule us again.
>  		 */
> -		queue_delayed_work_on(smp_processor_id(), vmstat_wq,
> -			this_cpu_ptr(&vmstat_work),
> -			round_jiffies_relative(sysctl_stat_interval));
> +		if (!cpumask_test_and_clear_cpu(smp_processor_id(),
> +						cpu_stat_off)) {
> +			queue_delayed_work_on(smp_processor_id(), vmstat_wq,
> +				this_cpu_ptr(&vmstat_work),
> +				round_jiffies_relative(sysctl_stat_interval));
> +		}
>  	} else {
>  		/*
>  		 * We did not update any counters so the app may be in
> @@ -1416,18 +1421,6 @@ static void vmstat_update(struct work_st
>   * until the diffs stay at zero. The function is used by NOHZ and can only be
>   * invoked when tick processing is not active.
>   */
> -void quiet_vmstat(void)
> -{
> -	if (system_state != SYSTEM_RUNNING)
> -		return;
> -
> -	do {
> -		if (!cpumask_test_and_set_cpu(smp_processor_id(), cpu_stat_off))
> -			cancel_delayed_work(this_cpu_ptr(&vmstat_work));
> -
> -	} while (refresh_cpu_vm_stats(false));
> -}
> -
>  /*
>   * Check if the diffs for a certain cpu indicate that
>   * an update is needed.
> @@ -1451,6 +1444,30 @@ static bool need_update(int cpu)
>  	return false;
>  }
>  
> +void quiet_vmstat(void)
> +{
> +	if (system_state != SYSTEM_RUNNING)
> +		return;
> +
> +	/*
> +	 * If we are already in hands of the shepherd then there
> +	 * is nothing for us to do here.
> +	 */
> +	if (cpumask_test_and_set_cpu(smp_processor_id(), cpu_stat_off))
> +		return;
> +
> +	if (!need_update(smp_processor_id()))
> +		return;
> +
> +	/*
> +	 * Just refresh counters and do not care about the pending delayed
> +	 * vmstat_update. It doesn't fire that often to matter and canceling
> +	 * it would be too expensive from this path.
> +	 * vmstat_shepherd will take care about that for us.
> +	 */
> +	refresh_cpu_vm_stats(false);
> +}
> +
>  
>  /*
>   * Shepherd worker thread that checks the
> @@ -1468,18 +1485,25 @@ static void vmstat_shepherd(struct work_
>  
>  	get_online_cpus();
>  	/* Check processors whose vmstat worker threads have been disabled */
> -	for_each_cpu(cpu, cpu_stat_off)
> -		if (need_update(cpu) &&
> -			cpumask_test_and_clear_cpu(cpu, cpu_stat_off))
> -
> -			queue_delayed_work_on(cpu, vmstat_wq,
> -				&per_cpu(vmstat_work, cpu), 0);
> +	for_each_cpu(cpu, cpu_stat_off) {
> +		struct delayed_work *dw = &per_cpu(vmstat_work, cpu);
>  
> +		if (need_update(cpu)) {
> +			if (cpumask_test_and_clear_cpu(cpu, cpu_stat_off))
> +				queue_delayed_work_on(cpu, vmstat_wq, dw, 0);
> +		} else {
> +			/*
> +			 * Cancel the work if quiet_vmstat has put this
> +			 * cpu on cpu_stat_off because the work item might
> +			 * be still scheduled
> +			 */
> +			cancel_delayed_work(dw);
> +		}
> +	}
>  	put_online_cpus();
>  
>  	schedule_delayed_work(&shepherd,
>  		round_jiffies_relative(sysctl_stat_interval));
> -
>  }
>  
>  static void __init start_shepherd_timer(void)
> 
> 
> Patches currently in stable-queue which might be from mhocko@...e.com are
> 
> queue-4.4/mm-mmu_notifier-use-hlist_add_head_rcu.patch
> queue-4.4/mm-vmstat-make-quiet_vmstat-lighter.patch
> queue-4.4/vmstat-remove-bug_on-from-vmstat_update.patch
> 
> .
> 

View attachment "reproduce.config" of type "text/plain" (68087 bytes)
