lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 01 Apr 2020 15:28:48 +0530
From:   "Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>
To:     "Gautham R. Shenoy" <ego@...ux.vnet.ibm.com>,
        Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>,
        Michael Ellerman <mpe@...erman.id.au>,
        Nathan Lynch <nathanl@...ux.ibm.com>,
        Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>,
        Tyrel Datwyler <tyreld@...ux.ibm.com>
Cc:     linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH v4 6/6] pseries/sysfs: Minimise IPI noise while reading
 [idle_][s]purr

Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego@...ux.vnet.ibm.com>
> 
> Currently purr, spurr, idle_purr, idle_spurr are exposed for every CPU
> via the sysfs interface
> /sys/devices/system/cpu/cpuX/[idle_][s]purr. Each sysfs read currently
> generates an IPI to obtain the desired value from the target CPU X.
> Since these aforementioned sysfs are typically read one after another,
> we end up generating 4 IPIs per CPU in a short duration.
> 
> In order to minimize the IPI noise, this patch caches the values of
> all the four entities whenever one of them is read. If subsequently
> any of these are read within the next 10ms, the cached value is
> returned. With this, we will generate at most one IPI every 10ms for
> every CPU.
> 
> Test-results: While reading the four sysfs files back-to-back for a
> given CPU every second for 100 seconds.
> 
> Without the patch:
> 		 16 [XICS 2 Edge IPI] = 422 times
> 		 DBL [Doorbell interrupts] = 13 times
> 		 Total : 435 IPIs.
> 
> With the patch:
> 		  16 [XICS 2 Edge IPI] = 111 times
> 		  DBL [Doorbell interrupts] = 17 times
> 		  Total : 128 IPIs.
> 
> Signed-off-by: Gautham R. Shenoy <ego@...ux.vnet.ibm.com>
> ---
>  arch/powerpc/kernel/sysfs.c | 117 ++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 97 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
> index 571b325..bd92023 100644
> --- a/arch/powerpc/kernel/sysfs.c
> +++ b/arch/powerpc/kernel/sysfs.c
> @@ -586,8 +586,6 @@ void ppc_enable_pmcs(void)
>   * SPRs which are not related to PMU.
>   */
>  #ifdef CONFIG_PPC64
> -SYSFS_SPRSETUP(purr, SPRN_PURR);
> -SYSFS_SPRSETUP(spurr, SPRN_SPURR);
>  SYSFS_SPRSETUP(pir, SPRN_PIR);
>  SYSFS_SPRSETUP(tscr, SPRN_TSCR);
> 
> @@ -596,8 +594,6 @@ void ppc_enable_pmcs(void)
>    enable write when needed with a separate function.
>    Lets be conservative and default to pseries.
>  */
> -static DEVICE_ATTR(spurr, 0400, show_spurr, NULL);
> -static DEVICE_ATTR(purr, 0400, show_purr, store_purr);
>  static DEVICE_ATTR(pir, 0400, show_pir, NULL);
>  static DEVICE_ATTR(tscr, 0600, show_tscr, store_tscr);
>  #endif /* CONFIG_PPC64 */
> @@ -761,22 +757,110 @@ static void create_svm_file(void)
>  }
>  #endif /* CONFIG_PPC_SVM */
> 
> +#ifdef CONFIG_PPC64
> +/*
> + * The duration (in ms) from the last IPI to the target CPU until
> + * which a cached value of purr, spurr, idle_purr, idle_spurr can be
> + * reported to the user on a corresponding sysfs file read. Beyond
> + * this duration, fresh values need to be obtained by sending IPIs to
> + * the target CPU when the sysfs files are read.
> + */
> +static unsigned long util_stats_staleness_tolerance_ms = 10;

This is a nice optimization for our use in lparstat, though I have a 
concern below.

> +struct util_acct_stats {
> +	u64 latest_purr;
> +	u64 latest_spurr;
> +#ifdef CONFIG_PPC_PSERIES
> +	u64 latest_idle_purr;
> +	u64 latest_idle_spurr;
> +#endif

You can probably drop the 'latest_' prefix.

> +	unsigned long last_update_jiffies;
> +};
> +
> +DEFINE_PER_CPU(struct util_acct_stats, util_acct_stats);

Per snowpatch, this should be static, and so should get_util_stats_ptr() 
below:
https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/16601//artifact/linux/report.txt

> +
> +static void update_util_acct_stats(void *ptr)
> +{
> +	struct util_acct_stats *stats = ptr;
> +
> +	stats->latest_purr = mfspr(SPRN_PURR);
> +	stats->latest_spurr = mfspr(SPRN_SPURR);
>  #ifdef CONFIG_PPC_PSERIES
> -static void read_idle_purr(void *val)
> +	stats->latest_idle_purr = read_this_idle_purr();
> +	stats->latest_idle_spurr = read_this_idle_spurr();
> +#endif
> +	stats->last_update_jiffies = jiffies;
> +}
> +
> +struct util_acct_stats *get_util_stats_ptr(int cpu)
> +{
> +	struct util_acct_stats *stats = per_cpu_ptr(&util_acct_stats, cpu);
> +	unsigned long delta_jiffies;
> +
> +	delta_jiffies = jiffies - stats->last_update_jiffies;
> +
> +	/*
> +	 * If we have a recent enough data, reuse that instead of
> +	 * sending an IPI.
> +	 */
> +	if (jiffies_to_msecs(delta_jiffies) < util_stats_staleness_tolerance_ms)
> +		return stats;
> +
> +	smp_call_function_single(cpu, update_util_acct_stats, stats, 1);
> +	return stats;
> +}
> +
> +static ssize_t show_purr(struct device *dev,
> +			 struct device_attribute *attr, char *buf)
>  {
> -	u64 *ret = val;
> +	struct cpu *cpu = container_of(dev, struct cpu, dev);
> +	struct util_acct_stats *stats;
> 
> -	*ret = read_this_idle_purr();
> +	stats = get_util_stats_ptr(cpu->dev.id);
> +	return sprintf(buf, "%llx\n", stats->latest_purr);

This alters the behavior of the current sysfs purr file. I am not sure 
if it is reasonable to return the same PURR value across a 10ms window.

I wonder if we should introduce a sysctl interface to control 
thresholding. It can default to 0, which disables thresholding so that 
the existing behavior continues. Applications (lparstat) can optionally 
set it to suit their use.

- Naveen

Powered by blists - more mailing lists