lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 01 Apr 2020 15:28:48 +0530 From: "Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com> To: "Gautham R. Shenoy" <ego@...ux.vnet.ibm.com>, Kamalesh Babulal <kamalesh@...ux.vnet.ibm.com>, Michael Ellerman <mpe@...erman.id.au>, Nathan Lynch <nathanl@...ux.ibm.com>, Vaidyanathan Srinivasan <svaidy@...ux.vnet.ibm.com>, Tyrel Datwyler <tyreld@...ux.ibm.com> Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org Subject: Re: [PATCH v4 6/6] pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr Gautham R. Shenoy wrote: > From: "Gautham R. Shenoy" <ego@...ux.vnet.ibm.com> > > Currently purr, spurr, idle_purr, idle_spurr are exposed for every CPU > via the sysfs interface > /sys/devices/system/cpu/cpuX/[idle_][s]purr. Each sysfs read currently > generates an IPI to obtain the desired value from the target CPU X. > Since these aforementioned sysfs are typically read one after another, > we end up generating 4 IPIs per CPU in a short duration. > > In order to minimize the IPI noise, this patch caches the values of > all the four entities whenever one of them is read. If subsequently > any of these are read within the next 10ms, the cached value is > returned. With this, we will generate at most one IPI every 10ms for > every CPU. > > Test-results: While reading the four sysfs files back-to-back for a > given CPU every second for 100 seconds. > > Without the patch: > 16 [XICS 2 Edge IPI] = 422 times > DBL [Doorbell interrupts] = 13 times > Total : 435 IPIs. > > With the patch: > 16 [XICS 2 Edge IPI] = 111 times > DBL [Doorbell interrupts] = 17 times > Total : 128 IPIs. > > Signed-off-by: Gautham R. Shenoy <ego@...ux.vnet.ibm.com> > --- > arch/powerpc/kernel/sysfs.c | 117 ++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 97 insertions(+), 20 deletions(-) > > diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c > index 571b325..bd92023 100644 > --- a/arch/powerpc/kernel/sysfs.c > +++ b/arch/powerpc/kernel/sysfs.c > @@ -586,8 +586,6 @@ void ppc_enable_pmcs(void) > * SPRs which are not related to PMU. > */ > #ifdef CONFIG_PPC64 > -SYSFS_SPRSETUP(purr, SPRN_PURR); > -SYSFS_SPRSETUP(spurr, SPRN_SPURR); > SYSFS_SPRSETUP(pir, SPRN_PIR); > SYSFS_SPRSETUP(tscr, SPRN_TSCR); > > @@ -596,8 +594,6 @@ void ppc_enable_pmcs(void) > enable write when needed with a separate function. > Lets be conservative and default to pseries. > */ > -static DEVICE_ATTR(spurr, 0400, show_spurr, NULL); > -static DEVICE_ATTR(purr, 0400, show_purr, store_purr); > static DEVICE_ATTR(pir, 0400, show_pir, NULL); > static DEVICE_ATTR(tscr, 0600, show_tscr, store_tscr); > #endif /* CONFIG_PPC64 */ > @@ -761,22 +757,110 @@ static void create_svm_file(void) > } > #endif /* CONFIG_PPC_SVM */ > > +#ifdef CONFIG_PPC64 > +/* > + * The duration (in ms) from the last IPI to the target CPU until > + * which a cached value of purr, spurr, idle_purr, idle_spurr can be > + * reported to the user on a corresponding sysfs file read. Beyond > + * this duration, fresh values need to be obtained by sending IPIs to > + * the target CPU when the sysfs files are read. > + */ > +static unsigned long util_stats_staleness_tolerance_ms = 10; This is a nice optimization for our use in lparstat, though I have a concern below. > +struct util_acct_stats { > + u64 latest_purr; > + u64 latest_spurr; > +#ifdef CONFIG_PPC_PSERIES > + u64 latest_idle_purr; > + u64 latest_idle_spurr; > +#endif You can probably drop the 'latest_' prefix. > + unsigned long last_update_jiffies; > +}; > + > +DEFINE_PER_CPU(struct util_acct_stats, util_acct_stats); Per snowpatch, this should be static, and so should get_util_stats_ptr() below: https://openpower.xyz/job/snowpatch/job/snowpatch-linux-sparse/16601//artifact/linux/report.txt > + > +static void update_util_acct_stats(void *ptr) > +{ > + struct util_acct_stats *stats = ptr; > + > + stats->latest_purr = mfspr(SPRN_PURR); > + stats->latest_spurr = mfspr(SPRN_SPURR); > #ifdef CONFIG_PPC_PSERIES > -static void read_idle_purr(void *val) > + stats->latest_idle_purr = read_this_idle_purr(); > + stats->latest_idle_spurr = read_this_idle_spurr(); > +#endif > + stats->last_update_jiffies = jiffies; > +} > + > +struct util_acct_stats *get_util_stats_ptr(int cpu) > +{ > + struct util_acct_stats *stats = per_cpu_ptr(&util_acct_stats, cpu); > + unsigned long delta_jiffies; > + > + delta_jiffies = jiffies - stats->last_update_jiffies; > + > + /* > + * If we have a recent enough data, reuse that instead of > + * sending an IPI. > + */ > + if (jiffies_to_msecs(delta_jiffies) < util_stats_staleness_tolerance_ms) > + return stats; > + > + smp_call_function_single(cpu, update_util_acct_stats, stats, 1); > + return stats; > +} > + > +static ssize_t show_purr(struct device *dev, > + struct device_attribute *attr, char *buf) > { > - u64 *ret = val; > + struct cpu *cpu = container_of(dev, struct cpu, dev); > + struct util_acct_stats *stats; > > - *ret = read_this_idle_purr(); > + stats = get_util_stats_ptr(cpu->dev.id); > + return sprintf(buf, "%llx\n", stats->latest_purr); This alters the behavior of the current sysfs purr file. I am not sure if it is reasonable to return the same PURR value across a 10ms window. I wonder if we should introduce a sysctl interface to control thresholding. It can default to 0, which disables thresholding so that the existing behavior continues. Applications (lparstat) can optionally set it to suit their use. - Naveen
Powered by blists - more mailing lists