Date:	Mon, 21 May 2012 17:21:59 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Vlad Zolotarov <vlad@...lemp.com>
Cc:	"Shai Fultheim (Shai@...leMP.com)" <Shai@...lemp.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Ido Yariv <ido@...ery.com>
Subject: Re: [PATCH v3 0/2] Move x86_cpu_to_apicid to the __read_mostly
 section


* Vlad Zolotarov <vlad@...lemp.com> wrote:

> On Monday, May 21, 2012 16:08:22 Ingo Molnar wrote:
> > * Vlad Zolotarov <vlad@...lemp.com> wrote:
> > > On Monday, May 21, 2012 02:32:46 PM Ingo Molnar wrote:
> > > > * Shai Fultheim (Shai@...leMP.com) <Shai@...leMP.com> wrote:
> > > > > Ingo,
> > > > > 
> > > > > The reason for this, as you pointed out, is the 'cache line'
> > > > > size (4096 bytes).  We see significant false sharing if we do
> > > > > not move these next to each other.
> > > > 
> > > > Which write-often variable caused the many cache flushes/fills?
> > > > cpu_to_apicid is read mostly.
> > > > 
> > > > I.e. it might make more sense to identify the frequently
> > > > *modified* percpu variables, and move them to a separate
> > > > section. I *think* most percpu variables are read mostly, so
> > > > it would be more maintainable in the long run to figure out
> > > > the frequently modified ones, not the frequently not
> > > > modified ones.
> > > 
> > > I tend to disagree with the general claim that most per-CPU
> > > variables are read-mostly: consider the per-CPU data
> > > structures used in lock-less algorithms, like softnet_data used
> > > in NAPI. I'm not sure which is more common, read-only or
> > > not-read-only per-CPU data, but surely there are both...
> > 
> > Well, a quick tally of percpu variables on a 'make defconfig'
> > kernel would tell us one way or another?
> > 
> > Here there are almost 200 percpu variables active in the 64-bit
> > x86 defconfig, and a quick random sample suggests that most are
> > read-mostly.
> > 
> > I have no fundamental preference for either approach, but the
> > direction taken should be justified explicitly, with numbers,
> > arguments, etc. - also a short blurb somewhere in the headers
> > that explains when they should be used, so that others can be
> > aware of vSMP's special needs here.
> 
> There must be some misunderstanding - this patch is not vSMP
> Foundation specific, as it marks read-mostly variables as
> __read_mostly. The motivation for it is just the same as in the
> non-vSMP Foundation case. It's true that the performance gain
> this patch introduces on vSMP Foundation is likely to be
> more significant than on native Linux; however, even for
> native Linux it would still be better code, as __read_mostly
> is not a vSMP Foundation specific paradigm and, again, the
> variables modified are a clear read-mostly case.

(Could we please use 'vSMP' as a shortcut?)

I know that it's not vSMP specific - but the gains are largely 
concentrated on the vSMP side and in fact I suspect that they 
are important performance fixes for vSMP, while only 'nice to 
have' micro-optimizations on other systems, right?

As such it's useful to outline the justification and relevance 
of the patch.

> So, the explanation you request above would be just the same as
> if I were to explain when __read_mostly should be used in
> general.
>
> I grep'ed the Documentation and haven't found any readme file
> with explicit instructions on when the __read_mostly qualifier
> should be used, and you are right, we'd better write one.

Furthermore, this is a read_mostly per cpu variable, which is 
even less obvious than a read_mostly global variable.
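
For illustration, here is a minimal user-space sketch of why keeping
read-mostly data apart from frequently written data matters. It is not
the kernel's mechanism: the kernel groups such variables by linker
section (__read_mostly), whereas this sketch uses C11 _Alignas inside a
struct to get a similar separation. The struct names, the field names,
the helper functions and the 64-byte LINE_SIZE are all assumptions made
for the example (the thread mentions an effective 4096-byte line on
vSMP).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size for this sketch; hypothetical value. */
#define LINE_SIZE 64

/* Mixed layout: a read-mostly ID sits right next to a hot counter. */
struct mixed {
    uint16_t apicid;   /* read-mostly: set once, then only read */
    uint64_t packets;  /* written often */
};

/* Separated layout: the hot counter starts on its own line. */
struct separated {
    uint16_t apicid;                      /* read-mostly */
    _Alignas(LINE_SIZE) uint64_t packets; /* written often */
};

/* Compile-time check that the hot field really begins a new line. */
static_assert(offsetof(struct separated, packets) % LINE_SIZE == 0,
              "hot field must start on a line boundary");

/* True if two byte offsets fall into the same LINE_SIZE-sized line,
 * assuming the enclosing object itself starts on a line boundary. */
static int same_line(size_t a, size_t b)
{
    return a / LINE_SIZE == b / LINE_SIZE;
}

int mixed_shares_line(void)
{
    return same_line(offsetof(struct mixed, apicid),
                     offsetof(struct mixed, packets));
}

int separated_shares_line(void)
{
    return same_line(offsetof(struct separated, apicid),
                     offsetof(struct separated, packets));
}
```

With these definitions mixed_shares_line() returns 1 and
separated_shares_line() returns 0: in the mixed layout every write to
the counter also invalidates the line that readers of apicid depend on,
while in the separated layout it does not. The larger the effective
line size, the more unrelated variables end up sharing a line, which is
why the effect would be expected to be stronger on vSMP.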

> I can create an initial version of such a doc but I think it 
> would better come as a separate patch.

Sure.

Thanks,

	Ingo
