Date:	Mon, 25 Jul 2016 17:31:45 +0100
From:	Will Deacon <will.deacon@....com>
To:	Fu Wei <fu.wei@...aro.org>
Cc:	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Len Brown <lenb@...nel.org>,
	Daniel Lezcano <daniel.lezcano@...aro.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Marc Zyngier <marc.zyngier@....com>,
	Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
	Sudeep Holla <sudeep.holla@....com>,
	Hanjun Guo <hanjun.guo@...aro.org>,
	linux-arm-kernel@...ts.infradead.org,
	Linaro ACPI Mailman List <linaro-acpi@...ts.linaro.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	ACPI Devel Maling List <linux-acpi@...r.kernel.org>,
	rruigrok@...eaurora.org, harba@...eaurora.org,
	Christopher Covington <cov@...eaurora.org>,
	Timur Tabi <timur@...eaurora.org>,
	G Gregory <graeme.gregory@...aro.org>,
	Al Stone <al.stone@...aro.org>, Jon Masters <jcm@...hat.com>,
	wei@...hat.com, Arnd Bergmann <arnd@...db.de>,
	Wim Van Sebroeck <wim@...ana.be>,
	Catalin Marinas <catalin.marinas@....com>,
	Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>,
	Leo Duran <leo.duran@....com>,
	Guenter Roeck <linux@...ck-us.net>,
	linux-watchdog@...r.kernel.org
Subject: Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to
 get 64-bit CNTVCT

On Mon, Jul 25, 2016 at 11:55:49PM +0800, Fu Wei wrote:
> On 25 July 2016 at 23:31, Will Deacon <will.deacon@....com> wrote:
> > On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu.wei@...aro.org wrote:
> >> From: Fu Wei <fu.wei@...aro.org>
> >>
> >> This patch simplifies the arch_counter_get_cntvct_mem function by
> >> using readq to get the 64-bit CNTVCT value instead of readl_relaxed.
> >>
> >> Signed-off-by: Fu Wei <fu.wei@...aro.org>
> >> ---
> >>  drivers/clocksource/arm_arch_timer.c | 10 +---------
> >>  1 file changed, 1 insertion(+), 9 deletions(-)
> >>
> >> diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> >> index e6fd42d..483d2f9 100644
> >> --- a/drivers/clocksource/arm_arch_timer.c
> >> +++ b/drivers/clocksource/arm_arch_timer.c
> >> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
> >>
> >>  static u64 arch_counter_get_cntvct_mem(void)
> >>  {
> >> -     u32 vct_lo, vct_hi, tmp_hi;
> >> -
> >> -     do {
> >> -             vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> >> -             vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
> >> -             tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> >> -     } while (vct_hi != tmp_hi);
> >> -
> >> -     return ((u64) vct_hi << 32) | vct_lo;
> >> +     return readq(arch_counter_base + CNTVCT_LO);
> >
> > Please drop this patch. It doesn't work.
> 
> I am OK to drop this, but could you let me know why it doesn't work?
> 
> I did hit a problem with readq on the Foundation model, but it works on Seattle.
> I guess that is a problem with the model rather than with the code.
> So I am just confused about why readq doesn't work. :-)

The kernel really needs to support both of those platforms :/

For the memory-mapped counter registers, the architecture says:

  `If the implementation supports 64-bit atomic accesses, then the
   CNTV_CVAL register must be accessible as an atomic 64-bit value.'

which is borderline tautological. If we take the generous reading that
this means AArch64 CPUs can use readq (and I'm not completely
comfortable with that assertion, particularly as you say that it breaks
the model), then you still need to use readq_relaxed here to avoid a
DSB. Furthermore, what are you going to do for AArch32? readq doesn't
exist over there, and if you use the generic implementation then it's
not atomic. In which case, we end up with the current code, as well as a
readq_relaxed guarded by a questionable #ifdef that is known to break a
supported platform for an unknown performance improvement. Hardly a big
win.

Did you see any performance advantage from this? Given that you've added
a DSB, this looks to be extremely premature.

Will
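
Editorial sketch (not part of the original message): the hi/lo/hi retry
loop that the patch removes can be tried out in user space. The block
below is a minimal illustration under stated assumptions, not kernel
code. `struct fake_counter`, `fake_read_lo`, `fake_read_hi`,
`read_two_access` and `read_hi_lo_hi` are hypothetical names; the fake
counter advances on every 32-bit access to model time passing between
reads, and nothing here models memory barriers or real MMIO.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped 64-bit counter exposed as two
 * 32-bit registers. Each access advances the counter by `step`, modelling
 * the counter ticking between two separate bus reads. */
struct fake_counter {
	uint64_t value;
	uint64_t step;
};

/* Return the low 32 bits, then let the counter tick. */
static uint32_t fake_read_lo(struct fake_counter *c)
{
	uint32_t v = (uint32_t)c->value;

	c->value += c->step;
	return v;
}

/* Return the high 32 bits, then let the counter tick. */
static uint32_t fake_read_hi(struct fake_counter *c)
{
	uint32_t v = (uint32_t)(c->value >> 32);

	c->value += c->step;
	return v;
}

/* A 64-bit read split into two 32-bit accesses with no retry. If the low
 * half wraps between the accesses, the halves belong to different counter
 * values and the result is torn. */
static uint64_t read_two_access(struct fake_counter *c)
{
	uint32_t lo = fake_read_lo(c);
	uint32_t hi = fake_read_hi(c);

	return ((uint64_t)hi << 32) | lo;
}

/* Same shape as the loop in the quoted diff: read high, low, high again,
 * and retry whenever the high half changed underneath us. */
static uint64_t read_hi_lo_hi(struct fake_counter *c)
{
	uint32_t hi, lo, tmp_hi;

	do {
		hi = fake_read_hi(c);
		lo = fake_read_lo(c);
		tmp_hi = fake_read_hi(c);
	} while (hi != tmp_hi);

	return ((uint64_t)hi << 32) | lo;
}
```

Starting the fake counter at 0xFFFFFFFF with a step of 1 puts the carry
into the high word right between the accesses: the two-access read
returns a value roughly 2^32 ticks in the future, while the retry loop
notices the changed high word and returns a consistent value.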
