lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 26 Nov 2015 02:19:48 +0000
From:	Måns Rullgård <mans@...sr.com>
To:	Russell King - ARM Linux <linux@....linux.org.uk>
Cc:	Nicolas Pitre <nico@...xnic.net>,
	Stephen Boyd <sboyd@...eaurora.org>,
	linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
	linux-arm-msm@...r.kernel.org, Michal Marek <mmarek@...e.com>,
	linux-kbuild@...r.kernel.org, Arnd Bergmann <arnd@...db.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>
Subject: Re: [PATCH v2 2/2] ARM: Replace calls to __aeabi_{u}idiv with udiv/sdiv instructions

Russell King - ARM Linux <linux@....linux.org.uk> writes:

> On Thu, Nov 26, 2015 at 12:50:08AM +0000, Måns Rullgård wrote:
>> If not calling the function saves an I-cache miss, the benefit can be
>> substantial.  No, I have no proof of this being a problem, but it's
>> something that could happen.
>
> That's a simplistic view of modern CPUs.
>
> As I've already said, modern CPUs which have branch prediction, but
> they also have speculative instruction fetching and speculative data
> prefetching - which the CPUs which have idiv support will have.
>
> With such features, the branch predictor is able to learn that the
> branch will be taken, and because of the speculative instruction
> fetching, it can bring the cache line in so that it has the
> instructions it needs with minimal or, if working correctly,
> without stalling the CPU pipeline.

It doesn't matter how many fancy features the CPU has.  Executing more
branches and using more cache lines puts additional pressure on those
resources, reducing overall performance.  Besides, the performance
counters readily show that the prediction is nothing near as perfect as
you seem to believe.

-- 
Måns Rullgård
mans@...sr.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists