lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <cb97ac46-33f3-d06e-53de-00945549e19b@arm.com>
Date:   Tue, 19 Dec 2017 11:18:07 +0000
From:   Marc Zyngier <marc.zyngier@....com>
To:     Andrzej Hajda <a.hajda@...sung.com>,
        Archit Taneja <architt@...eaurora.org>,
        Laurent Pinchart <Laurent.pinchart@...asonboard.com>
Cc:     dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/bridge: analogix_dp: Use relaxed MMIO accessors

On 19/12/17 08:53, Andrzej Hajda wrote:
> On 18.12.2017 12:39, Marc Zyngier wrote:
>> The analogix DP bridge is entierely driven via MMIO accesses, and
>> does not do any DMA that requires coherency with the CPU. Yet, the
>> driver uses the non-relaxed accessors, forcing strong barriers to
>> be emitted on architectures with a relaxed memory ordering.
>>
>> This is of course completely unnecessary, and only serves as a way
>> to pointlessly reduce the performance of unsuspecting platforms.
>>
>> Switch the driver to the _relaxed accessors, making my kevin platform
>> a slightly better machine.
> 
> Do you have any stats to justify this change?

I do (sort of): "hackbench 100 process 1000" is a almost a second faster
with that change (that's about 2% on something that is essentially a
scheduler benchmark) on my rk3399 system. Yes, this is a silly
benchmark, but that's one you can easily run on about anything.

> The common practice is/was to use writel/readl accessors to access MMIO,
> even if it is suboptimal in many cases. Has something changed in these
> practices?

Having it as a default is acceptable, unless you actually understand the
implications of the relaxed accessors and find that they have an impact
on the system.

To give you an idea of the magnitude of the impact on an arm64 system,
each barrier does:
- force the completion of any memory access in the whole system
- force the completion of any in-flight TLB invalidation and cache
maintenance in the whole system

When I say "in the whole system", it means CPUs, GPU, and any other
piece of HW connected on the system. This is effectively a "wait until
everything that was in-flight is done".

This is a widespread issue that takes time to address, unfortunately.

Now, I've definitely applied brute force to this driver, not really
knowing all the intricacies of the device. I haven't seen anything
bizarre there (it is mostly a bunch of RMWs), but I'd appreciate someone
who knows this IP to review it.

> To be clear, I am not against this change. I am just curious.

There is a really good article that explains all (and quite a bit more)
a kernel developer needs to know about MMIO and relaxed accessors here:

https://lwn.net/Articles/698014/

Hope this helps,

	M.
-- 
Jazz is not dead. It just smells funny...

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ