Date:   Mon, 11 Nov 2019 15:13:46 +0109
From:   Marc Zyngier <maz@...nel.org>
To:     Zhenyu Ye <yezhenyu2@...wei.com>
Cc:     Will Deacon <will@...nel.org>, <catalin.marinas@....com>,
        <suzuki.poulose@....com>, <mark.rutland@....com>,
        <tangnianyao@...wei.com>, <xiexiangyou@...wei.com>,
        <linux-kernel@...r.kernel.org>, <arm@...nel.org>
Subject: Re: [RFC PATCH v2] arm64: cpufeatures: add support for tlbi range instructions

On 2019-11-11 14:56, Zhenyu Ye wrote:
> On 2019/11/11 21:27, Will Deacon wrote:
>> On Mon, Nov 11, 2019 at 09:23:55PM +0800, Zhenyu Ye wrote:
>>> ARMv8.4-TLBI provides TLB invalidation instructions that apply to a
>>> range of input addresses. This patch adds support for this feature.
>>> This is the second version of the patch.
>>>
>>> I traced __flush_tlb_range() for a minute and got the following
>>> statistics:
>>>
>>> 	PAGENUM		COUNT
>>> 	1		34944
>>> 	2		5683
>>> 	3		1343
>>> 	4		7857
>>> 	5		838
>>> 	9		339
>>> 	16		933
>>> 	19		427
>>> 	20		5821
>>> 	23		279
>>> 	41		338
>>> 	141		279
>>> 	512		428
>>> 	1668		120
>>> 	2038		100
>>>
>>> These data are based on kernel-5.4.0, where PAGENUM = end - start and
>>> COUNT is the number of calls to __flush_tlb_range() in a minute. Only
>>> entries with COUNT >= 100 are shown. The kernel was booted normally,
>>> with transparent hugepages enabled. As we can see, although most user
>>> TLBI ranges were 1 page long, the number of long ranges cannot be
>>> ignored.
>>>
>>> The new TLB range feature can improve performance considerably
>>> compared to the current implementation. For example, flushing a
>>> 512-page range needs only 1 instruction, as opposed to 512
>>> instructions with the current implementation.
>>>
>>> And for a new hardware feature, support is better than not.
>>>
>>> Signed-off-by: Zhenyu Ye <yezhenyu2@...wei.com>
>>> ---
>>> ChangeLog v1 -> v2:
>>> - Change the main implementation of this feature.
>>> - Add some comments.
>>
>> How does this address my concerns here:
>>
>> 
>> https://lore.kernel.org/linux-arm-kernel/20191031131649.GB27196@willie-the-truck/
>>
>> ?
>>
>> Will
>>
>> .
>>
>
> I think your concern is more about the hardware level, and we can do
> nothing about this at all. The interconnect/DVM implementation is not
> exposed to the software layer (nor does it need to be), and should
> probably be constrained at the hardware level.

You're missing the point here: the instruction may be implemented
and perfectly working at the CPU level, and yet not carried over
the interconnect. In this situation, other CPUs may not observe
the DVM messages instructing them of such invalidation, and you'll end
up with memory corruption.

So, in the absence of an architectural guarantee that range invalidation
is supported and observed by all the DVM agents in the system, there must
be a firmware description for it on which the kernel can rely.
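
The gating described above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not code from the patch
or from the kernel: the struct, its fields, and both function names are
hypothetical, and how a firmware description would actually be exposed
(DT, ACPI, or otherwise) is left open here. The 1-versus-N count simply
mirrors the idealised 512-page example in the patch description.

```c
#include <stdbool.h>

/*
 * Hypothetical capability record. Neither field corresponds to an
 * existing kernel symbol; they stand in for "the CPU implements the
 * range instructions" and "firmware states that every DVM agent in
 * the system observes range invalidations".
 */
struct tlbi_range_caps {
	bool cpu_has_range_insn;
	bool fw_says_systemwide;
};

/* Range invalidation is only safe to use when both conditions hold. */
static bool can_use_tlbi_range(const struct tlbi_range_caps *caps)
{
	return caps->cpu_has_range_insn && caps->fw_says_systemwide;
}

/*
 * Idealised count of invalidation instructions for a flush of
 * 'pages' pages: one range operation if allowed, else one per page.
 */
static unsigned long tlbi_ops_needed(const struct tlbi_range_caps *caps,
				     unsigned long pages)
{
	return can_use_tlbi_range(caps) ? 1 : pages;
}
```

With the CPU feature present but no firmware statement, the sketch falls
back to per-page invalidation, which is the conservative behaviour being
asked for.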

         M.
-- 
Jazz is not dead. It just smells funny...
