linux-kernel - Re: [v3 0/3] Reduce TLB flushes under some specific conditions

Message-ID: <08c82a91-87d1-42c7-93c4-4028f3725340@intel.com>
Date:   Mon, 30 Oct 2023 10:55:07 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Byungchul Park <byungchul@...com>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Cc:     kernel_team@...ynix.com, akpm@...ux-foundation.org,
        ying.huang@...el.com, namit@...are.com, xhao@...ux.alibaba.com,
        mgorman@...hsingularity.net, hughd@...gle.com, willy@...radead.org,
        david@...hat.com, peterz@...radead.org, luto@...nel.org,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
        dave.hansen@...ux.intel.com
Subject: Re: [v3 0/3] Reduce TLB flushes under some specific conditions

On 10/30/23 00:25, Byungchul Park wrote:
> I'm suggesting a mechanism to reduce TLB flushes by keeping source and
> destination of folios participated in the migrations until all TLB
> flushes required are done, only if those folios are not mapped with
> write permission PTE entries at all. I worked based on v6.6-rc5.

There's a lot of common overhead here, on top of the complexity in general:

 * A new page flag
 * A new cpumask_t in task_struct
 * A new zone list
 * Extra (temporary) memory consumption

and the benefits are ... "performance improved a little bit" on one
workload.  That doesn't seem like a good overall tradeoff to me.

There will certainly be workloads that, before this patch, would have
little or no memory pressure and after this patch would need to do reclaim.

Also, looking with my arch/x86 hat on, there's really nothing
arch-specific here.  Please try to keep stuff out of arch/x86 unless
it's very much arch-specific.

The connection between the arch-generic TLB flushing and
__flush_tlb_local() seems quite tenuous.  __flush_tlb_local() is, to me,
quite deep in the implementation and there are quite a few ways that a
random TLB flush might not end up there.  In other words, I'm not saying
that this is broken, but it's not clear at all to me how it functions
reliably.