linux-kernel - Re: [PATCH v6] mm/filemap: remove hugetlb special casing in filemap.c

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZN+z3Q3hAn/GG+d3@casper.infradead.org>
Date:   Fri, 18 Aug 2023 19:09:33 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Sidhartha Kumar <sidhartha.kumar@...cle.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        songmuchun@...edance.com, mike.kravetz@...cle.com
Subject: Re: [PATCH v6] mm/filemap: remove hugetlb special casing in filemap.c

On Fri, Aug 18, 2023 at 11:03:09AM -0700, Andrew Morton wrote:
> On Thu, 17 Aug 2023 11:18:36 -0700 Sidhartha Kumar <sidhartha.kumar@...cle.com> wrote:
> 
> > Perf was used to check the performance differences after the patch. Overall
> > the performance is similar to mainline with a very small larger overhead that
> > occurs in __filemap_add_folio() and hugetlb_add_to_page_cache(). This is because
> > of the larger overhead that occurs in xa_load() and xa_store() as the
> > xarray is now using more entries to store hugetlb folios in the page cache.
> 
> So... why merge the patch?  To save 40 lines of code?
> 
> I mean, if a patch which added 40 lines yielded a "very small"
> reduction in overhead, we'd probably merge it!
> 
> Or is there some wider reason for this which the changelog omitted?

Sidhartha's benchmarks are for hugetlbfs which is where we're likely
to see performance regressions.  What's not shown are any performance
numbers for !hugetlbfs.  The functions where this patch removes code
are used on _every_ page cache lookup, so given the typical difference
in number of lookups performed against hugetlb vs non-hugetlb files,
even saving 0.1% performance on non-hugetlb files will lead to fewer
instructions executed overall.

There's also a conceptual reduction in complexity.  We no longer need to
think about whether the inode is hugetlb or not before doing the lookup
and scaling the byte offset differently.