Message-ID: <20231116154406.GDZVY4xmFvRQt0wGGE@fat_crate.local>
Date: Thu, 16 Nov 2023 16:44:06 +0100
From: Borislav Petkov <bp@...en8.de>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: David Howells <dhowells@...hat.com>,
kernel test robot <oliver.sang@...el.com>,
oe-lkp@...ts.linux.dev, lkp@...el.com,
linux-kernel@...r.kernel.org,
Christian Brauner <brauner@...nel.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
Christian Brauner <christian@...uner.io>,
Matthew Wilcox <willy@...radead.org>,
David Laight <David.Laight@...lab.com>, ying.huang@...el.com,
feng.tang@...el.com, fengwei.yin@...el.com
Subject: Re: [linus:master] [iov_iter] c9eec08bac: vm-scalability.throughput -16.9% regression
On Wed, Nov 15, 2023 at 02:26:02PM -0500, Linus Torvalds wrote:
> So the real issue is that we don't want an inlined memcpy at all,
> unless it's the simple constant-sized case that has been turned into
> individual moves with no loop.
>
> Or it's a "rep movsb" with FSRM as a CPUID-based alternative, of course.
Reportedly and apparently, this pretty much addresses the issue at hand.
However, I'd still like for the compiler to handle the small length
cases by issuing plain MOVs instead of blindly doing "call memcpy".
Lemme see how it would work with your patch...
diff --git a/Makefile b/Makefile
index ede0bd241056..94d93070d54a 100644
--- a/Makefile
+++ b/Makefile
@@ -996,6 +996,8 @@ endif
# change __FILE__ to the relative path from the srctree
KBUILD_CPPFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)
+KBUILD_CFLAGS += $(call cc-option,-mstringop-strategy=libcall)
+
# include additional Makefiles when needed
include-y := scripts/Makefile.extrawarn
include-$(CONFIG_DEBUG_INFO) += scripts/Makefile.debug
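
For reference, here is a minimal sketch of the two cases discussed above. It is an illustration only: struct hdr, copy_hdr() and copy_buf() are made-up names and not kernel code. The fixed-size copy is the kind a compiler can turn into a couple of plain MOVs, while the variable-length copy is where an out-of-line "call memcpy" is wanted instead of an inlined loop. What actually gets emitted depends on the compiler version and flags, so it is worth checking the generated assembly rather than trusting the comments.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte record, for illustration only. */
struct hdr {
	uint64_t a;
	uint64_t b;
};

/*
 * Small, compile-time-constant length: the compiler is free to expand
 * this into individual moves with no loop and no call.
 */
void copy_hdr(struct hdr *dst, const struct hdr *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/*
 * Length known only at run time: this is the case where a call to the
 * out-of-line memcpy is preferred over an inlined copy loop.
 */
void copy_buf(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}
```

Something like "gcc -O2 -S" on this file, with and without -mstringop-strategy=libcall, should show whether copy_hdr() stays as plain moves while copy_buf() becomes a call.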
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette