Date:   Tue, 20 Jul 2021 18:17:26 +0300
From:   Mike Rapoport <rppt@...nel.org>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH v14 000/138] Memory folios

On Tue, Jul 20, 2021 at 01:41:15PM +0100, Matthew Wilcox wrote:
> On Tue, Jul 20, 2021 at 01:54:38PM +0300, Mike Rapoport wrote:
> > Most of the changelogs (at least at the first patches) mention reduction of
> > the kernel size for your configuration on x86. I wonder, what happens if
> > you build the kernel with "non-distro" configuration, e.g. defconfig or
> > tiny.config?
> 
> I did an allnoconfig build and that reduced in size by ~2KiB.
> 
> > Also, what is the difference on !x86 builds?
> 
> I don't generally do non-x86 builds ... feel free to compare for
> yourself!

I did allnoconfig and defconfig for arm64 and powerpc.

All except arm64::defconfig show a decrease of ~1KiB, while arm64::defconfig
actually increased by ~500 bytes.

I didn't dig into objdumps yet.

I also tried to build for arm, but it failed with:

  CC      fs/remap_range.o
fs/remap_range.c: In function 'vfs_dedupe_file_range_compare':
fs/remap_range.c:250:3: error: implicit declaration of function 'flush_dcache_folio'; did you mean 'flush_cache_louis'? [-Werror=implicit-function-declaration]
  250 |   flush_dcache_folio(src_folio);
      |   ^~~~~~~~~~~~~~~~~~
      |   flush_cache_louis
cc1: some warnings being treated as errors


> I imagine it'll be 2-4 instructions per call to
> compound_head().  ie something like:
> 
> 	load page into reg S
> 	load reg S + 8 into reg T
> 	test bottom bit of reg T
> 	cond-move reg T - 1 to reg S
> becomes
> 	load folio into reg S
> 
> the exact spelling of those instructions will vary from architecture to
> architecture; some will take more instructions than others.  Possibly it
> means we end up using one fewer register and so reducing the number of
> registers spilled to the stack.  Probably not, though.
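
For reference, the lookup described above can be modelled in userspace
roughly as below. This is only a sketch, not the kernel code: the two-field
struct page, the struct folio wrapper and demo() are simplified stand-ins
made up for illustration, and the "+ 8" offset assumes an LP64 target.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct page (assumption). */
struct page {
	uintptr_t flags;
	uintptr_t compound_head;	/* bit 0 set: tail page, rest points to head */
};

/* Simplified stand-in for a folio: it always starts at a head page. */
struct folio {
	struct page page;
};

/* The load / test-bottom-bit / conditional-move sequence from above. */
static struct page *compound_head(struct page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct page *)(head - 1);
	return page;
}

/* Resolve the head once; the result needs no further checks. */
static struct folio *page_folio(struct page *page)
{
	return (struct folio *)compound_head(page);
}

/* With a folio in hand, getting the head page is plain address arithmetic. */
static struct page *folio_page(struct folio *folio)
{
	return &folio->page;
}

/* Small self-check: one head page inside a folio plus one tail page. */
static int demo(void)
{
	struct folio f = { { 0, 0 } };
	struct page tail = { 0, 0 };

	tail.compound_head = (uintptr_t)&f.page | 1;

	assert(compound_head(&f.page) == &f.page);
	assert(compound_head(&tail) == &f.page);
	assert(folio_page(page_folio(&tail)) == &f.page);
	return 0;
}
```

A caller that starts from a struct page pays for the bit test on every
compound_head() call, while a caller that is handed a folio only
dereferences the pointer, which is where the per-call savings would come
from.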

-- 
Sincerely yours,
Mike.
