lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Wed, 24 Jan 2024 22:28:54 +0800
From: Baokun Li <libaokun1@...wei.com>
To: <linux-fsdevel@...r.kernel.org>
CC: <torvalds@...ux-foundation.org>, <viro@...iv.linux.org.uk>,
	<brauner@...nel.org>, <jack@...e.cz>, <willy@...radead.org>,
	<akpm@...ux-foundation.org>, <arnd@...db.de>, <linux-arch@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <yi.zhang@...wei.com>,
	<yangerkun@...wei.com>, <yukuai3@...wei.com>, <libaokun1@...wei.com>
Subject: [PATCH v2 0/3] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()

V1->V2:
  Add patch 3 to fix an error when compiling code for 32-bit architectures
  without CONFIG_SMP enabled.

This patchset follows the Linus suggestion to make the i_size_read/write
helpers be smp_load_acquire/store_release(), after which the extra smp_rmb
in filemap_read() is no longer needed, so it is removed. And remove the
extra type checking in smp_load_acquire/smp_store_release under the
!CONFIG_SMP case to avoid compilation errors.

Functional tests were performed and no new problems were found.

Here are the results of unixbench tests based on 6.7.0-next-20240118 on
arm64, with some degradation in single-threading and some optimization in
multi-threading, but overall the impact is not significant.

### 72 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Values        |   base  | patched |  cmp   |
--------------------------------------|---------|---------|--------|
Dhrystone 2 using register variables  | 3635.06 | 3596.3  | -1.07% |
Double-Precision Whetstone            | 808.58  | 808.58  | 0.00%  |
Execl Throughput                      | 623.52  | 618.1   | -0.87% |
File Copy 1024 bufsize 2000 maxblocks | 1715.82 | 1668.58 | -2.75% |
File Copy 256 bufsize 500 maxblocks   | 1320.98 | 1250.16 | -5.36% |
File Copy 4096 bufsize 8000 maxblocks | 2639.36 | 2488.48 | -5.72% |
Pipe Throughput                       | 869.06  | 872.3   | 0.37%  |
Pipe-based Context Switching          | 106.26  | 117.22  | 10.31% |
Process Creation                      | 247.72  | 246.74  | -0.40% |
Shell Scripts (1 concurrent)          | 1234.98 | 1226    | -0.73% |
Shell Scripts (8 concurrent)          | 6893.96 | 6210.46 | -9.91% |
System Call Overhead                  | 493.72  | 494.28  | 0.11%  |
--------------------------------------|---------|---------|--------|
Total                                 | 1003.92 | 989.58  | -1.43% |

### 72 CPUs in system; running 72 parallel copy of tests
System Benchmarks Index Values        |   base    |  patched  |  cmp   |
--------------------------------------|-----------|-----------|--------|
Dhrystone 2 using register variables  | 260471.88 | 258065.04 | -0.92% |
Double-Precision Whetstone            | 58212.32  | 58219.3   | 0.01%  |
Execl Throughput                      | 6954.7    | 7444.08   | 7.04%  |
File Copy 1024 bufsize 2000 maxblocks | 64244.74  | 64618.24  | 0.58%  |
File Copy 256 bufsize 500 maxblocks   | 89933.8   | 87026.38  | -3.23% |
File Copy 4096 bufsize 8000 maxblocks | 79808.14  | 81916.42  | 2.64%  |
Pipe Throughput                       | 62174.38  | 62389.74  | 0.35%  |
Pipe-based Context Switching          | 27239.28  | 27887.24  | 2.38%  |
Process Creation                      | 3551.28   | 3800.54   | 7.02%  |
Shell Scripts (1 concurrent)          | 19212.26  | 20749.34  | 8.00%  |
Shell Scripts (8 concurrent)          | 20842.02  | 21958.12  | 5.36%  |
System Call Overhead                  | 35328.24  | 35451.68  | 0.35%  |
--------------------------------------|-----------|-----------|--------|
Total                                 | 35592.42  | 36450.36  | 2.41%  |

Baokun Li (3):
  fs: make the i_size_read/write helpers be
    smp_load_acquire/store_release()
  Revert "mm/filemap: avoid buffered read/write race to read
    inconsistent data"
  asm-generic: remove extra type checking in acquire/release for non-SMP
    case

 include/asm-generic/barrier.h |  2 --
 include/linux/fs.h            | 10 ++++++++--
 mm/filemap.c                  |  9 ---------
 3 files changed, 8 insertions(+), 13 deletions(-)

-- 
2.31.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ