lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1398933888-4940-1-git-send-email-mgorman@suse.de>
Date:	Thu,  1 May 2014 09:44:31 +0100
From:	Mel Gorman <mgorman@...e.de>
To:	Linux-MM <linux-mm@...ck.org>,
	Linux-FSDevel <linux-fsdevel@...r.kernel.org>
Cc:	Johannes Weiner <hannes@...xchg.org>,
	Vlastimil Babka <vbabka@...e.cz>, Jan Kara <jack@...e.cz>,
	Michal Hocko <mhocko@...e.cz>, Hugh Dickins <hughd@...gle.com>,
	Mel Gorman <mgorman@...e.de>,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: [PATCH 00/17] Misc page alloc, shmem, mark_page_accessed and page_waitqueue optimisations

I was investigating a performance bug that looked like dd to tmpfs
had regressed.  The bulk of the problem turned out to be a difference
in Kconfig but it got me looking at the unnecessary overhead in tmpfs,
mark_page_accessed and parts of the allocator. This series is the result.

The patches themselves have details of the performance results but here
are some of the results based on ext4. This is the result of dd'ing to
a file multiple times on tmpfs

loopdd Throughput
                     3.15.0-rc3            3.15.0-rc3
                        vanilla        lockpage-v2r33
Min         4096.0000 (  0.00%)   3891.2000 ( -5.00%)
Mean        4840.1067 (  0.00%)   5154.1333 (  6.49%)
TrimMean    4867.6571 (  0.00%)   5204.1143 (  6.91%)
Stddev       160.6807 (  0.00%)    275.1917 ( 71.27%)
Max         5017.6000 (  0.00%)   5324.8000 (  6.12%)

loopdd elapsed time
                            3.15.0-rc3            3.15.0-rc3
                               vanilla        lockpage-v2r33
Min      elapsed      0.4100 (  0.00%)      0.3900 (  4.88%)
Mean     elapsed      0.4780 (  0.00%)      0.4203 ( 12.06%)
TrimMean elapsed      0.4796 (  0.00%)      0.4179 ( 12.88%)
Stddev   elapsed      0.0353 (  0.00%)      0.0379 ( -7.23%)
Max      elapsed      0.5100 (  0.00%)      0.4800 (  5.88%)

This table shows the latency in usecs of accessing ext4-backed
mappings of various sizes

lat_mmap
                       3.15.0-rc3            3.15.0-rc3
                          vanilla        lockpage-v2r33
Procs 107M     557.0000 (  0.00%)    544.0000 (  2.33%)
Procs 214M    1150.0000 (  0.00%)   1058.0000 (  8.00%)
Procs 322M    1897.0000 (  0.00%)   1554.0000 ( 18.08%)
Procs 429M    2188.0000 (  0.00%)   2652.0000 (-21.21%)
Procs 536M    2622.0000 (  0.00%)   2473.0000 (  5.68%)
Procs 644M    3065.0000 (  0.00%)   2486.0000 ( 18.89%)
Procs 751M    3400.0000 (  0.00%)   3012.0000 ( 11.41%)
Procs 859M    3996.0000 (  0.00%)   3926.0000 (  1.75%)
Procs 966M    4646.0000 (  0.00%)   3763.0000 ( 19.01%)
Procs 1073M   4981.0000 (  0.00%)   4154.0000 ( 16.60%)
Procs 1181M   5419.0000 (  0.00%)   5152.0000 (  4.93%)
Procs 1288M   5553.0000 (  0.00%)   5538.0000 (  0.27%)
Procs 1395M   5841.0000 (  0.00%)   5730.0000 (  1.90%)
Procs 1503M   6225.0000 (  0.00%)   5981.0000 (  3.92%)
Procs 1610M   6558.0000 (  0.00%)   6332.0000 (  3.45%)
Procs 1717M   7130.0000 (  0.00%)   6741.0000 (  5.46%)
Procs 1825M   9394.0000 (  0.00%)   8483.0000 (  9.70%)
Procs 1932M   8056.0000 (  0.00%)   9427.0000 (-17.02%)
Procs 2040M   8463.0000 (  0.00%)   9030.0000 ( -6.70%)
Procs 2147M   9014.0000 (  0.00%)   8608.0000 (  4.50%)

In general the system CPU overhead is lower.

 arch/tile/mm/homecache.c        |   2 +-
 fs/btrfs/extent_io.c            |  11 +-
 fs/btrfs/file.c                 |   5 +-
 fs/buffer.c                     |   7 +-
 fs/ext4/mballoc.c               |  14 +-
 fs/f2fs/checkpoint.c            |   3 -
 fs/f2fs/node.c                  |   2 -
 fs/fuse/dev.c                   |   2 +-
 fs/fuse/file.c                  |   2 -
 fs/gfs2/aops.c                  |   1 -
 fs/gfs2/meta_io.c               |   4 +-
 fs/ntfs/attrib.c                |   1 -
 fs/ntfs/file.c                  |   1 -
 include/linux/cpuset.h          |  31 +++++
 include/linux/gfp.h             |   4 +-
 include/linux/mmzone.h          |  22 ++-
 include/linux/page-flags.h      |  18 +++
 include/linux/pageblock-flags.h |  34 ++++-
 include/linux/pagemap.h         | 115 ++++++++++++++--
 include/linux/swap.h            |   9 +-
 kernel/cpuset.c                 |   8 +-
 kernel/sched/wait.c             |   3 +-
 mm/filemap.c                    | 292 ++++++++++++++++++++++------------------
 mm/page_alloc.c                 | 226 ++++++++++++++++++-------------
 mm/shmem.c                      |   8 +-
 mm/swap.c                       |  17 ++-
 mm/swap_state.c                 |   2 +-
 mm/vmscan.c                     |   6 +-
 28 files changed, 556 insertions(+), 294 deletions(-)

-- 
1.8.4.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ