lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1360313046-9876-1-git-send-email-wenqing.lz@taobao.com>
Date:	Fri,  8 Feb 2013 16:43:56 +0800
From:	Zheng Liu <gnehzuil.liu@...il.com>
To:	linux-ext4@...r.kernel.org
Cc:	Zheng Liu <wenqing.lz@...bao.com>, "Theodore Ts'o" <tytso@....edu>,
	Jan kara <jack@...e.cz>
Subject: [PATCH 00/10 v5] ext4: extent status tree (step2)

Hi all,

This is my fifth try to implement the second step of extent status tree.
The patch set can be divided into the following parts.

Patch 1/10
  This patch refines the extent status tree

Patch 2/10-6/10
  These patches try to track all extent status in extent status tree and
make it as a extent cache.  In extent_status structure bit field is removed
because we get some warnings from 'sparse'.  Now es_pblk and es_status are
manipulated by ext4_es_*_pblock and ext4_es_*_status directly.  Currently
when an unwritten extent is allocated, we never know it from map->m_flags
because ext4_ext_map_blocks doesn't return EXT4_MAP_UNWRITTEN flag.  A
patch fixes it and we can determine the extent status according to m_flags.
  According to Jan's feedback, we put the hole into extent cache to avoid
to access extent tree in disk as far as possible.  Here if the whole file
is a hole, this hole will not be cached in extent status tree because it
is always splitted immediately.  Meanwhile the hole will not be cached
when ext4_da_map_blocks looks up a block mapping because this hole will be
as a delayed extent later.

Patch 7/10-8/10
  This two patches try to reclaim memory from extent status tree when we
are under a high memeory pressure.

Patch 9/10-10/10
  Thses patches are picked up again from 1st version because I aware that
they could remove a bogus wait in ext4_ind_direct_IO when dioread_nolock
is enabled.  After applied them, the latency of dio read can be reduced.

I measure it using fio and the result shows as below.

config file
-----------
[global]
ioengine=psync
direct=1
bs=4k
thread
group_reporting
directory=/mnt/sda1/
filename=testfile
filesize=10g
size=10g
runtime=120
iodepth=16

[fio]
rw=randrw
numjobs=4

result
------
w/ bogus wait
  read : io=1508.1MB, bw=12876KB/s, iops=3218 , runt=120001msec
    clat (usec): min=128 , max=268738 , avg=718.62, stdev=3703.97
     lat (usec): min=128 , max=268739 , avg=718.78, stdev=3703.97
  write: io=1505.2MB, bw=12843KB/s, iops=3210 , runt=120001msec
    clat (usec): min=47 , max=991727 , avg=520.94, stdev=3451.63
     lat (usec): min=47 , max=991727 , avg=521.31, stdev=3451.63

w/o bogus wait
  read : io=1576.4MB, bw=13451KB/s, iops=3362 , runt=120001msec
    clat (usec): min=128 , max=283906 , avg=685.88, stdev=2762.64
     lat (usec): min=128 , max=283907 , avg=686.05, stdev=2762.64
  write: io=1577.9MB, bw=13458KB/s, iops=3364 , runt=120001msec
    clat (usec): min=48 , max=977942 , avg=498.97, stdev=3093.08
     lat (usec): min=48 , max=977943 , avg=499.33, stdev=3093.08

>From the result we can see that the avg. of latency could be reduced a little.

changelog:
v5 <- v4:
 - drop a patch that removes EXT4_MAP_FROM_CLUSTER flag
   (I will revise it in the patch set of get_block_t refinement)
 - fold original patch 3/9 into patch 4/9
 - manipulate es_pblk and es_status directly
   (bit field is removed because it causes some warnings from 'sparse')
 - let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag
 - rename ext4_es_find_extent with ext4_es_find_delayed_extent
 - add hole status and put hole into extent status tree as a cache
 - convert unwritten extents from extent status tree in ext4_ext_direct_IO
   and end_io callback
 - remove a bogus wait in ext4_ind_direct_IO when dioread_nolock is enabled

v4 <- v3:
 - register a normal shrinker to reclaim extent from extent status tree

v3 <- v2:
 - use prune_super() to reclaim extents from extent status tree
 - stashed es_status into es_pblk
 - remove single extent cache
 - rebase against 3.8-rc4

v2 <- v1:
 - drop patches that try to improve unwritten extent conversion
 - remove EXT4_MAP_FROM_CLUSTER flag
 - add tracepoint for ext4_es_lookup_extent()
 - drop a patch, which tries to fix a warning when bigalloc and delalloc
   are enabled
 - add a shrinker to reclaim memory from extent status tree
 - rebase against 3.8-rc2

v4: http://lwn.net/Articles/536037/
v3: http://lwn.net/Articles/533730/
v2: http://lwn.net/Articles/532446/
v1: http://lwn.net/Articles/531065/

As always, any comments or feedbacks are welcome.

FWIW, when I try to implement patch 3/10, I realize that get_block_t and
*_map_blocks functions need to be refactored because in ext4 we already
have six get_block_t functions
 - ext4_get_block
 - ext4_get_block_write
 - ext4_get_block_write_nolock
 - noalloc_get_block_write
 - ext4_da_get_block_prep
 - _ext4_get_block

and four *_map_blocks
 - ext4_map_blocks
 - ext4_da_map_blocks
 - ext4_ext_map_blocks
 - ext4_ind_map_blocks

So I am planning to refine them.  First I will try to split ext4_map_blocks
into two parts, e.g. ext4_map_blocks_read and ext4_map_blocks_write, and 
then try other cleanups and improvmentes.

Thanks,
						- Zheng

Zheng Liu (10):
  ext4: refine extent status tree
  ext4: add physical block and status member into extent status tree
  ext4: let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag
  ext4: track all extent status in extent status tree
  ext4: lookup block mapping in extent status tree
  ext4: remove single extent cache
  ext4: adjust some functions for reclaiming extents from extent status
    tree
  ext4: reclaim extents from extent status tree
  ext4: convert unwritten extents from extent status tree in end_io
  ext4: remove bogus wait for unwritten extents in ext4_ind_direct_IO

 fs/ext4/ext4.h              |  21 +-
 fs/ext4/ext4_extents.h      |   6 -
 fs/ext4/extents.c           | 211 ++++--------
 fs/ext4/extents_status.c    | 779 +++++++++++++++++++++++++++++++++++---------
 fs/ext4/extents_status.h    |  84 ++++-
 fs/ext4/file.c              |  16 +-
 fs/ext4/indirect.c          |   5 -
 fs/ext4/inode.c             | 148 +++++++--
 fs/ext4/move_extent.c       |   3 -
 fs/ext4/page-io.c           |   8 +-
 fs/ext4/super.c             |   8 +-
 include/trace/events/ext4.h | 207 ++++++++++--
 12 files changed, 1075 insertions(+), 421 deletions(-)

-- 
1.7.12.rc2.18.g61b472e

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ