lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 26 Jun 2017 10:46:34 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Ming Lei <ming.lei@...hat.com>,
        Christoph Hellwig <hch@...radead.org>,
        Huang Ying <ying.huang@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Alexander Viro <viro@...iv.linux.org.uk>
Cc:     linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v2 00/51] block: support multipage bvec

On 06/26/2017 06:09 AM, Ming Lei wrote:
> Hi,
> 
> This patchset brings multipage bvec into block layer:
> 
> 1) what is multipage bvec?
> 
> Multipage bvecs means that one 'struct bio_bvec' can hold
> multiple pages which are physically contiguous instead
> of one single page used in linux kernel for long time.
> 
> 2) why is multipage bvec introduced?
> 
> Kent proposed the idea[1] first. 
> 
> As system's RAM becomes much bigger than before, and 
> at the same time huge page, transparent huge page and
> memory compaction are widely used, it is a bit easy now
> to see physically contiguous pages from fs in I/O.
> On the other hand, from block layer's view, it isn't
> necessary to store intermediate pages into bvec, and
> it is enough to just store the physicallly contiguous
> 'segment' in each io vector.
> 
> Also huge pages are being brought to filesystem and swap
> [2][6], we can do IO on a hugepage each time[3], which
> requires that one bio can transfer at least one huge page
> one time. Turns out it isn't flexiable to change BIO_MAX_PAGES
> simply[3][5]. Multipage bvec can fit in this case very well.
> 
> With multipage bvec:
> 
> - segment handling in block layer can be improved much
> in future since it should be quite easy to convert
> multipage bvec into segment easily. For example, we might 
> just store segment in each bvec directly in future.
> 
> - bio size can be increased and it should improve some
> high-bandwidth IO case in theory[4].
> 
> - Inside block layer, both bio splitting and sg map can
> become more efficient than before by just traversing the
> physically contiguous 'segment' instead of each page.
> 
> - there is opportunity in future to improve memory footprint
> of bvecs. 
> 
> 3) how is multipage bvec implemented in this patchset?
> 
> The 1st 18 patches comment on some special cases and deal with
> some special cases of direct access to bvec table.
> 
> The 2nd part(19~29) implements multipage bvec in block layer:
> 
> 	- put all tricks into bvec/bio/rq iterators, and as far as
> 	drivers and fs use these standard iterators, they are happy
> 	with multipage bvec
> 
> 	- use multipage bvec to split bio and map sg
> 
> 	- bio_for_each_segment_all() changes
> 	this helper pass pointer of each bvec directly to user, and
> 	it has to be changed. Two new helpers(bio_for_each_segment_all_sp()
> 	and bio_for_each_segment_all_mp()) are introduced. 
> 
> The 3rd part(30~49) convert current users of bio_for_each_segment_all()
> to bio_for_each_segment_all_sp()/bio_for_each_segment_all_mp().
> 
> The last part(50~51) enables multipage bvec.
> 
> These patches can be found in the following git tree:
> 
> 	https://github.com/ming1/linux/commits/mp-bvec-1.4-v4.12-rc
> 
> Thanks Christoph for looking at the early version and providing
> very good suggestions, such as: introduce bio_init_with_vec_table(),
> remove another unnecessary helpers for cleanup and so on.
> 
> Any comments are welcome!

I'll take some time to review this over the next week or so. In any
case, it's a little late to stuff into 4.13 and get a decent amount
of exposure and testing on it. A 4.14 target for this would be the
way to go, imho.


-- 
Jens Axboe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ