linux-kernel - Re: [PATCH] block: optimise bvec_iter

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20191129232445.GA1331087@rani.riverdale.lan>
Date:   Fri, 29 Nov 2019 18:24:46 -0500
From:   Arvind Sankar <nivedita@...m.mit.edu>
To:     Pavel Begunkov <asml.silence@...il.com>
Cc:     Arvind Sankar <nivedita@...m.mit.edu>,
        Jens Axboe <axboe@...nel.dk>,
        Ming Lei <ming.lei@...onical.com>, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] block: optimise bvec_iter_advance()

On Sat, Nov 30, 2019 at 01:47:16AM +0300, Pavel Begunkov wrote:
> On 30/11/2019 01:17, Arvind Sankar wrote:
> > 
> > The loop can be simplified a bit further, as done has to be 0 once we go
> > beyond the current bio_vec. See below for the simplified version.
> > 
> 
> Thanks for the suggestion! I thought about it, and decided to not
> for several reasons. I prefer to not fine-tune and give compilers
> more opportunity to do their job. And it's already fast enough with
> modern architectures (MOVcc, complex addressing, etc).
> 
> Also need to consider code clarity and the fact, that this is inline,
> so should be brief and register-friendly.
> 

It should be more register-friendly, as it uses fewer variables, and I
think it's easier to see what the loop is doing, i.e. that we advance
one bio_vec per iteration: in the existing code, it takes a bit of
thinking to see that we won't spend more than one iteration within the
same bio_vec.

I don't see this as fine-tuning, rather simplifying the code. I do agree
that it's not going to make much difference for performance of the loop
itself, as the most common case I think is that we either stay in the
current bio_vec or advance by one.

> 
> > I also check if bi_size became zero so we can skip the rest of the
> > calculations in that case. If we want to preserve the current behavior of
> > updating iter->bi_idx and iter->bi_bvec_done even if bi_size is going to
> > become zero, the loop condition should change to
> > 
> > 		while (bytes && bytes >= cur->bv_len)
> 
> Probably, it's better to leave it in a consistent state. Shouldn't be
> a problem, but never know when and who will screw it up. 
> 

The WARN_ONCE case does leave it inconsistent, though that's not
supposed to happen, so less of a pitfall there.