linux-kernel - Re: [PATCH RESEND v2] block: modify __bio_add_page check to accept pages that don't start a new segment

Message-ID: <51506EBA.8060708@redhat.com>
Date:	Mon, 25 Mar 2013 16:35:22 +0100
From:	Jan Vesely <jvesely@...hat.com>
To:	Jens Axboe <axboe@...nel.dk>
CC:	linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org,
	linux-fsdevel@...r.kernel.org,
	Alexander Viro <viro@...iv.linux.org.uk>,
	fujita.tomonori@....ntt.co.jp,
	Kai Mäkisara <kai.makisara@...umbus.fi>,
	James Bottomley <james.bottomley@...senpartnership.com>
Subject: Re: [PATCH RESEND v2] block: modify __bio_add_page check to accept
 pages that don't start a new segment

On Mon 25 Mar 2013 15:24:57 CET, Jens Axboe wrote:
> On Mon, Mar 25 2013, Jan Vesely wrote:
>> v2: changed a comment
>>
>> The original behavior was to refuse all pages after the maximum number of
>> segments has been reached. However, some drivers (like st) craft their buffers
>> to potentially require exactly max segments and multiple pages in the last
>> segment. This patch modifies the check to allow pages that can be merged into
>> the last segment.
>>
>> Fixes EBUSY failures when using large tape block size in high
>> memory fragmentation condition.
>> This regression was introduced by commit
>>  46081b166415acb66d4b3150ecefcd9460bb48a1
>>  st: Increase success probability in driver buffer allocation
>>
>> Signed-off-by: Jan Vesely <jvesely@...hat.com>
>>
>> CC: Alexander Viro <viro@...iv.linux.org.uk>
>> CC: FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>
>> CC: Kai Makisara <kai.makisara@...umbus.fi>
>> CC: James Bottomley <james.bottomley@...senpartnership.com>
>> CC: Jens Axboe <axboe@...nel.dk>
>> CC: stable@...r.kernel.org
>> ---
>>  fs/bio.c | 27 +++++++++++++++++----------
>>  1 file changed, 17 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/bio.c b/fs/bio.c
>> index bb5768f..bc6af71 100644
>> --- a/fs/bio.c
>> +++ b/fs/bio.c
>> @@ -500,7 +500,6 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>>  			  *page, unsigned int len, unsigned int offset,
>>  			  unsigned short max_sectors)
>>  {
>> -	int retried_segments = 0;
>>  	struct bio_vec *bvec;
>>
>>  	/*
>> @@ -551,18 +550,13 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>>  		return 0;
>>
>>  	/*
>> -	 * we might lose a segment or two here, but rather that than
>> -	 * make this too complex.
>> +	 * The first part of the segment count check,
>> +	 * reduce segment count if possible
>>  	 */
>>
>> -	while (bio->bi_phys_segments >= queue_max_segments(q)) {
>> -
>> -		if (retried_segments)
>> -			return 0;
>> -
>> -		retried_segments = 1;
>> +	if (bio->bi_phys_segments >= queue_max_segments(q))
>>  		blk_recount_segments(q, bio);
>> -	}
>> +
>>
>>  	/*
>>  	 * setup the new entry, we might clear it again later if we
>> @@ -572,6 +566,19 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>>  	bvec->bv_page = page;
>>  	bvec->bv_len = len;
>>  	bvec->bv_offset = offset;
>> +	
>> +	/*
>> +	 * the other part of the segment count check, allow mergeable pages
>> +	 */
>> +	if ((bio->bi_phys_segments > queue_max_segments(q)) ||
>> +		( (bio->bi_phys_segments == queue_max_segments(q)) &&
>> +		!BIOVEC_PHYS_MERGEABLE(bvec - 1, bvec))) {
>> +			bvec->bv_page = NULL;
>> +			bvec->bv_len = 0;
>> +			bvec->bv_offset = 0;
>> +			return 0;
>> +	}
>> +
>
> This is a bit messy, I think. bi_phys_segments should never be allowed
> to go beyond queue_max_segments(), so the > test does not look right.
> Maybe it's an artifact of when we fall through with this patch, we bump
> bi_phys_segments even if the segments are physically contiguous and
> mergeable.

Yeah, it is messy; I tried to go for the least invasive change.

I derived the '>' test from the '>=' condition of the original while
loop. The original behavior guaranteed
bio->bi_phys_segments <= max_segments, provided the bio satisfied this
condition to begin with. I did not find any guarantee that the 'bio'
parameter of this function has to satisfy this condition in general.

My understanding is that if a caller of this function (or of one of the
two that call this one) provides a bio that is invalid segment-count-wise,
the function will fail (return 0 added length) and let the caller handle
the situation. I admit I did not check all the call paths that use these
functions.
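
To make the condition concrete, here is a minimal userspace sketch of the
decision the second part of the check makes. It is only a model, not the
code in fs/bio.c: the function name and its parameters are hypothetical,
and `mergeable_with_prev` stands in for the result of
BIOVEC_PHYS_MERGEABLE(bvec - 1, bvec).

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the segment count check (not kernel code).
 *
 * phys_segments:       segment count of the bio before the new page is added
 * max_segments:        the queue limit, i.e. queue_max_segments(q)
 * mergeable_with_prev: whether the new page is physically mergeable with
 *                      the last page already in the bio
 */
static bool segment_check_accepts(unsigned int phys_segments,
                                  unsigned int max_segments,
                                  bool mergeable_with_prev)
{
    /* The bio is already over the limit: refuse and let the caller cope. */
    if (phys_segments > max_segments)
        return false;

    /* At the limit, only a page that extends the last segment is allowed. */
    if (phys_segments == max_segments && !mergeable_with_prev)
        return false;

    /* Below the limit, or at the limit with a mergeable page. */
    return true;
}
```

In this model a bio that is already over the limit is always refused, which
is the "fail and let the caller handle it" behavior described above.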

>
> What happens when the segment is physically mergeable, but the resulting
> merged segment is too large (bigger than q->limits.max_segment_size)?
>

Ah, yes. I guess I need a check that follows __blk_recalc_rq_segments
more closely. We know that at this point all pages are already merged
into segments, so a helper function shared by __blk_recalc_rq_segments
and this check is possible.
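
As a rough illustration of what such a shared helper would have to decide,
here is a simplified userspace sketch. Everything in it is an assumption
made for illustration: the name `merge_fits_segment` and its parameters are
hypothetical, physical contiguity is reduced to comparing two addresses, and
the segment boundary mask that the real recount code also honors is left
out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: may a new page be merged into the last segment?
 *
 * prev_end:         physical address just past the end of the last page
 * next_start:       physical address where the new page's data starts
 * seg_len:          current length of the last segment, in bytes
 * next_len:         length of the new page's data, in bytes
 * max_segment_size: the queue limit (q->limits.max_segment_size)
 */
static bool merge_fits_segment(uint64_t prev_end, uint64_t next_start,
                               uint32_t seg_len, uint32_t next_len,
                               uint32_t max_segment_size)
{
    /* The new data must directly follow the last page in physical memory. */
    if (prev_end != next_start)
        return false;

    /* Defensive: a segment that is already too large cannot grow. */
    if (seg_len > max_segment_size)
        return false;

    /* Written as a subtraction so that the sum cannot overflow. */
    return next_len <= max_segment_size - seg_len;
}
```

The size test is the part the original check was missing: a page can be
physically mergeable and still push the last segment past the limit.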


I still assume that a temporary increase of bi_phys_segments above
max_segments is OK. If we want to avoid this situation, we would need to
merge tail pages right away, which is IMO uglier.

thanks
--
Jan Vesely <jvesely@...hat.com>