lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20081203143112F.fujita.tomonori@lab.ntt.co.jp>
Date:	Wed, 3 Dec 2008 14:32:00 +0900
From:	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>
To:	mbroz@...hat.com
Cc:	jens.axboe@...cle.com, agk@...hat.com, neilb@...e.de,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] block: fix setting of max_segment_size and
 seg_boundary mask

On Mon, 01 Dec 2008 00:42:09 +0100
Milan Broz <mbroz@...hat.com> wrote:

> Fix setting of max_segment_size and seg_boundary mask for stacked md/dm devices.
> 
> When stacking devices (LVM over MD over SCSI) some of the request queue
> parameters are not set up correctly in some cases by default, namely
> max_segment_size and and seg_boundary mask.
> 
> If you create MD device over SCSI, these attributes are zeroed.
> 
> Problem become when there is over this mapping next device-mapper
> mapping - queue attributes are set in DM this way:
> 
> request_queue   max_segment_size  seg_boundary_mask
> SCSI                65536             0xffffffff
> MD RAID1                0                      0
> LVM                 65536                 -1 (64bit)
> 
> Unfortunately bio_add_page (resp. bio_phys_segments) calculates number
> of physical segments according to these parameters.
> 
> During the generic_make_request() is segment cout recalculated and
> can increase bio->bi_phys_segments count over the allowed limit.
> (After bio_clone() in stack operation.)
> 
> Thi is specially problem in CCISS driver, where it produce OOPS here
>     BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);
> (MAXSEGENTRIES is 31 by default.)
> 
> Sometimes even this command is enough to cause oops:
>   dd iflag=direct if=/dev/<vg>/<lv> of=/dev/null bs=128000 count=10
> 
> This command generates bios with 250 sectors, allocated in 32 4k-pages
> (last page uses only 1024 bytes).
> 
> For LVM layer, it allocates bio with 31 segments (still OK for CCISS),
> unfortunatelly on lower layer it is recalculated to 32 segments and
> this violates CCISS restriction and triggers BUG_ON().
> 
> Attached patch tries to fix it by:
>  * initializing attributes above in queue request constructor
>    blk_queue_make_request()
> 
>  * make sure that blk_queue_stack_limits() inherits setting
> 
>  (DM uses its own function to set the limits because
>  it blk_queue_stack_limits() was introduced later.
>  It should probably switch to use generic stack limit function too.)
> 
>  * sets the default seg_boundary value in one place (blkdev.h)
> 
>  * use this mask as default in DM (instead of -1, which differs in 64bit)
> 
> Bugs related to this:
> https://bugzilla.redhat.com/show_bug.cgi?id=471639
> http://bugzilla.kernel.org/show_bug.cgi?id=8672
> 
> 
> Signed-off-by: Milan Broz <mbroz@...hat.com>

Looks like the correct fix.


> ---
>  block/blk-core.c       |    2 +-
>  block/blk-settings.c   |    4 ++++
>  drivers/md/dm-table.c  |    2 +-
>  include/linux/blkdev.h |    2 ++
>  4 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 10e8a64..b7d646c 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -592,7 +592,7 @@ blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id)
>  				   1 << QUEUE_FLAG_STACKABLE);
>  	q->queue_lock		= lock;
>  
> -	blk_queue_segment_boundary(q, 0xffffffff);
> +	blk_queue_segment_boundary(q, BLK_SEG_BOUNDARY_MASK);
>  
>  	blk_queue_make_request(q, __make_request);
>  	blk_queue_max_segment_size(q, MAX_SEGMENT_SIZE);
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 41392fb..afa55e1 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -125,6 +125,9 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
>  	q->nr_requests = BLKDEV_MAX_RQ;
>  	blk_queue_max_phys_segments(q, MAX_PHYS_SEGMENTS);
>  	blk_queue_max_hw_segments(q, MAX_HW_SEGMENTS);
> +	blk_queue_segment_boundary(q, BLK_SEG_BOUNDARY_MASK);
> +	blk_queue_max_segment_size(q, MAX_SEGMENT_SIZE);

Is it a bit strange to set segment_boundary and max_segment_size that
are hardware restrictions in a function used for mostly virtual
devices? max_hw_segments is also hardware restriction though.

It might make sense to share the setting initialization code between
blk_init_queue_node and blk_queue_make_request.


>  	q->make_request_fn = mfn;
>  	q->backing_dev_info.ra_pages =
>  			(VM_MAX_READAHEAD * 1024) / PAGE_CACHE_SIZE;
> @@ -314,6 +317,7 @@ void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b)
>  	/* zero is "infinity" */
>  	t->max_sectors = min_not_zero(t->max_sectors, b->max_sectors);
>  	t->max_hw_sectors = min_not_zero(t->max_hw_sectors, b->max_hw_sectors);
> +	t->seg_boundary_mask = min_not_zero(t->seg_boundary_mask, b->seg_boundary_mask);
>  
>  	t->max_phys_segments = min(t->max_phys_segments, b->max_phys_segments);
>  	t->max_hw_segments = min(t->max_hw_segments, b->max_hw_segments);

Theoretically, blk_queue_stack_limits() better use min_not_zero
instead of min for max_phys_segments, max_hw_segments, and
max_segment_size?


> diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
> index a63161a..04e5fd7 100644
> --- a/drivers/md/dm-table.c
> +++ b/drivers/md/dm-table.c
> @@ -668,7 +668,7 @@ static void check_for_valid_limits(struct io_restrictions *rs)
>  	if (!rs->max_segment_size)
>  		rs->max_segment_size = MAX_SEGMENT_SIZE;
>  	if (!rs->seg_boundary_mask)
> -		rs->seg_boundary_mask = -1;
> +		rs->seg_boundary_mask = BLK_SEG_BOUNDARY_MASK;
>  	if (!rs->bounce_pfn)
>  		rs->bounce_pfn = -1;
>  }
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index a135256..9573945 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -921,6 +921,8 @@ extern void blk_set_cmd_filter_defaults(struct blk_cmd_filter *filter);
>  
>  #define MAX_SEGMENT_SIZE	65536
>  
> +#define BLK_SEG_BOUNDARY_MASK	0xFFFFFFFFUL
> +
>  #define blkdev_entry_to_request(entry) list_entry((entry), struct request, queuelist)
>  
>  static inline int queue_hardsect_size(struct request_queue *q)
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ