linux-kernel - Re: next bio iters break discard?


Message-ID: <alpine.LSU.2.11.1401161129380.1321@eggly.anvils>
Date:	Thu, 16 Jan 2014 12:21:10 -0800 (PST)
From:	Hugh Dickins <hughd@...gle.com>
To:	Kent Overstreet <kmo@...erainc.com>
cc:	"Martin K. Petersen" <martin.petersen@...cle.com>,
	Jens Axboe <axboe@...nel.dk>, Shaohua Li <shli@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org
Subject: Re: next bio iters break discard?

On Tue, 14 Jan 2014, Kent Overstreet wrote:
> 
> Does the below patch look like what we want? I'm assuming that if

You don't fill me with confidence ;)

> multiple WRITE_SAME bios are merged, since they're all writing the same
> data we can consider the entire request to be a single segment.
> 
> commit 1755e7ffc5745591d37b8956ce2512f4052a104a
> Author: Kent Overstreet <kmo@...erainc.com>
> Date:   Tue Jan 14 14:22:01 2014 -0800
> 
>     block: Explicitly handle discard/write same when counting segments
> 
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 8f8adaa..7d977f8 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -21,6 +21,12 @@ static unsigned int __blk_recalc_rq_segments(struct request_queue *q,
>  	if (!bio)
>  		return 0;
>  
> +	if (bio->bi_rw & REQ_DISCARD)
> +		return 0;
> +
> +	if (bio->bi_rw & REQ_WRITE_SAME)
> +		return 1;
> +
>  	fbio = bio;
>  	cluster = blk_queue_cluster(q);
>  	seg_size = 0;

For me this just shifts the crash,
from __blk_recalc_rq_segments() to blk_rq_map_sg():

blk_rq_map_sg
scsi_init_sgtable
scsi_init_io
scsi_setup_blk_pc_cmnd
sd_prep_fn
blk_peek_request
scsi_request_fn
__blk_run_queue
blk_run_queue
scsi_run_queue
scsi_next_command
scsi_io_completion
scsi_finish_command
scsi_softirq_done
blk_done_softirq
__do_softirq
irq_exit
do_IRQ
common_interrupt
<EOI>
cpuidle_idle_call
arch_cpu_idle
cpu_startup_entry
start_secondary

It's GPF'ing on struct scatterlist *sg 0x800000001473e064 in

static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
{
	unsigned long page_link = sg->page_link & 0x3;

It appears to be in the static inline __blk_segment_map_sg(),
and that GPF'ing address is what it just got from sg_next().

Sorry, this isn't the kind of dump you'll be used to, but it's the
best I can do at the moment, and I've just had to reboot the machine.

Oh, tried again and it hit the BUG_ON(count > sdb->table.nents)
on line 1048 of drivers/scsi/scsi_lib.c:

scsi_init_sgtable
<IRQ> scsi_init_io
scsi_setup_blk_pc_cmnd
sd_setup_discard_cmnd
sd_prep_fn
blk_peek_request
etc. as before
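
For illustration only, here is a userspace sketch of the invariant that
BUG_ON checks: the scatterlist table is sized from the segment count, so
the mapper must never produce more entries than were counted. All toy_*
names and flag values are hypothetical, not kernel code. It assumes (not
proven by the traces above) that a discard bio already carries one
payload vector by the time it is mapped, so a counter that returns 0 for
discard disagrees with a mapper that walks the payload.

```c
#include <stddef.h>

/* Hypothetical flag values for this model only; not the kernel's. */
enum { TOY_DISCARD = 1u << 0, TOY_WRITE_SAME = 1u << 1 };

struct toy_bio {
	unsigned int flags;
	unsigned int vcnt;	/* payload vectors attached to this bio */
	struct toy_bio *next;	/* next bio merged into the same request */
};

/* Counting side, with the same early returns as the quoted patch. */
static unsigned int toy_count_segments(const struct toy_bio *bio)
{
	unsigned int n = 0;

	if (!bio)
		return 0;
	if (bio->flags & TOY_DISCARD)
		return 0;
	if (bio->flags & TOY_WRITE_SAME)
		return 1;
	for (; bio; bio = bio->next)
		n += bio->vcnt;
	return n;
}

/* Mapping side (assumption): walks every payload vector, ignoring flags. */
static unsigned int toy_map_segments(const struct toy_bio *bio)
{
	unsigned int n = 0;

	for (; bio; bio = bio->next)
		n += bio->vcnt;
	return n;
}

/* Returns 1 when the mapper needs more entries than were allocated. */
static int toy_table_overflows(const struct toy_bio *bio)
{
	unsigned int nents = toy_count_segments(bio);	/* table size */
	unsigned int count = toy_map_segments(bio);	/* entries written */

	return count > nents;
}
```

In this model a discard bio with one payload vector makes
toy_table_overflows() return 1, the analogue of count > sdb->table.nents,
while an ordinary three-vector bio returns 0.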

I'll have to leave the machine shortly - I'm rather hoping
you can do your own discard testing to see such crashes.

Thanks,
Hugh