lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080702082511.GF20055@kernel.dk>
Date:	Wed, 2 Jul 2008 10:25:11 +0200
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Mikulas Patocka <mpatocka@...hat.com>
Cc:	linux-kernel@...r.kernel.org, Neil Brown <neilb@...e.de>
Subject: Re: [PATCH 1/2] Avoid bio_endio recursion

On Wed, Jul 02 2008, Mikulas Patocka wrote:
> >Right, that wont work of course. Completions are typically done through
> >a softirq, so it is not currently done with hard interrupts disabled.
> 
> I thought, from hardirq - that's what IDE is doing. And they are called 
> with interrupts disabled (maybe unless you specify unmaskirq, which is not 
> default). What block driver does completions with softirq? ... and why?

The key word is 'typically', the old IDE driver really isn't used very
much. The SCSI layer and eg cciss uses the block layer softirq
completions, so that is what 99% of the uses will be.

> >So it's a bit of a shame to impose this extra restriction, bio unrolling 
> >is necessarily a really short operation. We could reenable around the 
> >bi_end_io() call, but then we'd need to save and restore for each bio.
> 
> I found another weird thing --- why does local_irq_save() execute that 
> microcoded "cli" instruction even if interrupts are already disabled? That 
> could be definitely optimized.
> 
> And does local_irq_restore() need to execute even more costy "popf" when 
> it makes a transition from disabled to disabled state? What's 
> local_irq_restore semantics --- is it allowed to use local_irq_restore for 
> transition from interrupt-enabled state into interrupt-disabled state?

That's a question for the arch people. Probably nested _save and
_restore aren't optimized.

> 
> >Please make that
> >
> >       bio->bi_flags &= ~(1 << BIO_SEG_VALID);
> >       bio->bi_phys_segments = error;
> 
> OK, here is the updated patch:
> 
> Avoid recursion on bio_endio. bio_endio calls bio->bi_end_io which may in 
> turn
> call bio_endio again. When this recursion happens, put the new bio to the 
> queue
> and process it later, from the top-level bio_endio.

Thanks that looks good, I'll throw this into the testing mix and see
what happens.

>  void bio_pair_release(struct bio_pair *bp)
> Index: linux-2.6.26-rc8-devel/include/linux/bio.h
> ===================================================================
> --- linux-2.6.26-rc8-devel.orig/include/linux/bio.h	2008-07-02 
> 04:20:43.000000000 +0200
> +++ linux-2.6.26-rc8-devel/include/linux/bio.h	2008-07-02 
> 04:21:16.000000000 +0200
> @@ -86,6 +86,9 @@ struct bio {
> 
>  	/* Number of segments in this BIO after
>  	 * physical address coalescing is performed.
> +	 *
> +	 * When ending a bio request in bio_endio, this field is temporarily
> +	 * (ab)used to keep the error code.
>  	 */
>  	unsigned short		bi_phys_segments;

I'll drop this comment, we don't really need it. Users must use
bio_phys_segments() to retrieve the segment count anyway, in which case
the count is ALWAYS valid.

-- 
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ