lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20081230113514.fe09d495.akpm@linux-foundation.org>
Date:	Tue, 30 Dec 2008 11:35:14 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Jens Axboe <jens.axboe@...cle.com>
Cc:	kmannth@...ibm.com, linux-kernel@...r.kernel.org,
	sekharan@...ibm.com
Subject: Re: [Patch][RFC] Supress Buffer I/O errors when SCSI REQ_QUIET flag
 set

On Tue, 25 Nov 2008 10:19:18 +0100
Jens Axboe <jens.axboe@...cle.com> wrote:

> On Mon, Nov 24 2008, Keith Mannthey wrote:
> > Allow the scsi request REQ_QUIET flag to be propagated to the buffer
> > file system layer. The basic ideas is to pass the flag from the scsi
> > request to the bio (block IO) and then to the buffer layer.  The buffer
> > layer can then suppress needless printks. 
> > 
> > This patch declutters the kernel log by removed the 40-50 (per lun)
> > buffer io error messages seen during a boot in my multipath setup . It
> > is a good chance any real errors will be missed in the "noise" it the
> > logs without this patch. 
> > 
> > During boot I see blocks of messages like 
> > "
> > __ratelimit: 211 callbacks suppressed
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242847
> > Buffer I/O error on device sdm, logical block 1
> > Buffer I/O error on device sdm, logical block 5242878
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242879
> > Buffer I/O error on device sdm, logical block 5242872
> > "
> > in my logs. 
> > 
> > My disk environment is multipath fiber channel using the SCSI_DH_RDAC
> > code and multipathd.  This topology includes an "active" and "ghost"
> > path for each lun. IO's to the "ghost" path will never complete and the
> > SCSI layer, via the scsi device handler rdac code, quick returns the IOs
> > to theses paths and sets the REQ_QUIET scsi flag to suppress the scsi
> > layer messages. 
> > 
> >  I am wanting to extend the QUIET behavior to include the buffer file
> > system layer to deal with these errors as well. I have been running this
> > patch for a while now on several boxes without issue.  A few runs of
> > bonnie++ show no noticeable difference in performance in my setup. 
> > 
> > Thanks for John Stultz for the quiet_error finalization. 
> 
> Looks good to me. I'll merge it up for 2.6.29.

So a month later this turns up in linux-next.  During the merge
window, giving a nice pile of rejects to keep me amused.

Can we do better than this, please?  A lot?

> +static int quiet_error(struct buffer_head *bh)
> +{
> +       if (!test_bit(BH_Quiet, &bh->b_state) && printk_ratelimit())
> +               return 0;
> +       return 1;
> +}
> 

For better of for worse, we have a convention of using cpp-generated
helper functions for the buffer_head flags.  There's no reason why this
new code needs to diverge from that.  The above should use buffer_quiet().

The functions in fs/buffer.c have been nicely commented.

This function is poorly named.  What does "quiet_error" *mean*?

<tries to work it out>

Every caller of this function does `if (!quiet_error(bh))'.  Would it
not make more sense to invert the sense of its return value?

static int permit_bh_errors(struct buffer_head *bh)
{
	if (buffer_quiet(bh))
		return 0;	/* IO layer suppressed error messages */
	return printk_ratelimit();
}

Did I translate that right?  If so, then the addition of the
printk_ratelimit() to the non-buffer_quiet() buffers is an
unchangelogged and unrelated alteration.

The use of printk_ratelimit() needs some thought.  It shares
ratelimiting state with all other printk_ratelimit() callsites.  Was
that desirable?  Would it have been better to create a private
ratelimit_state for buffer_heads?  Per physical device?  Per
something-else?


> +       if (unlikely (test_bit(BIO_QUIET,&bio->bi_flags)))
> +               set_bit(BH_Quiet, &bh->b_state);

And the above (which has coding-style errors and has apparently not been
checkpatched) should use set_buffer_quiet().


> --- a/include/linux/buffer_head.h
> +++ b/include/linux/buffer_head.h
> @@ -35,6 +35,7 @@ enum bh_state_bits {
>         BH_Ordered,     /* ordered write */
>         BH_Eopnotsupp,  /* operation not supported (barrier) */
>         BH_Unwritten,   /* Buffer is allocated on disk but not written */
> +       BH_Quiet,       /* Buffer Error Prinks to be quiet */
>  
>         BH_PrivateStart,/* not a state bit, but the first bit available
>                          * for private allocation by other entities

Add

+ BUFFER_FNS(Quiet, quiet)

around line 123 to generate the helper functions.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ