Date:	Tue, 4 Jun 2013 00:21:54 +0000
From:	KY Srinivasan <kys@...rosoft.com>
To:	James Bottomley <jbottomley@...allels.com>
CC:	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
	"ohering@...e.com" <ohering@...e.com>,
	"hch@...radead.org" <hch@...radead.org>,
	"linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>
Subject: RE: [PATCH 1/5] Drivers: scsi: storvsc: Make the scsi timeout a
 module parameter



> -----Original Message-----
> From: James Bottomley [mailto:jbottomley@...allels.com]
> Sent: Monday, June 03, 2013 7:47 PM
> To: KY Srinivasan
> Cc: gregkh@...uxfoundation.org; linux-kernel@...r.kernel.org;
> devel@...uxdriverproject.org; ohering@...e.com; hch@...radead.org;
> linux-scsi@...r.kernel.org
> Subject: Re: [PATCH 1/5] Drivers: scsi: storvsc: Make the scsi timeout a
> module parameter
> 
> On Mon, 2013-06-03 at 23:25 +0000, KY Srinivasan wrote:
> >
> > > -----Original Message-----
> > > From: James Bottomley [mailto:jbottomley@...allels.com]
> > > Sent: Monday, June 03, 2013 7:03 PM
> > > To: KY Srinivasan
> > > Cc: gregkh@...uxfoundation.org; linux-kernel@...r.kernel.org;
> > > devel@...uxdriverproject.org; ohering@...e.com; hch@...radead.org;
> > > linux-scsi@...r.kernel.org
> > > Subject: Re: [PATCH 1/5] Drivers: scsi: storvsc: Make the scsi timeout a
> > > module parameter
> > >
> > > On Mon, 2013-06-03 at 16:21 -0700, K. Y. Srinivasan wrote:
> > > > The standard scsi timeout is not appropriate in some of the environments
> > > > where Hyper-V is deployed. Set this timeout appropriately for all devices
> > > > managed by this driver. Further make this a module parameter.
> > > >
> > > > Signed-off-by: K. Y. Srinivasan <kys@...rosoft.com>
> > > > Reviewed-by: Haiyang Zhang <haiyangz@...rosoft.com>
> > > > ---
> > > >  drivers/scsi/storvsc_drv.c |    9 +++++++++
> > > >  1 files changed, 9 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> > > > index 16a3a0c..8d29a95 100644
> > > > --- a/drivers/scsi/storvsc_drv.c
> > > > +++ b/drivers/scsi/storvsc_drv.c
> > > > @@ -221,6 +221,13 @@ static int storvsc_ringbuffer_size = (20 * PAGE_SIZE);
> > > >  module_param(storvsc_ringbuffer_size, int, S_IRUGO);
> > > >  MODULE_PARM_DESC(storvsc_ringbuffer_size, "Ring buffer size (bytes)");
> > > >
> > > > +/*
> > > > + * Timeout in seconds for all devices managed by this driver.
> > > > + */
> > > > +static int storvsc_timeout = 180;
> > > > +module_param(storvsc_timeout, uint, (S_IRUGO | S_IWUSR));
> > > > +MODULE_PARM_DESC(storvsc_timeout, "Device timeout (seconds)");
> > > > +
> > > >  #define STORVSC_MAX_IO_REQUESTS				128
> > > >
> > > >  /*
> > > > @@ -1204,6 +1211,8 @@ static int storvsc_device_configure(struct scsi_device *sdevice)
> > > >
> > > >  	blk_queue_bounce_limit(sdevice->request_queue, BLK_BOUNCE_ANY);
> > > >
> > > > +	blk_queue_rq_timeout(sdevice->request_queue, (storvsc_timeout * HZ));
> > >
> > > Why does this need to be a module parameter?  It's already a sysfs one
> > > in the scsi_device class?  Three minutes is also a bit large.  The
> > > default is 30s with huge cache arrays recommending upping this to
> > > 60s ... you're three times this.
> >
> > James,
> > This number was arrived at based on some testing that was done on the
> > cloud. On our cloud, we have 120-second timeouts that trigger broader
> > VM-level recovery, and in cases where there are storage access issues
> > (which is when we would hit this timeout), it will be better to defer
> > to the fabric-level recovery than to attempt SCSI-level recovery/retry.
> > The default value chosen for devices managed by storvsc should be just fine,
> 
> So are you sure you want to set the command timeout to 3 minutes? ...
> it's an incredibly high value.  The actual complete timeout is this
> value multiplied by the number of retries, which is 5 for disk devices,
> so you'll be waiting up to 15 minutes before we signal a failure in some
> circumstances.  It sounds like you want the actual path length of error
> recovery to be on average 3 minutes.
>
> The value of the timeout should be a compromise between the longest time
> you want the user to wait for a failure and the longest time a device
> should take to respond.

This should be fine. Note that all error recovery/retry happens on the host side and,
beyond a certain delay, we will do a VM-level recovery at the fabric level. On a slightly
different note, we have the same issue with the SCSI FLUSH timeout. Would you consider
changing this?
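
As a rough illustration of the arithmetic James describes above (per-command
timeout multiplied by the number of retries), here is a minimal user-space
sketch. It is not driver code; the helper name worst_case_wait_seconds is
hypothetical, and it assumes every attempt runs to the full timeout.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper, for illustration only: upper bound on how long a
 * failure can take to be signalled if each of `attempts` tries waits for
 * the full per-command timeout of `timeout_s` seconds.
 */
static unsigned int worst_case_wait_seconds(unsigned int timeout_s,
					    unsigned int attempts)
{
	/* Refuse inputs whose product would overflow an unsigned int. */
	assert(attempts == 0 || timeout_s <= UINT_MAX / attempts);

	return timeout_s * attempts;
}
```

With the proposed 180 second timeout and the 5 retries quoted for disk
devices, worst_case_wait_seconds(180, 5) gives 900 seconds, i.e. the
15 minutes mentioned above; the 30 second default gives 150 seconds.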
> 
> > I made it a module parameter to have more flexibility.
> 
> It's *already* a sysfs parameter ... why do you want an additional
> module parameter?  Multiple parameters for the same quantity, especially
> ones which can't be altered at runtime like module parameters, end up
> confusing users.

Agreed. I can send you a patch that removes this parameter. Or, if you prefer,
I could resend this set with this patch changed to drop the module parameter.
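
For context, the existing sysfs knob James refers to can be driven from user
space. The following is a sketch under assumptions: it uses the per-device
"timeout" attribute (in seconds) as it is commonly reached through
/sys/block/<disk>/device/timeout, the disk name is a placeholder, and both
helper names are hypothetical. Writing the attribute normally requires root.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build the sysfs path of the SCSI command timeout for a disk such as
 * "sda" (placeholder name). Returns 0 on success, -1 if `buf` is too small.
 */
static int scsi_timeout_path(const char *disk, char *buf, size_t len)
{
	int n;

	assert(disk != NULL && buf != NULL);

	n = snprintf(buf, len, "/sys/block/%s/device/timeout", disk);
	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/*
 * Write a timeout value in seconds to the attribute file at `path`.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int write_timeout(const char *path, unsigned int seconds)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "%u\n", seconds) > 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A caller would first build the path with scsi_timeout_path("sda", buf,
sizeof(buf)) and then pass buf to write_timeout() with the desired number of
seconds.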

Regards,

K. Y
> 
> James
> 
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
