linux-kernel - Re: [PATCH 07/17] hpsa: do not give up retry of driver cmds after only 3 retries

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20120501182011.GS11802@beardog.cce.hp.com>
Date:	Tue, 1 May 2012 13:20:11 -0500
From:	scameron@...rdog.cce.hp.com
To:	james.bottomley@...senpartnership.com, linux-scsi@...r.kernel.org,
	linux-kernel@...r.kernel.org, stephenmcameron@...il.com,
	thenzl@...hat.com, akpm@...ux-foundation.org,
	mikem@...rdog.cce.hp.com
Cc:	scameron@...rdog.cce.hp.com
Subject: Re: [PATCH 07/17] hpsa: do not give up retry of driver cmds after only 3 retries

On Tue, May 01, 2012 at 07:26:34PM +0200, Andi Shyti wrote:
> Hi,
> 
> > --- a/drivers/scsi/hpsa.c
> > +++ b/drivers/scsi/hpsa.c
> > @@ -1380,17 +1380,24 @@ static void hpsa_scsi_do_simple_cmd_core_if_no_lockup(struct ctlr_info *h,
> >  	}
> >  }
> >  
> > +#define MAX_DRIVER_CMD_RETRIES 25
> >  static void hpsa_scsi_do_simple_cmd_with_retry(struct ctlr_info *h,
> >  	struct CommandList *c, int data_direction)
> >  {
> > -	int retry_count = 0;
> > +	int backoff_time = 10, retry_count = 0;
> >  
> >  	do {
> >  		memset(c->err_info, 0, sizeof(*c->err_info));
> >  		hpsa_scsi_do_simple_cmd_core(h, c);
> >  		retry_count++;
> > +		if (retry_count > 3) {
> > +			msleep(backoff_time);
> 
> for 10ms isn't it better to avoid using msleep?

[...] 
> > +			if (backoff_time < 1000)
> > +				backoff_time *= 2;

Eh, maybe.  from Documentation/timers-howto.txt

                        msleep(1~20) may not do what the caller intends, and
                        will often sleep longer (~20 ms actual sleep for any
                        value given in the 1~20ms range). In many cases this
                        is not the desired behavior.

Sleeping longer (~20ms instead of 10ms) in this instance is fine, as I don't
really care too much exactly how long it sleeps, and it backs off to up to
1280ms eventually anyway.  The idea is, "wait a bit, and retry, and then if
that doesn't work, wait twice as long, and retry, etc."  *exactly* how long
"a bit" is is not super important.  I could change the initial back_off time
to 20 or 30 to satisfy the letter of the advice in Documentation/timers-howto.txt,
if doing so is important.

This is kind of a corner case of a corner case, I don't expect
things will ordinarily end up waiting that long, because normally
one of the 1st 3 retries will succeed.  I just wanted to make it 
a little more robust and not just give up immediately if the 3
initial retries don't succeed, the specific number of retries,
wait times, etc, I just made up.  It still does eventually give up
though, and then probably doesn't do anything good after that
(same as current behavior, just somewhat less likely to get to
that point.)  I'm not actually aware of any complaints of the 
retries failing though (apart from the complaint that prompted
the patch prior to this one, that we didn't retry on getting
SAM_STAT_BUSY.)

-- steve

> > +		}
> >  	} while ((check_for_unit_attention(h, c) ||
> > -			check_for_busy(h, c)) && retry_count <= 3);
> > +			check_for_busy(h, c)) &&
> > +			retry_count <= MAX_DRIVER_CMD_RETRIES);
> >  	hpsa_pci_unmap(h->pdev, c, 1, data_direction);
> >  }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/