lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20081130200558.21f2561b@mjolnir.drzeus.cx>
Date:	Sun, 30 Nov 2008 20:05:58 +0100
From:	Pierre Ossman <drzeus@...eus.cx>
To:	Adrian Hunter <ext-adrian.hunter@...ia.com>
Cc:	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] mmc_block: ensure all sectors that do not have
 errors are read

On Wed, 29 Oct 2008 16:26:07 +0200
Adrian Hunter <ext-adrian.hunter@...ia.com> wrote:

> Pierre Ossman wrote:
> > You also need to adjust the sg list when you change the block count.
> > There was code there that did that previously, but it got removed in
> > 2.6.27-rc1.
> 
> That is not necessary.  It is an optimisation.  In general, optimising an
> error path serves no purpose.
> 

Actually, it is a requirement that some drivers rely on. There is also
a WARN_ON(), or possibly a BUG_ON() in the request handling path. I
think it only triggers with MMC_DEBUG though...

> > 
> > Some concerns here:
> > 
> > 1. "brq.data.blocks > 1" doesn't need to be optimised into its own
> > variable. It just obscures things.
> 
> But you have to assume that no driver changes the 'blocks' variable e.g.
> counts it down.

That would be a no-no. I guess this is not properly documented, but it
is strongly implied that the request structure should be read-only
except for fields that contain result data. ;)

> It is not an optimisation, it is just to improve
> reliability and readability.  What does it obscure?

if-clauses look like they're different, but they're not.

> > 3. You should check all errors, not just data.error and ETIMEDOUT.
> 
> No.  Data timeout is a special case.  The other errors are system errors.
> If there is a command error or stop error (which is also a command error)
> it means either there is a bug in the kernel or the controller or card
> has failed to follow the specification.  Under those circumstances

Controllers do not follow the specs. Welcome to reality. :)

(sad fact: I don't think there is a single controller out there that
follows the specs to 100%).

Anyway, for this specific case, a controller might not be able to
differentiate between command and data timeouts. Or it might report
"data error" for anything unexpected (I seem to recall one of the
currently supported controllers behaves like this).

Generally it's about being as flexible as we can be. This is a world of
crappy hardware, so strict spec adherence just leads to a lot of hurt.
And I have the scars to prove it. :)

> > 4. You should first report the successfully transferred blocks as ok.
> 
> That is another optimisation of the error path i.e. not necessary.  It
> is simpler to just start processing the request again - which the patch
> does.
> 

I suppose.

> > You might also want to print something so that it is
> > visible that the driver retried the transfer.
> 
> There are already two error messages per sector (one from this function
> and one from '__blk_end_request()', so another message is too much.
> 

I didn't mean an error for the bad sector, but something like
"encountered an error during multi-block transfer. retrying...". This
could save a lot of time and headache during debugging in the future.

> @@ -362,12 +380,19 @@ static int mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
>  #endif
>  		}
>  
> -		if (brq.cmd.error || brq.data.error || brq.stop.error)
> +		if (brq.cmd.error || brq.stop.error)
> +			goto cmd_err;
> +
> +		if (brq.data.error == -ETIMEDOUT && rq_data_dir(req) == READ) {
> +			spin_lock_irq(&md->lock);
> +			ret = __blk_end_request(req, -EIO, brq.data.blksz);
> +			spin_unlock_irq(&md->lock);
> +			continue;
> +		}
> +

This could use a comment explaining that if we've reached this point,
we know that it was a single sector that failed.

> +		if (brq.cmd.error)
>  			goto cmd_err;
>  
> -		/*
> -		 * A block was successfully transferred.
> -		 */
>  		spin_lock_irq(&md->lock);
>  		ret = __blk_end_request(req, 0, brq.data.bytes_xfered);
>  		spin_unlock_irq(&md->lock);

Shouldn't this comment stay now? :)

Rgds
-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.

Download attachment "signature.asc" of type "application/pgp-signature" (198 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ