linux-kernel - Re: [PATCH 1/2] mmc_block: print better data error message after timeout

Message-ID: <20081026121155.69692474@mjolnir.drzeus.cx>
Date:	Sun, 26 Oct 2008 12:11:55 +0100
From:	Pierre Ossman <drzeus@...eus.cx>
To:	Adrian Hunter <ext-adrian.hunter@...ia.com>
Cc:	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] mmc_block: print better data error message after
 timeout

On Thu, 16 Oct 2008 16:26:55 +0300
Adrian Hunter <ext-adrian.hunter@...ia.com> wrote:

> In particular, if the card gets an ECC error it will
> timeout, in which case it is much more helpful to see
> an ECC error rather than a timeout error.
> 
> Signed-off-by: Adrian Hunter <ext-adrian.hunter@...ia.com>
> ---

Woo. I think you're the first person I've seen who has been able to
trigger an actual card error. :)

As for the patch, I like the idea but I'm not entirely satisfied with
the implementation.

> +static void print_data_error(struct mmc_blk_request *brq, struct mmc_card *card,
> +			     struct request *req)
> +{
> +	struct mmc_command cmd;
> +	char *emsg;
> +	u32 status;
> +	int status_err = 0;
> +
> +	if (brq->data.error != -ETIMEDOUT || mmc_host_is_spi(card->host))
> +		goto out_print;
> +

The error codes are more of a hint than anything else, so you should
check the status for all non-zero codes. You should also check all of
the error fields, not just data.error.

And why exclude spi?
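
As a rough illustration of checking every error field instead of only
data.error, the sketch below treats any non-zero code as a reason to
look at the card status. The structs and the any_request_error() helper
are hypothetical stand-ins, not the kernel's own types.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one stage of a request (cmd, data or stop). */
struct fake_part {
	int error;	/* 0 on success, negative errno-style code otherwise */
};

/* Hypothetical stand-in for the block request with its three stages. */
struct fake_blk_request {
	struct fake_part cmd;
	struct fake_part data;
	struct fake_part stop;
};

/* True if any stage reported an error, whatever the code happens to be. */
static bool any_request_error(const struct fake_blk_request *brq)
{
	return brq->cmd.error || brq->data.error || brq->stop.error;
}
```

The point of the design is that the specific code (-ETIMEDOUT or
anything else) is not used to decide whether the status is worth reading.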

> +	if (brq->mrq.stop)
> +		/* 'Stop' response contains card status */
> +		status = brq->mrq.stop->resp[0];
> +	else {
> +		cmd.opcode = MMC_SEND_STATUS;
> +		cmd.arg = card->rca << 16;
> +		cmd.flags = MMC_RSP_R1 | MMC_CMD_AC;
> +		status_err = mmc_wait_for_cmd(card->host, &cmd, 0);
> +		if (status_err)
> +			goto out_print;
> +		status = cmd.resp[0];
> +	}

Errors can occur on writes as well, so I suggest accumulating the
status bits instead of trying to get the entire set in one go. I.e.:

status = 0;
if (mrq.stop)
	status |= mrq.stop->resp[0];
while (card not ready)
	status |= send_status();

IOW, you should do this in the main handler (that already has the
status loop for writes).
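
The accumulation idea can be sketched in standalone C as follows. This
is only an illustration under stated assumptions: accumulate_status(),
the send_status_fn callback, the fake_card helper and the max_polls
bound are all hypothetical and not part of the driver. The
R1_READY_FOR_DATA bit position matches the kernel's definition.

```c
#include <stdint.h>

#define R1_READY_FOR_DATA	(1u << 8)

/* Hypothetical callback: issue SEND_STATUS and return the R1 word. */
typedef uint32_t (*send_status_fn)(void *ctx);

/*
 * OR together the stop response (if there was one) and every status
 * word read while waiting for the card to report ready. The card is
 * polled at least once; max_polls bounds the loop.
 */
static uint32_t accumulate_status(const uint32_t *stop_resp,
				  send_status_fn send_status, void *ctx,
				  int max_polls)
{
	uint32_t status = 0;
	uint32_t cur;

	if (stop_resp)
		status |= *stop_resp;

	do {
		cur = send_status(ctx);
		status |= cur;
	} while (!(cur & R1_READY_FOR_DATA) && --max_polls > 0);

	return status;
}

/* Hypothetical fake card that replays a fixed list of status words. */
struct fake_card {
	const uint32_t *words;
	int n;
	int pos;
};

static uint32_t fake_send_status(void *ctx)
{
	struct fake_card *card = ctx;
	uint32_t word = card->words[card->pos];

	/* Stay on the last word once the list is exhausted. */
	if (card->pos < card->n - 1)
		card->pos++;
	return word;
}
```

Because the result is an OR of several reads, it is only meaningful for
the error bits; the ready and state bits in it should be ignored.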

> +
> +	emsg = (status & R1_CARD_ECC_FAILED) ? "ECC" : "I/O";
> +

There are also some other error codes that can be of interest.
"Internal controller error" for example.

(It's probably also better to report an "Unknown" error rather than
"I/O" as the fallback.)
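
A minimal sketch of decoding more than one status bit, with "Unknown"
as the fallback, might look like this. The bit positions follow the
kernel's include/linux/mmc/mmc.h; the card_error_name() helper, the
order in which bits are tested and the exact strings are assumptions
made for illustration.

```c
#include <stdint.h>

/* R1 status bits, as defined in include/linux/mmc/mmc.h. */
#define R1_CARD_ECC_FAILED	(1u << 21)
#define R1_CC_ERROR		(1u << 20)	/* internal card controller error */
#define R1_ERROR		(1u << 19)	/* general or unknown error */

/* Hypothetical helper: map the first matching error bit to a short name. */
static const char *card_error_name(uint32_t status)
{
	if (status & R1_CARD_ECC_FAILED)
		return "ECC";
	if (status & R1_CC_ERROR)
		return "Internal controller";
	if (status & R1_ERROR)
		return "General";
	return "Unknown";
}
```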

> @@ -281,10 +322,8 @@ static int mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
>  			       req->rq_disk->disk_name, brq.cmd.error);
>  		}
>  
> -		if (brq.data.error) {
> -			printk(KERN_ERR "%s: error %d transferring data\n",
> -			       req->rq_disk->disk_name, brq.data.error);
> -		}
> +		if (brq.data.error)
> +			print_data_error(&brq, card, req);
>  

Please keep the old message and add this as a new extra piece of
information. It is helpful for debugging to see both what the driver
and the card reported.
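
One way to picture "keep the old message and add the card's view" is a
formatter that emits both lines. This is a userspace sketch only: the
first line copies the existing driver message, while
format_data_error() and the wording of the second line are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Write the driver's original message followed by an extra line with
 * the raw card status. Returns what snprintf() returns.
 */
static int format_data_error(char *buf, size_t len, const char *disk,
			     int error, uint32_t status)
{
	return snprintf(buf, len,
			"%s: error %d transferring data\n"
			"%s: card status 0x%08x\n",
			disk, error, disk, (unsigned int)status);
}
```

In the driver itself this would presumably be two printk() calls, so
the driver's error code and the card's status both end up in the log.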

Rgds
-- 
     -- Pierre Ossman

  WARNING: This correspondence is being monitored by the
  Swedish government. Make sure your server uses encryption
  for SMTP traffic and consider using PGP for end-to-end
  encryption.
