Message-Id: <20080130.202659.104027826.k-ueda@ct.jp.nec.com>
Date:	Wed, 30 Jan 2008 20:26:59 -0500 (EST)
From:	Kiyoshi Ueda <k-ueda@...jp.nec.com>
To:	bzolnier@...il.com, rdreier@...co.com, bbpetkov@...oo.de
Cc:	nai.xia@...il.com, flo@...822.org, linux-kernel@...r.kernel.org,
	jens.axboe@...cle.com, j-nomura@...jp.nec.com,
	k-ueda@...jp.nec.com, linux-ide@...r.kernel.org
Subject: Re: kernel BUG at ide-cd.c:1726 in 2.6.24-03863-g0ba6c33 &&
 -g8561b089

Hi Roland, Borislav, Bart,

I have added the linux-ide ML, since we may be able to get help from
other IDE experts.  This thread started from:
    http://lkml.org/lkml/2008/1/29/140

On Tue, 29 Jan 2008 18:23:56 -0500 (EST), Kiyoshi Ueda wrote:
> Hi Bart, 
> 
> On Tue, 29 Jan 2008 14:22:53 -0800, Roland Dreier wrote:
> > Hi, I saw the same BUG from ide-cd on one of my systems.  I applied
> > the debugging patch to replace the BUG with blk_dump_rq_flags(), and I
> > got the output below (full boot log and .config attached to this
> > email).
> > 
> > Please let me know if there's anything else that would help debug the
> > problem.
> 
> Thank you for the information, Roland.
> 
>  
> > [    4.072271] Uniform CD-ROM driver Revision: 3.20
> > [    4.098236] ide-cd: rq still having bio: dev hda: type=2, flags=114c8
> > [    4.100269]
> > [    4.100269] sector 1949759, nr/cnr 0/0
> > [    4.100269] bio ffff8102418cc600, biotail ffff8102418cc600, buffer 0000000000000000, d8
> > [    4.100269] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
> > [    4.101005] ide-cd: rq still having bio: dev hda: type=2, flags=114c8
> > [    4.104269]
> > [    4.104269] sector 1949759, nr/cnr 0/0
> > [    4.104269] bio ffff8102418cc600, biotail ffff8102418cc600, buffer 0000000000000000, d2
> > [    4.104269] cdb: 12 00 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
> > [    4.109203] ide-cd: rq still having bio: dev hda: type=2, flags=114c8
> > [    4.112270]
> > [    4.112270] sector 1949759, nr/cnr 0/0
> > [    4.112270] bio ffff8102418cc600, biotail ffff8102418cc600, buffer 0000000000000000, d8
> > [    4.112270] cdb: 12 01 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
> > [    4.112945] ide-cd: rq still having bio: dev hda: type=2, flags=114c8
> > [    4.116270]
> > [    4.116270] sector 1949759, nr/cnr 0/0
> > [    4.116270] bio ffff8102418cc600, biotail ffff8102418cc600, buffer 0000000000000000, d2
> > [    4.116270] cdb: 12 01 00 00 fe 00 00 00 00 00 00 00 00 00 00 00
> 
> Bart,
> This means that the rq still has a bio even after DRQ_STAT is cleared.
> The original ide-cd code was calling only end_that_request_last() there.
> So I thought that the rq should have no bio when DRQ_STAT is cleared,
> otherwise the bio leaks.
> 
> Was my understanding wrong and is that correct behavior in ide-cd?

I borrowed a box with the same nForce chipset and used the sg_inq
command to submit a GPCMD_INQUIRY ("cdb: 12" in the debug messages).
Using the same debug patch, I confirmed that ide-cd runs through this
code path (DRQ_STAT == 0), but on my test box the requests never have
a bio there.  So I can't reproduce the problem yet.
-----------------------------------------------------------------------
ide-cd: rq: dev hda: type=2, flags=114c8

sector 37958141, nr/cnr 0/0
bio 00000000, biotail f78e4980, buffer 00000000, data 00000000, len 0
cdb: 12 00 00 00 24 00 00 00 00 00 00 00 00 00 00 00
-----------------------------------------------------------------------


The original code called only end_that_request_last() here,
and no problem happened.
This may mean that the upper layer can handle the rq correctly
whether or not the rq still has a bio.
If so, we should be able to unlink the bio by calling
end_that_request_chunk() with the remaining data size.



Roland,
Could you try the patch below and give me all boot messages again?

This patch displays debug messages for requests that still have a bio,
then tries to unlink all bios from the rq before the rq is completed.
So your system may be able to continue working correctly
after displaying the debug messages.
I'd like to see the debug messages and know whether your system
still hits the problem or not.

--- a/drivers/ide/ide-cd.c	2008-01-30 18:24:51.000000000 -0500
+++ b/drivers/ide/ide-cd.c	2008-01-30 18:24:33.000000000 -0500
@@ -1722,8 +1722,18 @@ static ide_startstop_t cdrom_newpc_intr(
 	 */
 	if ((stat & DRQ_STAT) == 0) {
 		spin_lock_irqsave(&ide_lock, flags);
-		if (__blk_end_request(rq, 0, 0))
-			BUG();
+		if (__blk_end_request(rq, 0, 0)) {
+			blk_dump_rq_flags(rq, "ide-cd: rq still having bio");
+			printk("backup: data_len=%u  bi_size=%u\n",
+				rq->data_len, rq->bio->bi_size);
+
+			if (__blk_end_request(rq, 0, rq->data_len)) {
+				blk_dump_rq_flags(rq, "ide-cd: BAD rq");
+				printk("backup: data_len=%u  bi_size=%u\n",
+					rq->data_len, rq->bio->bi_size);
+				BUG();
+			}
+		}
 		HWGROUP(drive)->rq = NULL;
 		spin_unlock_irqrestore(&ide_lock, flags);
 
Thanks,
Kiyoshi Ueda