Date:	Tue, 02 Dec 2008 13:18:36 +0900
From:	Tejun Heo <tj@...nel.org>
To:	"Nicholas A. Bellinger" <nab@...ux-iscsi.org>
CC:	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
	Mike Anderson <andmike@...ux.vnet.ibm.com>,
	Mike Christie <michaelc@...wisc.edu>,
	Christoph Hellwig <hch@....de>,
	James Bottomley <James.Bottomley@...senPartnership.com>,
	Andrew Morton <akpm@...l.org>,
	Alan Stern <stern@...land.harvard.edu>,
	Hannes Reinecke <hare@...e.de>,
	Boaz Harrosh <bharrosh@...asas.com>,
	Jens Axboe <jens.axboe@...cle.com>,
	linux-scsi <linux-scsi@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	"Linux-iSCSI.org Target Dev" 
	<linux-iscsi-target-dev@...glegroups.com>
Subject: Re: Changes to Linux/SCSI target mode infrastructure for v2.6.28

Nicholas A. Bellinger wrote:
>>> So far during my initial testing, I am running into two different
>>> exceptions.  One is a NULL pointer dereference OOPS after half a dozen
>>> Open/iSCSI login/logouts in block/elevator.c:elv_dequeue_request().
>>> Here is the trace from SCSI softirq context:
>>>
>>> http://linux-iscsi.org/builds/user/nab/2.6.28-rc6-oops-0.png
>>> http://linux-iscsi.org/builds/user/nab/2.6.28-rc6-oops-1.png

Can you build with debug info and find out which line is the offending
one?

>>> The other one is a BUG_ON in block/blk-timeout.c:177 in blk_add_timer()
>>> that happens after a few hundred MB of READ_10 traffic, which also
>>> appears to pass through elv_dequeue_request() at some point:
>>>
>>> http://linux-iscsi.org/builds/user/nab/2.6.28-rc6-oops-2.png
>>> http://linux-iscsi.org/builds/user/nab/2.6.28-rc6-oops-4.png

Hmmm... this means blk_add_timer() is being called after the request
has already completed.  All the problems discovered so far have had to
do with a timeout going off without the low-level driver knowing about
the request.  I don't have a good idea of what's happening here, and
it'll probably be best to trace what's going on using blktrace or
printks.  Maybe this is caused by list corruption, as with the first
issue, or by request completion racing with requeueing?
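
To make the suspected ordering problem concrete, here is a rough
user-space sketch of the invariant involved.  It is only a model and
not the kernel code: struct model_request and all model_* functions
are hypothetical names made up for illustration, and the fprintf()
stands in for the kind of printk tracing suggested above.  It assumes
the sanity check rejects arming a timer for a request that is already
completed or already has a timeout pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a block request; NOT the kernel's struct request. */
struct model_request {
    int id;
    bool complete;    /* request has been completed */
    bool timer_armed; /* a timeout is currently pending */
};

/*
 * Arm the timeout for a request being handed to the driver.
 * Returns 0 on success, or -1 in the states where a BUG_ON-style
 * sanity check would fire (already completed, or timer already pending).
 */
static int model_add_timer(struct model_request *rq)
{
    assert(rq != NULL);
    if (rq->complete || rq->timer_armed) {
        /* printk-style trace line to show the state at the failing call */
        fprintf(stderr, "rq %d: add_timer rejected (complete=%d armed=%d)\n",
                rq->id, rq->complete, rq->timer_armed);
        return -1;
    }
    rq->timer_armed = true;
    return 0;
}

/* Completion cancels the pending timeout and marks the request done. */
static void model_complete(struct model_request *rq)
{
    rq->timer_armed = false;
    rq->complete = true;
}

/* Requeue resets the state so the request may be dispatched again. */
static void model_requeue(struct model_request *rq)
{
    rq->timer_armed = false;
    rq->complete = false;
}

/* Ordered sequence: dispatch, requeue, dispatch again. */
int model_ordered_sequence(void)
{
    struct model_request rq = { .id = 1 };

    if (model_add_timer(&rq) != 0)
        return 1; /* unexpected: first dispatch should succeed */
    model_requeue(&rq);
    return model_add_timer(&rq);
}

/* Racy sequence: a completion slips in between requeue and re-dispatch. */
int model_racy_sequence(void)
{
    struct model_request rq = { .id = 2 };

    if (model_add_timer(&rq) != 0)
        return 1; /* unexpected: first dispatch should succeed */
    model_requeue(&rq);
    model_complete(&rq); /* completion wins the race */
    return model_add_timer(&rq);
}
```

In this model the ordered sequence arms the timer twice without
complaint, while the racy one is rejected on the second call.  Whether
the real failure follows this ordering is exactly what the tracing
would have to confirm.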

-- 
tejun