[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <4663DE68.5030200@gmail.com>
Date: Mon, 04 Jun 2007 11:42:00 +0200
From: Rene Herman <rene.herman@...il.com>
To: Jens Axboe <axboe@...e.de>
CC: Pekka Enberg <penberg@...helsinki.fi>,
Linux Kernel <linux-kernel@...r.kernel.org>
Subject: [Q] Bio traversal trouble?
Hi Jens, list.
The new Mitsumi legacy CD-ROM driver seems largely done. Both audio and data
work (and the driver's clean) but data does still have a problem that I think
needs some input from a block person. I don't have any idea...
Audio is fine, and data basically as well. It's performing at maximum drive
speed (and gets the correct data):
root@...5:~# dd if=/dev/mitsumi of=/dev/null bs=2k count=8k
8192+0 records in
8192+0 records out
16777216 bytes (17 MB) copied, 111.379 seconds, 151 kB/s
Unfortunately, I can make the box oops by just doing enough of those dd runs in
a row (this one with bs=2k count=128, oopsed the seventh time or something):
===
BUG: unable to handle kernel paging request at virtual address 8c1d2071
printing eip:
c10a6d6f
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: mitsumi nfsd exportfs lockd sunrpc nls_cp437 msdos fat nls_base
CPU: 0
EIP: 0060:[<c10a6d6f>] Not tainted VLI
EFLAGS: 00010246 (2.6.21.3 #1)
EIP is at ioread8_rep+0x20/0x31
eax: 00010000 ebx: 00010300 ecx: 00000800 edx: 00000300
esi: c3145b30 edi: 8c1d2071 ebp: 8c1d2071 esp: c3145aec
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Process dd (pid: 1770, ti=c3145000 task=c3114110 task.ti=c3145000)
Stack: 00000000 c3a7ad48 c486504d c4865ab7 00000006 50020034 c1015f7b 00000292
00000000 00520300 00000100 0007e000 00000000 c3340300 c10a0531 00000000
c12adf60 00000000 00000001 00000000 00000000 00000000 dead4ead ffffffff
Call Trace:
[<c486504d>] __mitsumi_get_frame+0xc/0x16 [mitsumi]
[<c4865ab7>] mitsumi_get_frame+0x120/0x134 [mitsumi]
[<c1015f7b>] lock_timer_base+0x15/0x2f
[<c10a0531>] cfq_set_request+0x0/0x144
[<c114fefd>] _spin_unlock_irq+0x20/0x23
[<c1033bda>] mempool_free+0x5f/0x64
[<c4865b2d>] mitsumi_read+0x62/0x94 [mitsumi]
[<c4865ba8>] mitsumi_request+0x49/0xb5 [mitsumi]
[<c1099aae>] blk_start_queueing+0x14/0x1c
[<c10971c2>] elv_insert+0xa3/0x13f
[<c114fbc5>] _spin_lock_irq+0x2f/0x3a
[<c109a577>] __make_request+0x29f/0x2d3
[<c109a75b>] generic_make_request+0x136/0x146
[<c1048433>] kmem_cache_alloc+0x93/0x9e
[<c1033ae0>] mempool_alloc+0x32/0xcd
[<c1033ae0>] mempool_alloc+0x32/0xcd
[<c109a80f>] submit_bio+0xa4/0xaa
[<c106736b>] bio_alloc_bioset+0xb2/0x112
[<c1066dd1>] submit_bh+0xc2/0xdb
[<c1065fc4>] block_read_full_page+0x245/0x252
[<c1068346>] blkdev_get_block+0x0/0x36
[<c114ffff>] _write_unlock_irq+0x20/0x23
[<c103134f>] add_to_page_cache+0x71/0x78
[<c10368f3>] read_pages+0x7c/0xc1
[<c114ff7e>] _read_unlock_irq+0x20/0x23
[<c114ff7e>] _read_unlock_irq+0x20/0x23
[<c1025de0>] trace_hardirqs_on+0x11b/0x13e
[<c1036a25>] __do_page_cache_readahead+0xed/0x107
[<c1036b43>] blockable_page_cache_readahead+0x4d/0x9d
[<c1036c17>] make_ahead_window+0x84/0xa5
[<c1036d74>] page_cache_readahead+0x13c/0x15b
[<c1031cbc>] file_read_actor+0x7f/0x102
[<c1031953>] do_generic_mapping_read+0x110/0x3fa
[<c1031ec6>] generic_file_aio_read+0x187/0x1b0
[<c1031c3d>] file_read_actor+0x0/0x102
[<c104abc3>] do_sync_read+0xbf/0xfc
[<c101f2b8>] autoremove_wake_function+0x0/0x33
[<c1070fe1>] dnotify_parent+0x1b/0x66
[<c104aec2>] vfs_write+0xc9/0xff
[<c1027210>] __lock_release+0x2d/0x3f
[<c104ac87>] vfs_read+0x87/0xfd
[<c104af39>] sys_read+0x41/0x67
[<c1002490>] syscall_call+0x7/0xb
=======================
Code: 00 00 89 c8 ef c3 0f c9 89 0a c3 57 3d ff ff 03 00 53 89 d7 89 c3 89 ca 77
15 66 31 c0 3d 00 00 01 00 74 04 0f 0b eb fe 0f b7 d3 <f3> 6c eb 0a 4a 78 07 8a
03 88 07 47 eb f6 5b 5f c3 57 3d ff ff
EIP: [<c10a6d6f>] ioread8_rep+0x20/0x31 SS:ESP 0068:c3145aec
===
This is obviously not good.
Since the last time this driver was posted it changed considerably and as one of
the chances it's now requesting just one hardware sector ("frame") at a time
from the drive as requesting multiple didn't actually work -- I seemed to have
fouled up earlier tests somehow.
The driver could let the block layer handle just doing one frame at a time by
completing only one frame (4 sectors) each time but this is not a good idea.
Things are somewhat timing sensitive since if you're too late the next sector
has passed below you (and if you're too early you're overrunning the drive)
causing throughput to plummet.
I'd also simply like to understand it, so this is doing a manual bio traversal,
requesting frames from the hardware as it goes along. The driver's main request
function is:
static void mitsumi_request(struct request_queue *q)
{
struct request *req;
while ((req = elv_next_request(q))) {
struct bio *bio;
int sectors = 0;
if (!blk_fs_request(req)) {
end_request(req, 0);
continue;
}
if (rq_data_dir(req) != READ) {
printk(KERN_WARNING
"mitsumi: non-read request to CD\n");
end_request(req, 0);
continue;
}
spin_unlock_irq(q->queue_lock);
rq_for_each_bio(bio, req) {
unsigned int bytes;
bytes = mitsumi_read(req->rq_disk->private_data, bio);
sectors += bytes >> 9;
if (bytes != bio->bi_size)
break;
}
spin_lock_irq(q->queue_lock);
if (!sectors || end_that_request_first(req, 1, sectors)) {
end_request(req, 0);
continue;
}
blkdev_dequeue_request(req);
end_that_request_last(req, 1);
}
}
That is, it's looping around doing one bio at a time, completing the number of
sectors that were done and failing the sector that had an error if any weren't.
(I've had many variants of that end_request stuff in there, none helped with
this problem).
mitsumi_read() is:
static unsigned int mitsumi_read(struct mitsumi_cdrom *mcd, struct bio *bio)
{
sector_t frame = sector_to_frame(bio->bi_sector);
unsigned int bytes = 0;
struct bio_vec *bvec;
int segno;
bio_for_each_segment(bvec, bio, segno) {
unsigned char *dst;
unsigned int len;
dst = page_address(bvec->bv_page) + bvec->bv_offset;
for (len = bvec->bv_len; len; len -= CD_FRAMESIZE) {
if (mitsumi_get_frame(mcd, frame++, dst))
goto out;
bytes += CD_FRAMESIZE;
dst += CD_FRAMESIZE;
}
}
out:
return bytes;
}
bvec->bv_len is guaranteed to be a multiple of the hardware blocksize (the size
set with blk_queue_hardsect_size) meaning that "len" loop is okay, right?
mitsumi_get_frame is the function that requests the frame from the hardware,
possibly sleeps waiting for it to arrive and then reads the frame from I/O
directly to "dst" ([edit] it _is_ legal to use complete() from an interrupt
handler isn't it?)
Is there anything wrong in all this? The thing I could imagine is that it's not
actually legal to drop the queue_lock, but if it's not legal to drop the lock
where this does so now (back in mitsumi_request) it seems it wouldn't be legal
to drop it anywhere since you'd only shorten the window by doing it elsewhere.
If it's illegal period, this would mean this all needs a completely different
setup it seems...
When it faults, it's (generally at least, I believe I saw it fault directly
inside mitsumi_read once as well) at __mitsumi_get_frame, where it's trying to
read the data from the drive to "dst", and "dst" seems corrupted. This in turn
seems to be due to the entire bio being corrupted and the bio_for_each_segment()
looping past the end.
The driver as it stands now is attached but most of it doesn't actually have
anything to do with block, so I hope I've summarised it effectively above. Any
hint you or anyone else reading this would be able to provide would be welcome.
Rene.
View attachment "mitsumi.diff" of type "text/plain" (26144 bytes)
Powered by blists - more mailing lists