linux-kernel - RE: help! locks problem in block layer request queue?

Message-ID: <38D9F46DFF92C54980D2F2C1E8EE3130248428D8@pdsmsx503.ccr.corp.intel.com>
Date:	Sat, 21 Feb 2009 17:24:57 +0800
From:	"Gao, Yunpeng" <yunpeng.gao@...el.com>
To:	Jens Axboe <jens.axboe@...cle.com>
CC:	"linux-ide@...r.kernel.org" <linux-ide@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: help! locks problem in block layer request queue?

Really awesome! This was a big bug. I have rewritten the code that processes requests from the request queue. The new code is copied from drivers/mtd/mtd_blkdevs.c, with some necessary modifications. Now it works well. Many thanks to you :)

BTW, I noticed that the MTD driver (drivers/mtd/mtd_blkdevs.c) and the MMC driver (drivers/mmc/card/block.c and queue.c) also register a block device, and they create a kernel thread to process the request queue instead of processing it directly. Why do they do it like that? Is there any special reason?

Thanks a lot.

Rgds,
Yunpeng Gao 

-----Original Message-----
From: Jens Axboe [mailto:jens.axboe@...cle.com] 
Sent: February 19, 2009 21:13
To: Gao, Yunpeng
Cc: linux-ide@...r.kernel.org; linux-kernel@...r.kernel.org
Subject: Re: help! locks problem in block layer request queue?

On Thu, Feb 19 2009, Gao, Yunpeng wrote:
> 
> Hi all,
> 
> Sorry for the long email, but I hit a kernel oops when testing my
> standalone NAND block driver (it's almost a normal block device
> driver), and I am not sure why it happens.
> 
> In my development environment, the Linux 2.6.27 kernel boots with an
> initrd and then chroots to an MMC card. After the chroot, I tried to
> run mkfs.ext3 on the NAND device, but it caused a kernel oops. If I
> run mkfs.ext3 on the NAND device before the chroot, it works well (it
> can mount/umount and copy files correctly across system reboots).
> 
> Below is the log message (/dev/mmcblk0 is the MMC card device node,
> and /dev/nda is the NAND flash device node) and part of the driver
> code.
> 
> From the oops message, it seems there is improper use of locks in my
> driver code, but there is actually only one spinlock in the driver
> (spinlock_t qlock, defined in struct spectra_nand_dev), and it is only
> used by the registered request queue. I also use a semaphore
> ('spectra_sem') to prevent the low-level function from being
> re-entered. The low-level (hardware) layer currently works in PIO mode
> and is very slow, so maybe it holds the spinlock or semaphore for too
> long?

You call bvec_kmap_irq() and then call a function that does a
down(). This is illegal, as you cannot block/schedule with interrupts
disabled.
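
The ordering problem can be illustrated with a small user-space model.
This is only a sketch and not the driver's code: every mock_* function,
the irqs_off flag, and the bounce-buffer approach are assumptions made
up for illustration. The mocks stand in for bvec_kmap_irq(),
bvec_kunmap_irq(), down(), and up(), and the model simply counts how
often a call that may sleep is made while interrupts are modelled as
disabled.

```c
/* User-space model only: the mock_* names are hypothetical stand-ins,
 * not kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SECTOR_SIZE 512

static bool irqs_off;        /* models local interrupts being disabled */
static int sleep_violations; /* counts "may sleep while atomic" events */
static char nand[SECTOR_SIZE]; /* models one sector of the NAND device */

/* Stand-in for bvec_kmap_irq(): maps the buffer, interrupts go off. */
static char *mock_kmap_irq(char *page)
{
	irqs_off = true;
	return page;
}

/* Stand-in for bvec_kunmap_irq(): unmaps, interrupts come back on. */
static void mock_kunmap_irq(void)
{
	assert(irqs_off);
	irqs_off = false;
}

/* Stand-in for down(): may sleep, so it must not run with irqs off. */
static void mock_down(void)
{
	if (irqs_off)
		sleep_violations++;
}

/* Stand-in for up(): never sleeps, nothing to check. */
static void mock_up(void)
{
}

/* Stand-in for the slow PIO write done by the low-level layer. */
static void mock_nand_write(const char *src)
{
	memcpy(nand, src, SECTOR_SIZE);
}

/* Order described in the reply: semaphore taken while still mapped. */
static void transfer_buggy(char *page)
{
	char *buf = mock_kmap_irq(page);

	mock_down();              /* violation: may sleep with irqs off */
	mock_nand_write(buf);
	mock_up();
	mock_kunmap_irq();
}

/* One possible reordering: copy under the mapping, unmap, then sleep. */
static void transfer_fixed(char *page)
{
	char bounce[SECTOR_SIZE];
	char *buf = mock_kmap_irq(page);

	memcpy(bounce, buf, SECTOR_SIZE); /* short, non-sleeping work only */
	mock_kunmap_irq();

	mock_down();              /* safe: interrupts are enabled again */
	mock_nand_write(bounce);
	mock_up();
}
```

In this model, transfer_buggy() records one violation and
transfer_fixed() records none. The bounce buffer is just one way to get
the sleeping call out of the mapped region; the follow-up at the top of
this thread says the driver was instead rewritten along the lines of
drivers/mtd/mtd_blkdevs.c.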

-- 
Jens Axboe