Message-ID: <20090604202900.57f31d6a@mjolnir.ossman.eu>
Date: Thu, 4 Jun 2009 20:29:00 +0200
From: Pierre Ossman <pierre@...man.eu>
To: Stefan Bader <stefan.bader@...onical.com>,
Jens Axboe <axboe@...nel.dk>
Cc: linux-kernel@...r.kernel.org, Andy Whitcroft <apw@...onical.com>
Subject: Re: [PATCH] mmc: prevent dangling block device from accessing stale queues
On Thu, 04 Jun 2009 20:00:52 +0200
Stefan Bader <stefan.bader@...onical.com> wrote:
> Kernel: 2.6.30-rc7 based
> Worked in 2.6.28 (probably only because things went at a different speed)
>
> Testcase: Use ext3/ext4 on an SD card partitioned with one primary DOS partition
> and leave it mounted across suspend/resume.
>
> Result: After resume the partition table of the SD card has been erased.
>
> The detailed description can be found at:
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/383668
>
> In essence the mmc block device frees the generic request queue before the last
> user of the gendisk has stopped using it, leaving an invalid queue pointer which
> unfortunately gets re-used before more requests come in for the old device.
>
> The bugfix will cause more I/O error messages and might not be the ultimate way
> things should work, but it prevents data from getting lost.
>
You seem to have dug a bit further than I've had time for. Do you have
anything substantial to back this up:
> + /*
> + * Calling blk_cleanup_queue() would be too soon here. As long as
> + * the gendisk has a reference to it and is not released we should
> + * keep the queue. It has been shutdown and will not accept any new
> + * requests, so that should be safe.
> + */
?
It would seem that gendisk is making some bad assumptions and needs to
be changed if that is the case.
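To make the lifetime rule in that comment concrete, here is a minimal
user-space sketch, assuming a plain integer reference count. It is not
kernel code: rc_queue and all of its helpers are hypothetical names made
up for illustration. The driver and the disk each hold one reference; the
queue is shut down first (so it rejects new requests) and is only released
when the last holder drops its reference.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a request queue; not the kernel's struct. */
struct rc_queue {
    int refcount;   /* number of holders still pointing at the queue */
    bool dead;      /* shut down: new requests are rejected */
    bool released;  /* set when the last reference is dropped; real code
                       would free the object at this point */
};

/* The creator (the driver in this sketch) starts with one reference. */
void rc_queue_init(struct rc_queue *q)
{
    q->refcount = 1;
    q->dead = false;
    q->released = false;
}

/* Any other holder (the disk in this sketch) takes its own reference. */
void rc_queue_get(struct rc_queue *q)
{
    q->refcount++;
}

/* Drop one reference; the queue is released only when none remain. */
void rc_queue_put(struct rc_queue *q)
{
    if (--q->refcount == 0)
        q->released = true;
}

/* Stop accepting requests without giving up the memory. */
void rc_queue_shutdown(struct rc_queue *q)
{
    q->dead = true;
}

/* Returns 0 if the request is accepted, -1 if the queue is shut down. */
int rc_queue_submit(struct rc_queue *q)
{
    return q->dead ? -1 : 0;
}
```

In this sketch a late request from the old disk fails with an error
instead of touching memory that has already been handed to someone else,
which matches the "more I/O error messages" trade-off described above.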
This part from the launchpad report also seems incredibly broken:
> What makes the whole thing a disaster is the fact that the block device
> queue objects are taken from a slub cache. Which means on resume, the newly
> created block device will get the same queue object as the old one,
> initialize it, and after the tasks have been resumed, ext3 feels obliged to
> write out the invalidated superblocks (still not sure why it goes for
> sector 0) which will happily migrate to the new block device and cause
> confusion.
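The aliasing described in that report can be shown with a toy object
cache. This is only a sketch under stated assumptions: a fixed pool with a
LIFO free list stands in for the slab cache, and toy_queue, toy_disk and
their helpers are hypothetical names, not kernel interfaces. The point is
that an object freed and immediately reallocated comes back at the same
address, so a holder that kept its raw pointer now talks to the new owner.

```c
#include <stddef.h>

/* Hypothetical stand-in for a request queue; not the kernel's struct. */
struct toy_queue {
    int owner_id;                /* device this queue currently serves */
    struct toy_queue *next_free; /* link used while on the free list */
};

#define POOL_SIZE 4
static struct toy_queue pool[POOL_SIZE];
static struct toy_queue *free_list;
static int pool_ready;

/* Thread all pool entries onto the free list, pool[0] at the head. */
static void pool_init(void)
{
    free_list = NULL;
    for (int i = POOL_SIZE - 1; i >= 0; i--) {
        pool[i].next_free = free_list;
        free_list = &pool[i];
    }
    pool_ready = 1;
}

/* Pop the most recently freed object, as a LIFO cache would. */
struct toy_queue *toy_queue_alloc(int owner_id)
{
    if (!pool_ready)
        pool_init();
    struct toy_queue *q = free_list;
    if (q == NULL)
        return NULL;
    free_list = q->next_free;
    q->owner_id = owner_id;
    q->next_free = NULL;
    return q;
}

/* Push the object back; nothing tells existing holders it is gone. */
void toy_queue_free(struct toy_queue *q)
{
    q->next_free = free_list;
    free_list = q;
}

/* A disk that keeps a raw queue pointer without holding a reference. */
struct toy_disk {
    int id;
    struct toy_queue *queue;
};

/* Returns the id of the device that actually receives the request. */
int toy_disk_submit(const struct toy_disk *d)
{
    return d->queue->owner_id;
}
```

If a disk with id 1 allocates a queue, the queue is freed while the disk
still points at it, and a disk with id 2 then allocates one, both disks
hold the same pointer and toy_disk_submit() on the old disk reports owner
2. In the toy the memory is a static pool, so the stale read is well
defined; with a real allocator the same pattern is a use-after-free.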
Jens, comments?
Rgds
--
-- Pierre Ossman
WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.