lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.1.10.0806090858310.3473@woody.linux-foundation.org>
Date:	Mon, 9 Jun 2008 09:15:40 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Cornelia Huck <cornelia.huck@...ibm.com>,
	Vegard Nossum <vegard.nossum@...il.com>,
	Adrian Bunk <bunk@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Jens Axboe <jens.axboe@...cle.com>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	"Rafael J. Wysocki" <rjw@...k.pl>,
	Kay Sievers <kay.sievers@...y.org>, Neil Brown <neilb@...e.de>,
	Mariusz Kozlowski <m.kozlowski@...land.pl>,
	Dave Young <hidave.darkstar@...il.com>
Subject: Re: [bug, 2.6.26-rc4/rc5] sporadic bootup crashes in
 blk_lookup_devt()/prepare_namespace()



On Mon, 9 Jun 2008, Ingo Molnar wrote:
> 
> ah. I suspect that explains the sporadic nature as well: normally there 
> is 'some' object at the list address, just with an invalid type.

Yes. It could cause two kinds of problems:

 - it might end up returning the wrong 'dev_t'. This is unlikely, since we 
   only have two cases: the working whole-disk case, and the case where we 
   find a partition.

   But if we find a partition, we'd still get the right dev_t *most* of 
   the time, because we'd first get called with "part=0", and then we have 

	if (part < disk->minors)
		devt = MKDEV(MAJOR(dev->devt),
			MINOR(dev->devt) + part);
	break;

   where we would only fail if that conditional statement would be untrue 
   (and then we'd incorrectly return MKDEV(0,0)). Otherwise, 'devt' ends 
   up being correct anyway.

   So one effect of this bug would be that it would use the random 
   "disk->minors" value to either return the right devt, or return one 
   that is all zeroes. But if we return the all-zeroes case, then 
   init/do_mounts.c will just try again, this time with the numbers 
   removed, and now it wouldn't hit the "strcmp()" on any partition, and 
   the next time around it would find a disk and work again.

   So this is a bug, but it's one that essentially is hidden by the 
   caller.

 - The other alternative is that the bogus "disk->minors" thing would 
   cause a page fault. This would only happen if the partition allocation 
   was the first thing in a page, and the previous page was unused, and 
   you had DEBUG_PAGEALLOC enabled.

   This is obviously the case you saw.

My trivial fix makes it ignore partitions entirely.

We *could* (and perhaps should) do something slightly more involved 
instead, which actually uses a partition if it's there). Like this. That 
would avoid my one nagging worry (that some clever usage makes partitions 
with a different numbering or without a base block device).

And this is all still ignoring the locking issue, of course. It would be 
trivial to just remove the block_class_lock, and change

	mutex_[un]lock(&block_class_lock);

into

	down|up(&block_class.sem);

except for _one_ case, which is

	bdev_map = kobj_map_init(base_probe, &block_class_lock);

which really wants a mutex, not a sempahore.

So to fix that, we'd need to make the class->sem be a mutex, and pass that 
in. Which is probably a good change too, but makes the whole thing much 
bigger.

		Linus

---
 block/genhd.c |   14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 129ad93..101530e 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -661,11 +661,15 @@ dev_t blk_lookup_devt(const char *name, int part)
 	mutex_lock(&block_class_lock);
 	list_for_each_entry(dev, &block_class.devices, node) {
 		if (strcmp(dev->bus_id, name) == 0) {
-			struct gendisk *disk = dev_to_disk(dev);
-
-			if (part < disk->minors)
-				devt = MKDEV(MAJOR(dev->devt),
-					     MINOR(dev->devt) + part);
+			if (dev->type == &disk_type) {
+				struct gendisk *disk = dev_to_disk(dev);
+				if (part >= disk->minors)
+					continue;
+			} else {
+				if (part)
+					continue;
+			}
+			devt = dev->devt + part;
 			break;
 		}
 	}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ