Message-ID: <ac3eb2510810301623n6a7256a1td59d3e38a01add4c@mail.gmail.com>
Date:	Fri, 31 Oct 2008 00:23:20 +0100
From:	"Kay Sievers" <kay.sievers@...y.org>
To:	"Folkert van Heusden" <folkert@...heusden.com>
Cc:	"Peter Zijlstra" <a.p.zijlstra@...llo.nl>,
	linux-kernel@...r.kernel.org
Subject: Re: [2.6.26] kobject_add_internal failed for 2:0 with -EEXIST / unable to handle kernel NULL pointer dereference in sysfs_create_link

On Fri, Oct 31, 2008 at 00:06, Kay Sievers <kay.sievers@...y.org> wrote:
> On Thu, Oct 30, 2008 at 11:55, Folkert van Heusden
> <folkert@...heusden.com> wrote:
>>> >> >> >> > While running my http://vanheusden.com/pyk/ script (which randomly
>>> >> >> >> > inserts and removes modules) I triggered the following oops in a 2.6.26
>>> >> >> >> > kernel on an IBM xSeries 260. This oops (in fact no oops at all) did not
>>> >> >> >> > get triggered in a 2.6.18 kernel on that system.
>>> >> >> >> >
>>> >> >> >> > [   42.507375] FDC 0 is a National Semiconductor PC87306
>>> >> >> >> > [   42.509057] kobject_add_internal failed for 2:0 with -EEXIST, don't try to register things with the same name in the same directory.
>>> >> >> >> > [   42.509291] Pid: 5301, comm: modprobe Not tainted 2.6.26-1-amd64 #1
>>> >> >> >> > [   42.509431]
>>> >> >> >> > [   42.509433] Call Trace:
>>> >> >> >> > [   42.509685]  [<ffffffff8031b031>] kobject_add_internal+0x13f/0x17e
>>> >> > ...
>>> >> >> >> > [   42.511519]  [<ffffffff8027d23b>] bdi_register+0x57/0xb4
>>> >> >> >>
>>> >> >> >> Looks like bdi sees two devices with the same devnum, or didn't
>>> >> >> >> cleanup an old entry. What does: ls -l "/sys/class/bdi/" print?
>>> >> >> >
>>> >> >> > The following:
>>> >> >> > folkert@...iantesthw:~$ ls -l /sys/class/bdi/
>>> >> >> > drwxr-xr-x 3 root root 0 2008-10-28 18:32 2:0
>>> >> >> > drwxr-xr-x 3 root root 0 2008-10-28 18:32 2:1
>>> >> >>
>>> >> >> Oh, you are running the old sysfs layout without symlinks. Care to
>>> >> >> tell where the "device" link in these directories points to?
>>> >> >
>>> >> > None exist:
>>> >> > folkert@...iantesthw:~$ ls -la /sys/class/bdi/*/device
>>> >> > ls: cannot access /sys/class/bdi/*/device: No such file or directory
>>> >>
>>> >> Ah, sorry. Seems the bdi stuff never got to pass the usual parent
>>> >> device with the device registration, to let the bdi device show up at
>>> >> the right place in the device tree.
>>> >>
>>> >> Let's see what current devices on your box have the major 2:
>>> >>   find /sys -name dev | xargs grep '^2:'
>>> >
>>> > /sys/block/fd0/dev:2:0
>>> > /sys/block/fd1/dev:2:1
>>> >
>>> > As my script does modprobe/rmmod in parallel (4 processes), maybe it is a
>>> > conflict of one process doing a modprobe of floppy while the other does
>>> > an rmmod? Or both doing a modprobe?
>>>
>>> Might be, yes. If you just bootup, and don't run your modprobe/rmmod
>>> script, does the box have 2 floppy devices in /sys too?
>>
>> Yes it does. One physical drive.
>
> Seems that always happens with multiple floppies. I can reproduce it
> here with qemu. It seems not related to modprobing. Also mtd devices
> suffer from the same problem, as bug reports show.
>
> It might be a bug in bdi. It looks like the floppies share a single
> queue, and the bdi structure lives in the queue. We register a bdi
> device for every device, but since the queue is shared, the dev_t
> previously recorded in the bdi structure is overwritten. When the bdi
> device is unregistered, none of the earlier devices using the same
> queue are removed.
>
> Peter, could you please check whether something like this can happen?

Ok, I get annoyed by these sysfs bugs. :)

Peter, it looks like bdi does not work for devices which share a single queue.
If I add:
  --- a/mm/backing-dev.c
  +++ b/mm/backing-dev.c
  @@ -184,6 +184,8 @@ int bdi_register(struct backing_dev_info *bdi, struct device *parent,
                  goto exit;
          }

  +       printk("XXXXXXX old bdidev is %p\n", bdi->dev);
  +       printk("XXXXXXX new bdidev is %p\n", dev);
          bdi->dev = dev;
          bdi_debug_register(bdi, dev_name(dev));

I get:
  $ modprobe floppy
  Floppy drive(s): fd0 is 1.44M, fd1 is 1.44M
  FDC 0 is a S82078B
  XXXXXXX old bdidev is 0000000000000000
  XXXXXXX new bdidev is ffff88001f20cd10
  XXXXXXX old bdidev is ffff88001f20cd10
  XXXXXXX new bdidev is ffff88001f20de30

which strongly suggests that bdi will remove only the last registered
device and never any of the earlier ones, right?
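
To make the suspected sequence concrete, here is a minimal userspace sketch
of it. This is not the kernel code: fake_dev, fake_bdi, registry and the
fake_bdi_* helpers are hypothetical stand-ins, and the only assumption
carried over from the output above is that a second register call on the
same bdi overwrites the stored device pointer, while unregister only looks
at that one pointer.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVS 4

/* Hypothetical stand-in for a registered bdi class device. */
struct fake_dev {
	int major;
	int minor;
	int in_use;	/* 1 while the device counts as registered */
};

/* Hypothetical stand-in for the bdi embedded in a (shared) queue. */
struct fake_bdi {
	struct fake_dev *dev;	/* only one pointer, as in bdi->dev */
};

static struct fake_dev registry[MAX_DEVS];

/* Register a device and record it in the bdi, overwriting any older pointer. */
static void fake_bdi_register(struct fake_bdi *bdi, int major, int minor)
{
	size_t slot = 0;

	while (slot < MAX_DEVS && registry[slot].in_use)
		slot++;
	assert(slot < MAX_DEVS);

	registry[slot].major = major;
	registry[slot].minor = minor;
	registry[slot].in_use = 1;
	bdi->dev = &registry[slot];	/* the earlier device is forgotten here */
}

/* Unregister only the device the bdi currently points to. */
static void fake_bdi_unregister(struct fake_bdi *bdi)
{
	if (bdi->dev) {
		bdi->dev->in_use = 0;
		bdi->dev = NULL;
	}
}

/* Two devices share one bdi; returns how many stay registered afterwards. */
int simulate_shared_queue(void)
{
	struct fake_bdi shared = { NULL };
	int left = 0;
	size_t i;

	for (i = 0; i < MAX_DEVS; i++)
		registry[i].in_use = 0;

	fake_bdi_register(&shared, 2, 0);	/* fd0 */
	fake_bdi_register(&shared, 2, 1);	/* fd1, same queue, same bdi */
	fake_bdi_unregister(&shared);		/* drops 2:1 only */

	for (i = 0; i < MAX_DEVS; i++)
		left += registry[i].in_use;
	return left;
}
```

Called from a small test program, simulate_shared_queue() returns 1: the
2:0 entry is still marked as registered after the unregister. If the real
code behaves the same way, that stale entry would explain the -EEXIST on
the next modprobe.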

Thanks,
Kay