Date:   Wed, 8 Aug 2018 15:07:47 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Mauro Carvalho Chehab <mchehab@...nel.org>,
        Borislav Petkov <bp@...en8.de>
Cc:     linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
        "Lee, Chun-Yi" <joeyli.kernel@...il.com>,
        Tony Luck <tony.luck@...el.com>
Subject: sb_edac.c lacks PCI domain support?

I think sb_edac.c (and probably other EDAC stuff) lacks PCI domain
support.  I notice messages like this:

  [   14.370256] pci 0000:ff:13.5: [8086:6fad] type 00 class 0x088000
  [   14.980481] pci 0000:bf:13.5: [8086:6fad] type 00 class 0x088000
  [   15.590646] pci 0000:7f:13.5: [8086:6fad] type 00 class 0x088000
  [   16.200498] pci 0000:3f:13.5: [8086:6fad] type 00 class 0x088000
  [   17.928243] pci 0001:ff:13.5: [8086:6fad] type 00 class 0x088000
  [   18.538876] pci 0001:bf:13.5: [8086:6fad] type 00 class 0x088000
  [   19.149211] pci 0001:7f:13.5: [8086:6fad] type 00 class 0x088000
  [   19.759431] pci 0001:3f:13.5: [8086:6fad] type 00 class 0x088000
  ...
  [   54.298058] EDAC sbridge: Duplicated device for 8086:6fad
  [   54.298062] EDAC sbridge: Failed to register device with error -19.

on a large system (see [1]).  sbridge_get_onedevice() appears to look
up devices by PCI bus number alone, ignoring the PCI domain (aka
segment) number, so I suspect it treats 0000:ff:13.5 and 0001:ff:13.5
as duplicates.

  sbridge_get_all_devices
    while (...)
      do
        sbridge_get_onedevice
          pdev = pci_get_device(...)
          sbridge_dev = get_sbridge_dev(pdev->bus->number, ...)
          if (sbridge_dev->pdev[sbridge_dev->i_devs])
            printk("Duplicated device ...")
            return -ENODEV                   # -19
      while (pdev ...)

It looks like 88ae80aa609c ("EDAC, skx_edac: Handle systems with
segmented PCI busses") fixes a similar problem; maybe that should
be applied elsewhere in EDAC as well?
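
To illustrate the suspected failure mode, here is a minimal user-space
sketch, not driver code.  The names struct pci_addr, struct socket_table,
record_device() and count_duplicates() are hypothetical and exist only
for this example.  It replays the eight devices from the log above and
counts how many are flagged as duplicates when the lookup key is the bus
number alone versus (domain, bus).  Unlike the driver, which returns
-ENODEV at the first duplicate, the sketch keeps going so the two cases
can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device's location; not a kernel type. */
struct pci_addr {
    uint16_t domain;   /* PCI domain (aka segment) */
    uint8_t  bus;      /* PCI bus number */
};

#define MAX_SOCKETS 16

/* Hypothetical table of locations already recorded. */
struct socket_table {
    struct pci_addr seen[MAX_SOCKETS];
    size_t count;
};

/*
 * Return 1 if addr collides with an entry already in the table,
 * otherwise record it and return 0.  With use_domain == 0 only the
 * bus number is compared, which models the suspected behavior.
 */
static int record_device(struct socket_table *t, struct pci_addr addr,
                         int use_domain)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->seen[i].bus != addr.bus)
            continue;
        if (!use_domain || t->seen[i].domain == addr.domain)
            return 1;   /* treated as a duplicate */
    }
    assert(t->count < MAX_SOCKETS);
    t->seen[t->count++] = addr;
    return 0;
}

/* Replay the eight xxxx:yy:13.5 devices from the log; count collisions. */
static int count_duplicates(int use_domain)
{
    static const uint8_t buses[] = { 0xff, 0xbf, 0x7f, 0x3f };
    struct socket_table t = { .count = 0 };
    int dups = 0;

    for (uint16_t domain = 0; domain < 2; domain++) {
        for (size_t i = 0; i < sizeof(buses) / sizeof(buses[0]); i++) {
            struct pci_addr a = { .domain = domain, .bus = buses[i] };
            dups += record_device(&t, a, use_domain);
        }
    }
    return dups;
}
```

Keyed on bus alone, every device in domain 0001 collides with its
counterpart in domain 0000; keyed on (domain, bus), none do.  In the
kernel the domain of a device is available via pci_domain_nr(pdev->bus),
so one possible fix would be to carry that value alongside the bus
number in the lookup.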

Why doesn't EDAC use the standard pci_register_driver() interface?
That would avoid issues like this.  It would also avoid the potential
conflict of another driver operating on the device at the same time.

[1] https://bugzilla.kernel.org/attachment.cgi?id=277759 (attachment
to unrelated bug https://bugzilla.kernel.org/show_bug.cgi?id=200765)
