Message-ID: <20250815142258.GA377110@bhelgaas>
Date: Fri, 15 Aug 2025 09:22:58 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: "He, Rui" <Rui.He@...driver.com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Chikhalkar, Prashant" <Prashant.Chikhalkar@...driver.com>,
"Xiao, Jiguang" <Jiguang.Xiao@...driver.com>
Subject: Re: [PATCH 1/1] pci: Add subordinate check before pci_add_new_bus()

On Fri, Aug 15, 2025 at 02:31:31AM +0000, He, Rui wrote:
> > -----Original Message-----
> > From: Bjorn Helgaas <helgaas@...nel.org>
> > Sent: August 15, 2025 4:36
> > To: He, Rui <Rui.He@...driver.com>
> > Cc: Bjorn Helgaas <bhelgaas@...gle.com>; linux-pci@...r.kernel.org;
> > linux-kernel@...r.kernel.org; Chikhalkar, Prashant
> > <Prashant.Chikhalkar@...driver.com>; Xiao, Jiguang
> > <Jiguang.Xiao@...driver.com>
> > Subject: Re: [PATCH 1/1] pci: Add subordinate check before
> > pci_add_new_bus()
> >
> > On Thu, Aug 14, 2025 at 05:39:37PM +0800, Rui He wrote:
> > > For a preconfigured PCI bridge, the child bus is created on the
> > > first scan. If, for some reason (e.g. register mutation), the
> > > secondary and subordinate registers are reset to 0 by the second
> > > scan, a PCI bus is created twice for the same PCI device.
> >
> > I don't quite follow this. Do you mean something is changing the
> > bridge configuration between the first and second scans?
>
> I'm not sure what changed the bridge configuration, but the
> secondary and subordinate bus numbers were indeed 0 on the second
> scan, since [bus 0e-10] was created for 0000:0b:01.0.
>
> In my opinion, it might be a communication error or a register
> mutation in the PCI bridge.
>
> > > Following is the related log:
> > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0d]
> > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
> > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: PCI bridge to [bus 0f-10]
> > > Here PCI device 0000:0b:01.0 is assigned to both bus 0d and bus 0e.
> >
> > It looks like the [bus 0f-10] range is assigned to both bridges
> > (0b:01.0 and 0b:05.0), which would definitely be a problem.
> >
> > I'm surprised that we haven't tripped over this before, and I'm
> > curious about how we got here. Can you set
> > CONFIG_DYNAMIC_DEBUG=y, boot with the dyndbg="file drivers/pci/*
> > +p" kernel parameter, and collect the complete dmesg log?
>
> Sorry, this was an isolated occurrence that we cannot reproduce, so
> I cannot offer more detailed logs.

Do you have the complete dmesg log from this one time you saw the
problem?

As-is, I don't think there's quite enough here to move forward with
this. I think we need some more detailed analysis to figure out how
this happens.

Bjorn
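
For illustration only (this is not part of the patch or of any kernel
tool): a minimal Python sketch, assuming dmesg lines in the format
quoted above, that flags the two symptoms visible in that log. One is
a bridge reported with two different bus ranges; the other is two
bridges on the same primary bus whose [secondary-subordinate] ranges
overlap. The helper names parse_bridges, parent_bus, and
find_conflicts are hypothetical.

```python
import re
from itertools import combinations

# Sample taken from the log quoted in this thread.
LOG = """\
[Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0d]
[Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
[Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: PCI bridge to [bus 0f-10]
"""

# Matches "pci 0000:0b:01.0: PCI bridge to [bus 0e-10]" (or "[bus 0d]").
BRIDGE_RE = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"PCI bridge to \[bus (?P<sec>[0-9a-f]{2})(?:-(?P<sub>[0-9a-f]{2}))?\]"
)


def parse_bridges(log_text):
    """Return (device, secondary, subordinate) tuples in log order."""
    bridges = []
    for m in BRIDGE_RE.finditer(log_text):
        sec = int(m.group("sec"), 16)
        # "[bus 0d]" means secondary == subordinate.
        sub = int(m.group("sub"), 16) if m.group("sub") else sec
        bridges.append((m.group("dev"), sec, sub))
    return bridges


def parent_bus(dev):
    """'0000:0b:01.0' -> '0000:0b' (domain and primary bus)."""
    return dev.rsplit(":", 1)[0]


def find_conflicts(bridges):
    """Return (reassigned, overlaps) found among the parsed bridges."""
    reassigned, overlaps = [], []
    for a, b in combinations(bridges, 2):
        dev_a, sec_a, sub_a = a
        dev_b, sec_b, sub_b = b
        if dev_a == dev_b:
            # Same bridge reported with two different bus ranges.
            if (sec_a, sub_a) != (sec_b, sub_b):
                reassigned.append((dev_a, (sec_a, sub_a), (sec_b, sub_b)))
        elif parent_bus(dev_a) == parent_bus(dev_b):
            # Sibling bridges must have disjoint bus ranges.
            if sec_a <= sub_b and sec_b <= sub_a:
                overlaps.append((a, b))
    return reassigned, overlaps


if __name__ == "__main__":
    reassigned, overlaps = find_conflicts(parse_bridges(LOG))
    for dev, old, new in reassigned:
        print(f"{dev}: range changed from {old[0]:02x}-{old[1]:02x} "
              f"to {new[0]:02x}-{new[1]:02x}")
    for (dev_a, sa, ea), (dev_b, sb, eb) in overlaps:
        print(f"{dev_a} [bus {sa:02x}-{ea:02x}] overlaps "
              f"{dev_b} [bus {sb:02x}-{eb:02x}]")
```

The sibling check only compares bridges that share a primary bus,
because a downstream bridge's range is expected to sit inside its
parent's range. A script like this only describes what a log shows; it
does not explain how the registers came to read as 0.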