Message-ID: <0fec2db0-fb56-615d-eed4-d702d1bc37fb@broadcom.com>
Date: Mon, 30 Mar 2020 10:04:35 -0700
From: Ray Jui <ray.jui@...adcom.com>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: Srinath Mannam <srinath.mannam@...adcom.com>,
Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
Florian Fainelli <f.fainelli@...il.com>,
Ray Jui <rjui@...adcom.com>,
Andrew Murray <andrew.murray@....com>,
bcm-kernel-feedback-list@...adcom.com, linux-pci@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
Bharat Gooty <bharat.gooty@...adcom.com>
Subject: Re: [PATCH 1/3] PCI: iproc: fix out of bound array access
On 3/26/2020 1:48 PM, Bjorn Helgaas wrote:
> On Thu, Mar 26, 2020 at 01:27:36PM -0700, Ray Jui wrote:
>> On 3/26/2020 12:48 PM, Bjorn Helgaas wrote:
>>> ...
>>> It's outside the scope of this patch, but I'm not really a fan of the
>>> pcie->reg_offsets[] scheme this driver uses to deal with these
>>> differences. There usually seems to be *something* that keeps the
>>> driver from referencing registers that don't exist, but it doesn't
>>> seem like the mechanism is very consistent or robust:
>>>
>>> - IPROC_PCIE_LINK_STATUS is implemented by PAXB but not PAXC.
>>> iproc_pcie_check_link() avoids using it if "ep_is_internal", which
>>> is set for PAXC and PAXC_V2. Not an obvious connection.
>>>
>>> - IPROC_PCIE_CLK_CTRL is implemented for PAXB and PAXC_V1, but not
>>> PAXC_V2. iproc_pcie_perst_ctrl() avoids using it if "ep_is_internal",
>>> so it *doesn't* use it for PAXC_V1, which does implement it.
>>> Maybe a bug, maybe intentional; I can't tell.
>>>
>>> - IPROC_PCIE_INTX_EN is only implemented by PAXB (not PAXC), but
>>> AFAICT, we always call iproc_pcie_enable() and rely on
>>> iproc_pcie_write_reg() to silently drop the write to it on PAXC.
>>>
>>> - IPROC_PCIE_OARR0 is implemented by PAXB and PAXB_V2 and used by
>>> iproc_pcie_map_ranges(), which is called if "need_ob_cfg", which
>>> is set if there's a "brcm,pcie-ob" DT property. No clear
>>> connection to PAXB.
>>>
>>> I think it would be more readable if we used a single variant
>>> identifier consistently, e.g., the "pcie->type" already used in
>>> iproc_pcie_msi_steer(), or maybe a set of variant-specific function
>>> pointers as pcie-qcom.c does.
>>
>> It is not possible to use a single variant identifier consistently,
>> i.e., 'pcie->type'. Many of these features are controller revision
>> specific, and certain revisions of the controllers may all have a
>> certain feature, while other revisions of the controllers do not. In
>> addition, there is overlap in features across different controllers.
>>
>> IMO, it makes sense to have feature specific flags or booleans, and have
>> those features enabled or disabled based on 'pcie->type', which is what
>> the current driver does, but as you pointed out, what the driver
>> fails to do is apply this consistently.
>
> There are several drivers that have the same problem of dealing with
> different revisions of hardware. It would be nice to do it in a
> consistent style, whatever that is.
>
Sure, agree with you that it should be handled in a consistent way
within this driver, and the current driver is not handling this
consistently.
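
As a rough illustration of the "feature flags derived from pcie->type"
idea discussed above, here is a minimal userspace sketch. It is not code
from the driver: every sketch_* name is hypothetical, the variant list is
reduced to the three names used in this thread, and the flag values only
encode what the bullet list above states about each register.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified variant list: only names used in this thread (assumption). */
enum sketch_variant {
	SKETCH_PAXB,
	SKETCH_PAXC,		/* referred to as PAXC_V1 above */
	SKETCH_PAXC_V2,
	SKETCH_NUM_VARIANTS,
};

/* Hypothetical flags, one per optional register discussed above. */
struct sketch_features {
	bool has_link_status;	/* IPROC_PCIE_LINK_STATUS */
	bool has_clk_ctrl;	/* IPROC_PCIE_CLK_CTRL */
	bool has_intx_en;	/* IPROC_PCIE_INTX_EN */
};

/* One row per variant, so every feature is decided in a single place. */
static const struct sketch_features sketch_feature_table[SKETCH_NUM_VARIANTS] = {
	[SKETCH_PAXB] = {
		.has_link_status = true,
		.has_clk_ctrl = true,
		.has_intx_en = true,
	},
	[SKETCH_PAXC] = {
		.has_link_status = false,
		.has_clk_ctrl = true,
		.has_intx_en = false,
	},
	[SKETCH_PAXC_V2] = {
		.has_link_status = false,
		.has_clk_ctrl = false,
		.has_intx_en = false,
	},
};

/* Look up the feature set for a variant; NULL for an unknown variant. */
static const struct sketch_features *sketch_features_for(enum sketch_variant v)
{
	if ((unsigned int)v >= SKETCH_NUM_VARIANTS)
		return NULL;
	return &sketch_feature_table[v];
}
```

With something along these lines, a caller would check has_intx_en before
touching IPROC_PCIE_INTX_EN, rather than depending on "ep_is_internal" or
on the write being dropped.
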
>> The IPROC_PCIE_INTX_EN example you pointed out is a good example. I
>> agree with you that we shouldn't rely on iproc_pcie_write_reg to
>> silently drop the operation for PAXC. We should add code to make it
>> explicitly obvious that legacy interrupts are not supported on any PAXC
>> controller.
>>
>> The pcie->reg_offsets[] scheme was not intended to be used to silently
>> drop register accesses that are activated based on features. It's a
>> mistake that should be fixed if some code in the driver is done that
>> way, as you pointed out.
>
> That's actually why I dug into this a bit -- the
> iproc_pcie_reg_is_invalid() case is really a design-time error, so it
> seemed like there should be a WARN() there instead of silently
> returning 0 or ignoring a write.
>
I think 'iproc_pcie_reg_is_invalid' is a fallback protection. We should
aim to prevent this from happening in the first place using whatever
means we determine appropriate, and do that consistently. In addition,
I also agree with you that there should be a WARN instead of silently
returning zero (for reads) and dropping the writes.
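
To make the "warn instead of silently dropping" behavior concrete, here
is a minimal userspace sketch, assuming a per-variant offset table with a
sentinel for unimplemented registers. It is not the driver's code: the
sketch_* names, the offsets, and the array standing in for MMIO space are
all made up for illustration, and a warning counter plus fprintf() stand
in for the kernel's WARN().

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_REG_INVALID 0xffff	/* sentinel: register not implemented */

enum sketch_reg {
	SKETCH_REG_CLK_CTRL,
	SKETCH_REG_INTX_EN,
	SKETCH_REG_MAX,
};

/* Made-up offsets for a variant that implements both registers. */
static const uint16_t sketch_paxb_offsets[SKETCH_REG_MAX] = {
	[SKETCH_REG_CLK_CTRL] = 0x00,
	[SKETCH_REG_INTX_EN] = 0x04,
};

/* A variant that implements neither register. */
static const uint16_t sketch_paxc_offsets[SKETCH_REG_MAX] = {
	[SKETCH_REG_CLK_CTRL] = SKETCH_REG_INVALID,
	[SKETCH_REG_INTX_EN] = SKETCH_REG_INVALID,
};

struct sketch_pcie {
	const uint16_t *reg_offsets;
	uint32_t regs[16];		/* stand-in for MMIO, index = offset / 4 */
	unsigned int warn_count;	/* stand-in for WARN() firing */
};

static bool sketch_reg_is_invalid(uint16_t offset)
{
	return offset == SKETCH_REG_INVALID;
}

/* Report the design-time error loudly instead of hiding it. */
static void sketch_warn(struct sketch_pcie *pcie, const char *op,
			enum sketch_reg reg)
{
	pcie->warn_count++;
	fprintf(stderr, "WARN: %s of unimplemented register %d\n",
		op, (int)reg);
}

static uint32_t sketch_read_reg(struct sketch_pcie *pcie, enum sketch_reg reg)
{
	uint16_t offset = pcie->reg_offsets[reg];

	if (sketch_reg_is_invalid(offset)) {
		sketch_warn(pcie, "read", reg);
		return 0;	/* still fall back to 0, but no longer silently */
	}
	return pcie->regs[offset / 4];
}

static void sketch_write_reg(struct sketch_pcie *pcie, enum sketch_reg reg,
			     uint32_t val)
{
	uint16_t offset = pcie->reg_offsets[reg];

	if (sketch_reg_is_invalid(offset)) {
		sketch_warn(pcie, "write", reg);
		return;		/* the write is dropped, but it is reported */
	}
	pcie->regs[offset / 4] = val;
}
```

The fallback still returns 0 for reads and drops writes, so behavior is
unchanged for callers; the only difference is that reaching it becomes
visible.
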
We'll be looking into improving this as you suggested when we have a
chance. In the meantime, I think both of us agree this is out of the
scope of the issue that this patch is trying to fix, which is actually a
pretty critical issue that can potentially corrupt memory, so the fix
should be picked up ASAP (and for older LTS kernels too).
Thanks,
Ray
> Bjorn
>