linux-kernel - Re: [v6,3/4] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1609209225.14736.192.camel@mhfsdcap03>
Date:   Tue, 29 Dec 2020 10:33:45 +0800
From:   Jianjun Wang <jianjun.wang@...iatek.com>
To:     Marc Zyngier <maz@...nel.org>
CC:     <youlin.pei@...iatek.com>, <devicetree@...r.kernel.org>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        <qizhong.cheng@...iatek.com>, <chuanjia.liu@...iatek.com>,
        <linux-pci@...r.kernel.org>, Ryder Lee <ryder.lee@...iatek.com>,
        Rob Herring <robh+dt@...nel.org>,
        <linux-kernel@...r.kernel.org>, <sin_jieyang@...iatek.com>,
        Sj Huang <sj.huang@...iatek.com>, <drinkcat@...omium.org>,
        <linux-mediatek@...ts.infradead.org>,
        Philipp Zabel <p.zabel@...gutronix.de>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        <linux-arm-kernel@...ts.infradead.org>,
        Matthias Brugger <matthias.bgg@...il.com>
Subject: Re: [v6,3/4] PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192

On Mon, 2020-12-28 at 15:12 +0000, Marc Zyngier wrote:
> On Mon, 28 Dec 2020 12:01:57 +0000,
> Jianjun Wang <jianjun.wang@...iatek.com> wrote:
> > 
> > On Fri, 2020-12-25 at 19:22 +0000, Marc Zyngier wrote:
> 
> Dropped	<Project_Global_Chrome_Upstream_Group@...iatek.com>, as it
> bounces:
> 
> <Project_Global_Chrome_Upstream_Group@...iatek.com>: host
>     mailgw01.mediatek.com[216.200.240.184] said: 550 Relaying mail to
>     project_global_chrome_upstream_group@...iatek.com is not allowed (in reply
>     to RCPT TO command)
> 
> Please make sure you don't Cc email addresses that cannot accept email
> form the outside world.

Apology, they are team members of mt8192 project, who want to know the
status of this patch. I will replace with the specified email address at
next time, instead of this group address.

> [...]
> 
> > > > +/**
> > > > + * struct mtk_pcie_msi - MSI information for each set
> > > > + * @base: IO mapped register base
> > > > + * @irq: MSI set Interrupt number
> > > > + * @index: MSI set number
> > > > + * @msg_addr: MSI message address
> > > > + * @domain: IRQ domain
> > > > + */
> > > > +struct mtk_pcie_msi {
> > > > +	void __iomem *base;
> > > > +	unsigned int irq;
> > > > +	int index;
> > > > +	phys_addr_t msg_addr;
> > > > +	struct irq_domain *domain;
> > > > +};
> > > 
> > > This looks odd. You seem to say that this covers a set if MSIs, and
> > > yet the irq field here clearly isn't part of a set. Is that per MSI
> > > instead? Either way, something is not quite as it should be.
> > > 
> > 
> > Appreciate all these comments, please allow me to explain the MSI
> > interrupt design in this HW.
> > 
> > The HW design of MSI interrupts will be like the following:
> > 
> >                   +-----+
> >                   | GIC |
> >                   +-----+
> >                      ^
> >                      |
> >                      |[port->irq]
> >                      |
> >              +-+-+-+-+-+-+-+-+
> >              |0|1|2|3|4|5|6|7|[PCIe intc]
> >              +-+-+-+-+-+-+-+-+
> >               ^ ^           ^
> >               | |    ...    |[msi_info->irq]
> >       +-------+ +------+    +-----------+
> >       |                |                |
> > +-+-+---+--+--+  +-+-+---+--+--+  +-+-+---+--+--+
> > |0|1|...|30|31|  |0|1|...|30|31|  |0|1|...|30|31|[MSI sets]
> > +-+-+---+--+--+  +-+-+---+--+--+  +-+-+---+--+--+
> >  ^ ^      ^  ^    ^ ^      ^  ^    ^ ^      ^  ^
> >  | |      |  |    | |      |  |    | |      |  | [MSI irq]
> >  | |      |  |    | |      |  |    | |      |  |
> > 
> >   (MSI SET0)       (MSI SET1)  ...   (MSI SET7)
> > 
> 
> Thanks, that's really helpful. You should consider adding this to the
> driver code, as part of the documentation.

Sure, I will add this to the driver documentation in the next version.

> 
> > In software parts, the port->msi_top_domain is created to maintains 8
> > MSI IRQs from the PCIe intc layer, its hardware IRQ will be mapped to
> > msi_info->irq by irq_create_mapping.
> > 
> > The port->msi_domain contains 256 MSI IRQs in total, it consist of 8 MSI
> > sets, and each MSI set contains 32 MSI IRQs.
> > 
> > The structure of mtk_pcie_msi is used to describe the MSI set, I think
> > it will be more convenient and comply with the HW design when use this
> > structure, we can get the information of MSI set directly, instead of
> > calculated by port->base.
> > 
> > When a MSI interrupt is received, the interrupt handle flow will be like
> > the following:
> > 
> >               mtk_pcie_irq_handler (port->irq)
> >                         |
> >                         |(find mapping in msi_top_domain)
> >                         |
> >                         v
> >                 mtk_pcie_msi_handler (msi_info->irq)
> >                         |
> >                         |(find mapping in msi_domain)
> >                         |
> >                         v
> >                  handle_edge_irq  (MSI irq)
> >                         |
> >                         |
> >                         v
> >             (dispatch to device handler)
> > 
> > Yes, I had to admit that it's not a quite good solution of irqdomains,
> > since the local irq domain is partial coupled with the standard PCI MSI
> > irqdomain.
> > 
> > Should I need to create another irqdomain to maintain the MSI sets
> > layer, and set it as the parent domain of PCI MSI domain? I really need
> > your suggestions, thanks a lot.
> 
> <rant>
> My first suggestion would be to go back to whoever put this HW block
> together and make sure they never write a line of RTL ever again.
> Programming using copy/paste is bad enough, but doing the same thing
> with HW is just terrible.
> </rant>
> 
> Ideally, you would model the whole thing as 8 distinct MSI
> controllers, as that is effectively what they are. Evidently, this
> would cause all kind of issues when associating PCI devices with their
> MSI domain, so let's not do that.
> 
> So what we want is to have a single MSI domain (with 256 entries,
> hence covering all sets), and compute the hwirq from the level below,
> depending on the interrupt PCIe interrupt, and the index of the MSI in
> the status register for this PCIe interrupt. I think you have most of
> the information already, you just need to simplify the way you keep
> track of it, and maybe come up with better names for the various
> blocks.
> 
> I expect you would need something like this:
> 
> struct mtk_msi_set {
> 	void __iomem		*base;
> 	phys_addr_t		msg_addr;
> };
> 
> struct mtk_pcie_port {
> [...]
> 	int 			base_irq;
> 	struct mtk_msi_set	msi_sets[8];
> [...]
> };
> 
> The assumption is that you can build the bottom MSI domain hwirq as:
> 
>     hwirq = set_idx * 32 + set_status_bit;
> 
> From that, you can directly go back from a hwirq to a set, and index
> the msi_sets[] array to find the relevant information.
> 
> Another assumption is that all 8 cascade interrupts are contiguous
> (which is pretty easy to guarantee), meaning that you can directly use
> the PCIe interrupt to compute the set number.
> 
> With all that, you can directly keep a pointer from the bottom MSI
> domain irq_data structures to the mtk_pcie_port structure, providing
> you with all the information you need.
> 
> Once you will have moved the various bits at the right place, and used
> the established MSI abstractions, things should just work.

Thanks for your quick reply, I will make the changes as this suggestion
in the next version.
> 
> [...]
> 
> > > > +static void mtk_intx_eoi(struct irq_data *data)
> > > > +{
> > > > +	struct mtk_pcie_port *port = irq_data_get_irq_chip_data(data);
> > > > +	unsigned long hwirq;
> > > > +
> > > > +	/**
> > > > +	 * As an emulated level IRQ, its interrupt status will remain
> > > > +	 * until the corresponding de-assert message is received; hence that
> > > > +	 * the status can only be cleared when the interrupt has been serviced.
> > > > +	 */
> > > > +	hwirq = data->hwirq + PCIE_INTX_SHIFT;
> > > > +	writel(BIT(hwirq), port->base + PCIE_INT_STATUS_REG);
> > > 
> > > All of this is the description of a level interrupt, so this is pretty
> > > much devoid of any information as to *why* you need to write to clear
> > > this bit. What happens if the interrupt is still asserted because
> > > nothing has handled it? Without any further information, this looks
> > > terribly wrong.
> > 
> > Sorry, this comment should be used to describe the mtk_intx_eoi
> > function, it cause misunderstandings at this place. I just want to add
> > the comment to explain that why this interrupt needs to be acked at the
> > end of interrupt. I will move it to the front of mtk_intx_eoi in the
> > next version.
> 
> Is it an ack or an eoi? These are two different things, and really
> depends on your HW. Please find out which one it is, and we will work
> out a way to handle it.

Actually, it's an ack which used to clear the interrupt status.

When we receive an INTx assert message from the EP device, the
corresponding bit of PCIE_INT_STATUS_REG will be set, and it can not be
cleared until the EP device handler clear its device status and then we
receives the INTx de-assert message.

So I set handle_fasteoi_irq as its irq handler, and try to use irq_eoi
callback to make sure it will be called after device handler.

[snip]
> +static int mtk_pcie_intx_map(struct irq_domain *domain, unsigned int
irq,
> +                          irq_hw_number_t hwirq)
> +{
> +     irq_set_chip_and_handler_name(irq, &mtk_intx_irq_chip,
> +                                   handle_fasteoi_irq, "INTx");
> +     irq_set_chip_data(irq, domain->host_data);
> +
> +     return 0;
> +}
> +
> +static const struct irq_domain_ops intx_domain_ops = {
> +     .map = mtk_pcie_intx_map,
> +};

> 
> Thanks,
> 
> 	M.
>