Message-ID: <CAKddAkC11Hr=5jVpYvoMd2FeKvULn8mHcuey3=0DGEjAf7Q+8g@mail.gmail.com>
Date: Fri, 15 Aug 2025 15:33:57 +0800
From: Nick Hu <nick.hu@...ive.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: anup@...infault.org, Alexandre Ghiti <alex@...ti.fr>, linux-riscv@...ts.infradead.org, 
	linux-kernel@...r.kernel.org, Paul Walmsley <paul.walmsley@...ive.com>, 
	Palmer Dabbelt <palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>
Subject: Re: [PATCH v2 2/2] irqchip/riscv-aplic: Save and restore APLIC registers

On Wed, Aug 6, 2025 at 8:52 PM Thomas Gleixner <tglx@...utronix.de> wrote:
>
> On Wed, Aug 06 2025 at 16:27, Nick Hu wrote:
> > The APLIC may be powered down when the CPUs enter a deep sleep state.
> > Therefore adding the APLIC save and restore functions to save and
> > restore the states of APLIC.
>
> Same comments vs. subject and change log.
>
> >
> > +void aplic_direct_restore(struct aplic_priv *priv)
> > +{
> > +     struct aplic_direct *direct =
> > +                     container_of(priv, struct aplic_direct, priv);
>
> No line break required.
>
> > +     int cpu;
> > +
> > +     for_each_cpu(cpu, &direct->lmask)
> > +             aplic_idc_set_delivery(per_cpu_ptr(&aplic_idcs, cpu), true);
> > +static LIST_HEAD(aplics);
> > +
> > +static void aplic_restore(struct aplic_priv *priv)
> > +{
> > +     int i;
> > +     u32 clrip;
>
> See the documentation I linked you to and look for the chapter about
> variable declarations.
>
> > +     writel(priv->saved_domaincfg, priv->regs + APLIC_DOMAINCFG);
> > +#ifdef CONFIG_RISCV_M_MODE
> > +     writel(priv->saved_msiaddr, priv->regs + APLIC_xMSICFGADDR);
> > +     writel(priv->saved_msiaddrh, priv->regs + APLIC_xMSICFGADDRH);
> > +#endif
> > +     for (i = 1; i <= priv->nr_irqs; i++) {
> > +             writel(priv->saved_sourcecfg[i - 1],
> > +                    priv->regs + APLIC_SOURCECFG_BASE +
> > +                    (i - 1) * sizeof(u32));
> > +             writel(priv->saved_target[i - 1],
> > +                    priv->regs + APLIC_TARGET_BASE +
> > +                    (i - 1) * sizeof(u32));
>
> Sorry, but this is incomprehensible garbage.
>
> Why are you starting with i = 1 when you subtract 1 from i at every
> usage site?
>
The AIA specification states, “The number zero is not a valid interrupt
identity number at an APLIC,” so starting with i = 1 up to
`priv->nr_irqs` seemed intuitive to me. However, you are right: if I
start from 0, the index doesn't need to be offset by 1 at every usage
site.
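Something like this (just a sketch; interrupt identity i + 1 then maps
to offset i * sizeof(u32) past each register base):

        for (i = 0; i < priv->nr_irqs; i++) {
                writel(priv->saved_sourcecfg[i],
                       priv->regs + APLIC_SOURCECFG_BASE + i * sizeof(u32));
                writel(priv->saved_target[i],
                       priv->regs + APLIC_TARGET_BASE + i * sizeof(u32));
        }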

>         u32 __iomem *regs = priv->regs;
>         for (i = 0; i < priv->nr_irqs; i++, regs++) {
>                 writel(priv->saved_sourcecfg[i], regs + APLIC_SOURCECFG_BASE);
>                 writel(priv->saved_target[i], regs + APLIC_TARGET_BASE);
>         }
>
> See?
>
> > +     }
> > +
> > +     for (i = 0; i <= priv->nr_irqs; i += 32) {
>
> Off by one. This needs to be i < priv->nr_irqs, no?
>
We need to preserve bits 1 through priv->nr_irqs of the SETIE
register, so this should be i <= priv->nr_irqs.
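For example, with priv->nr_irqs = 32 the valid identities are 1..32,
and identity 32 sits in bit 0 of the second 32-bit IE word, so both
words must be restored: i <= priv->nr_irqs visits i = 0 and i = 32,
while i < priv->nr_irqs would stop after the first word.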

> > +             writel(-1U, priv->regs + APLIC_CLRIE_BASE +
> > +                         (i / 32) * sizeof(u32));
> > +             writel(priv->saved_ie[i / 32],
> > +                    priv->regs + APLIC_SETIE_BASE +
> > +                    (i / 32) * sizeof(u32));
> > +     }
> > +
> > +     if (priv->nr_idcs) {
> > +             aplic_direct_restore(priv);
> > +     } else {
> > +             /* Re-trigger the interrupts */
> > +             for (i = 0; i <= priv->nr_irqs; i += 32) {
>
> Same here
>
> > +                     clrip = readl(priv->regs + APLIC_CLRIP_BASE +
> > +                                   (i / 32) * sizeof(u32));
> > +                     writel(clrip, priv->regs + APLIC_SETIP_BASE +
> > +                                   (i / 32) * sizeof(u32));
> > +             }
>
> And this can properly be combined into:
>
>         u32 __iomem *regs = priv->regs;
>
>         for (i = 0; i < priv->nr_irqs; i += 32, regs++) {
>                 writel(-1U, regs + APLIC_CLRIE_BASE);
>                 writel(priv->saved_ie[i / 32], regs + APLIC_SETIE_BASE);
>
>                 if (!priv->nr_idcs)
>                         writel(readl(regs + APLIC_CLRIP_BASE), regs + APLIC_SETIP_BASE);
>         }
>
>         if (priv->nr_idcs)
>                 aplic_direct_restore(priv);
>
> No?
>
I'll update it in the next version.

> > +}
> > +
> > +static void aplic_save(struct aplic_priv *priv)
> > +{
> > +     int i;
> > +
> > +     for (i = 1; i <= priv->nr_irqs; i++) {
> > +             priv->saved_target[i - 1] = readl(priv->regs +
> > +                                               APLIC_TARGET_BASE +
> > +                                               (i - 1) * sizeof(u32));
> > +     }
>
> Oh well...
>
> > +
> > +     for (i = 0; i <= priv->nr_irqs; i += 32) {
>
> Off by one again ...
>
> > +             priv->saved_ie[i / 32] = readl(priv->regs +
> > +                                            APLIC_SETIE_BASE +
> > +                                            (i / 32) * sizeof(u32));
> > +     }
> > +}
> > +
> > +static int aplic_syscore_suspend(void)
> > +{
> > +     struct aplic_priv *priv;
> > +
> > +     list_for_each_entry(priv, &aplics, head) {
> > +             aplic_save(priv);
> > +     }
>
> Superfluous brackets
>
> > +
> > +     return 0;
> > +}
> > +
> > +static void aplic_syscore_resume(void)
> > +{
> > +     struct aplic_priv *priv;
> > +
> > +     list_for_each_entry(priv, &aplics, head) {
> > +             aplic_restore(priv);
> > +     }
>
> Ditto
>
> > +}
> > +
> > +static struct syscore_ops aplic_syscore_ops = {
> > +     .suspend = aplic_syscore_suspend,
> > +     .resume = aplic_syscore_resume,
>
> See documentation about struct definitions and initializers
>
> > +};
> > +
> > +static int aplic_pm_notifier(struct notifier_block *nb, unsigned long action, void *data)
> > +{
> > +     struct aplic_priv *priv = container_of(nb, struct aplic_priv, genpd_nb);
> > +
> > +     switch (action) {
> > +     case GENPD_NOTIFY_PRE_OFF:
> > +             aplic_save(priv);
> > +             break;
> > +     case GENPD_NOTIFY_ON:
> > +             aplic_restore(priv);
> > +             break;
> > +     default:
> > +             break;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +static void aplic_remove(void *data)
> > +{
> > +     struct aplic_priv *priv = data;
> > +
> > +     list_del(&priv->head);
>
> How is this list operation serialized?
>
I'll add a lock to serialize against binding/unbinding the driver at
runtime. Thanks.
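For example (only a sketch; aplics_lock is a name I'd introduce):

        static DEFINE_MUTEX(aplics_lock);

        static void aplic_remove(void *data)
        {
                struct aplic_priv *priv = data;

                mutex_lock(&aplics_lock);
                list_del(&priv->head);
                mutex_unlock(&aplics_lock);

                dev_pm_genpd_remove_notifier(priv->dev);
        }

with the matching mutex_lock()/mutex_unlock() around list_add() in
aplic_add().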

> > +     dev_pm_genpd_remove_notifier(priv->dev);
> > +}
> > +
> > +static int aplic_add(struct device *dev, struct aplic_priv *priv)
> > +{
> > +     int ret;
> > +
> > +     list_add(&priv->head, &aplics);
> > +     /* Add genpd notifier */
> > +     priv->genpd_nb.notifier_call = aplic_pm_notifier;
> > +     ret = dev_pm_genpd_add_notifier(dev, &priv->genpd_nb);
> > +     if (ret && ret != -ENODEV && ret != -EOPNOTSUPP) {
>
> What is this magic about? Lacks explanation and rationale.
>
I'll add an explanation: it is fine when the platform doesn't enable
CONFIG_PM_GENERIC_DOMAINS or doesn't define the `power-domains`
property in the APLIC DT node.
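Concretely, I'd put a comment above the check, roughly like this (my
reading of the two error codes; wording to be polished):

        ret = dev_pm_genpd_add_notifier(dev, &priv->genpd_nb);
        /*
         * -EOPNOTSUPP: CONFIG_PM_GENERIC_DOMAINS is disabled.
         * -ENODEV: the APLIC node has no "power-domains" property,
         * so there is no generic power domain to attach a notifier
         * to. Both are fine: the APLIC is then not in a power domain
         * that can be powered off at runtime, and its state is still
         * saved/restored by the syscore ops across system suspend.
         */
        if (ret && ret != -ENODEV && ret != -EOPNOTSUPP) {
                list_del(&priv->head);
                return ret;
        }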

> > +             list_del(&priv->head);
> > +             return ret;
>
> ...
>
> > +     /* For power management */
> > +     priv->saved_target = devm_kzalloc(dev, priv->nr_irqs * sizeof(u32),
> > +                                       GFP_KERNEL);
> > +     if (!priv->saved_target)
> > +             return -ENOMEM;
> > +
> > +     priv->saved_sourcecfg = devm_kzalloc(dev, priv->nr_irqs * sizeof(u32),
> > +                                          GFP_KERNEL);
> > +     if (!priv->saved_sourcecfg)
> > +             return -ENOMEM;
> > +
> > +     priv->saved_ie = devm_kzalloc(dev,
> > +                                   DIV_ROUND_UP(priv->nr_irqs, 32) * sizeof(u32),
> > +                                   GFP_KERNEL);
> > +     if (!priv->saved_ie)
> > +             return -ENOMEM;
>
> Seriously? You allocate all of this separately? Ever heard about the
> concept of data structures?
>
> struct savedregs {
>         u32     target;
>         u32     source;
>         u32     ie;
> };
>
>         priv->savedregs = devm_kcalloc(dev, priv->nr_irqs, sizeof(*priv->savedregs), GFP_KERNEL);
>
> Yes, I know you don't need as many ie registers, but that's not a
> problem at all. You just need to write/read to/from index 0, 32, 64 ...
>
> And your restore muck boils down to a single loop:
>
>         u32 __iomem *regs = priv->regs;
>         for (i = 0; i < priv->nr_irqs; i++, regs++) {
>                 writel(priv->savedregs->source[i], regs + APLIC_SOURCECFG_BASE);
>                 writel(priv->savedregs->target[i], regs + APLIC_TARGET_BASE);
>
>                 if (i % 32)
>                         continue;
>
>                 writel(-1U, regs + APLIC_CLRIE_BASE);
>                 writel(priv->saved_ie[i], regs + APLIC_SETIE_BASE);
>                 if (!priv->nr_idcs)
>                 writel(readl(regs + APLIC_CLRIP_BASE), regs + APLIC_SETIP_BASE);
>         }
>
> That's too comprehensible, right?
>
I can update it to:

        u32 __iomem *regs = priv->regs;

        for (i = 0; i <= priv->nr_irqs; i++) {
                if (i < priv->nr_irqs) {
                        writel(priv->savedregs->source[i], regs + APLIC_SOURCECFG_BASE + i);
                        writel(priv->savedregs->target[i], regs + APLIC_TARGET_BASE + i);
                }

                if (i % 32)
                        continue;

                writel(-1U, regs + APLIC_CLRIE_BASE + i / 32);
                writel(priv->savedregs->ie[i], regs + APLIC_SETIE_BASE + i / 32);
                if (!priv->nr_idcs)
                        writel(readl(regs + APLIC_CLRIP_BASE + i / 32),
                               regs + APLIC_SETIP_BASE + i / 32);
        }

> >  struct aplic_priv {
> > +     struct list_head        head;
>
> Why needs this to be at the top?
>
> There is no reason at all. Not that it matters much as this data
> structure is a trainwreck vs. cacheline efficiency anyway.
>
Does it make sense to you if I put the hot data together? Like:
        struct aplic_source_regs {
                u32                     *target;
                u32                     *sourcecfg;
                u32                     *ie;
        };

        struct aplic_saved_regs {
                u32                     domaincfg;
        #ifdef CONFIG_RISCV_M_MODE
                u32                     msiaddr;
                u32                     msiaddrh;
        #endif
                struct aplic_source_regs    *srcs;
        };

        struct aplic_priv {
                struct list_head        head;
                struct aplic_saved_regs saved_regs;
                void __iomem            *regs;
                u32                     nr_irqs;
                u32                     nr_idcs;
                u32                     gsi_base;
                u32                     acpi_aplic_id;
                struct device           *dev;
                struct aplic_msicfg     msicfg;
                struct notifier_block   genpd_nb;
        };

Or any other suggestion?

> >       struct device           *dev;
> >       u32                     gsi_base;
> >       u32                     nr_irqs;
> > @@ -31,6 +32,15 @@ struct aplic_priv {
> >       u32                     acpi_aplic_id;
> >       void __iomem            *regs;
> >       struct aplic_msicfg     msicfg;
> > +     struct notifier_block   genpd_nb;
> > +     u32 *saved_target;
>
> So you aligned @head and @genpd_nb, but then you use random formatting
> to turn this into visual clutter.
>
Sorry about that.

> I'm amazed that this lot has received three Reviewed-by tags and nobody
> found even the slightest issue with it. Oh well.
>
> Thanks,
>
>         tglx
>
