linux-kernel - Re: [PATCH v9 08/26] remoteproc: k3-r5: Refactor sequential core power up/down operations

Message-ID: <d41e2717-5bf4-4a7e-92fa-705836702d8f@ti.com>
Date: Tue, 8 Apr 2025 14:11:03 +0530
From: Beleswar Prasad Padhi <b-padhi@...com>
To: Andrew Davis <afd@...com>, <andersson@...nel.org>,
        <mathieu.poirier@...aro.org>
CC: <hnagalla@...com>, <u-kumar1@...com>, <jm@...com>,
        <jan.kiszka@...mens.com>, <christophe.jaillet@...adoo.fr>,
        <jkangas@...hat.com>, <eballetbo@...hat.com>,
        <linux-remoteproc@...r.kernel.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v9 08/26] remoteproc: k3-r5: Refactor sequential core
 power up/down operations

Hi Andrew,

On 07/04/25 19:15, Andrew Davis wrote:
> On 3/17/25 7:06 AM, Beleswar Padhi wrote:
>> The existing implementation of the waiting mechanism in
>> "k3_r5_cluster_rproc_init()" waits for the "released_from_reset" flag to
>> be set as part of the firmware boot process in "k3_r5_rproc_start()".
>> The "k3_r5_cluster_rproc_init()" function is invoked in the probe
>> routine which causes unexpected failures in cases where the firmware is
>> unavailable at boot time, resulting in probe failure and removal of the
>> remoteproc handles in the sysfs paths.
>>
>> To address this, the waiting mechanism is refactored out of the probe
>> routine into the appropriate "k3_r5_rproc_{prepare/unprepare}()"
>> functions. This allows the probe routine to complete without depending
>> on firmware booting, while still maintaining the required
>> power-synchronization between cores.
>>
>> Further, this wait mechanism is dropped from
>> "k3_r5_rproc_{start/stop}()" functions as they deal with Core Run/Halt
>> operations, and as such, there is no constraint in Running or Halting
>> the cores of a cluster in order.
>>
>> Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1")
>> Signed-off-by: Beleswar Padhi <b-padhi@...com>
>> ---
>
> Same as the above two patches in this series, these are all valid fixes,
> but should be done first before the refactoring begins, so move them to
> the start of the series.


Thanks, I will incorporate those changes in the next revision. Please let
me know once you have finished reviewing this patchset, so that I can
re-spin v10.

Thanks,
Beleswar

>
> Andrew
>
>>   drivers/remoteproc/ti_k3_r5_remoteproc.c | 114 +++++++++++++----------
>>   1 file changed, 65 insertions(+), 49 deletions(-)
>>
>> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>> index c0e4da82775d..30081eafbd36 100644
>> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
>> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
>> @@ -475,7 +475,7 @@ static int k3_r5_rproc_request_mbox(struct rproc *rproc)
>>   static int k3_r5_rproc_prepare(struct rproc *rproc)
>>   {
>>       struct k3_r5_rproc *kproc = rproc->priv;
>> -    struct k3_r5_core *core = kproc->priv;
>> +    struct k3_r5_core *core = kproc->priv, *core0, *core1;
>>       struct k3_r5_cluster *cluster = core->cluster;
>>       struct device *dev = kproc->dev;
>>       u32 ctrl = 0, cfg = 0, stat = 0;
>> @@ -483,6 +483,29 @@ static int k3_r5_rproc_prepare(struct rproc *rproc)
>>       bool mem_init_dis;
>>       int ret;
>>
>> +    /*
>> +     * R5 cores require to be powered on sequentially, core0 should be in
>> +     * higher power state than core1 in a cluster. So, wait for core0 to
>> +     * power up before proceeding to core1 and put timeout of 2sec. This
>> +     * waiting mechanism is necessary because rproc_auto_boot_callback() for
>> +     * core1 can be called before core0 due to thread execution order.
>> +     *
>> +     * By placing the wait mechanism here in .prepare() ops, this condition
>> +     * is enforced for rproc boot requests from sysfs as well.
>> +     */
>> +    core0 = list_first_entry(&cluster->cores, struct k3_r5_core, elem);
>> +    core1 = list_last_entry(&cluster->cores, struct k3_r5_core, elem);
>> +    if (cluster->mode == CLUSTER_MODE_SPLIT && core == core1 &&
>> +        !core0->released_from_reset) {
>> +        ret = wait_event_interruptible_timeout(cluster->core_transition,
>> +                               core0->released_from_reset,
>> +                               msecs_to_jiffies(2000));
>> +        if (ret <= 0) {
>> +            dev_err(dev, "can not power up core1 before core0");
>> +            return -EPERM;
>> +        }
>> +    }
>> +
>>       ret = ti_sci_proc_get_status(kproc->tsp, &boot_vec, &cfg, &ctrl, &stat);
>>       if (ret < 0)
>>           return ret;
>> @@ -498,6 +521,14 @@ static int k3_r5_rproc_prepare(struct rproc *rproc)
>>           return ret;
>>       }
>>
>> +    /*
>> +     * Notify all threads in the wait queue when core0 state has changed so
>> +     * that threads waiting for this condition can be executed.
>> +     */
>> +    core->released_from_reset = true;
>> +    if (core == core0)
>> +        wake_up_interruptible(&cluster->core_transition);
>> +
>>       /*
>>        * Newer IP revisions like on J7200 SoCs support h/w auto-initialization
>>        * of TCMs, so there is no need to perform the s/w memzero. This bit is
>> @@ -542,11 +573,31 @@ static int k3_r5_rproc_prepare(struct rproc *rproc)
>>   static int k3_r5_rproc_unprepare(struct rproc *rproc)
>>   {
>>       struct k3_r5_rproc *kproc = rproc->priv;
>> -    struct k3_r5_core *core = kproc->priv;
>> +    struct k3_r5_core *core = kproc->priv, *core0, *core1;
>>       struct k3_r5_cluster *cluster = core->cluster;
>>       struct device *dev = kproc->dev;
>>       int ret;
>>
>> +    /*
>> +     * Ensure power-down of cores is sequential in split mode. Core1 must
>> +     * power down before Core0 to maintain the expected state. By placing
>> +     * the wait mechanism here in .unprepare() ops, this condition is
>> +     * enforced for rproc stop or shutdown requests from sysfs and device
>> +     * removal as well.
>> +     */
>> +    core0 = list_first_entry(&cluster->cores, struct k3_r5_core, elem);
>> +    core1 = list_last_entry(&cluster->cores, struct k3_r5_core, elem);
>> +    if (cluster->mode == CLUSTER_MODE_SPLIT && core == core0 &&
>> +        core1->released_from_reset) {
>> +        ret = wait_event_interruptible_timeout(cluster->core_transition,
>> +                               !core1->released_from_reset,
>> +                               msecs_to_jiffies(2000));
>> +        if (ret <= 0) {
>> +            dev_err(dev, "can not power down core0 before core1");
>> +            return -EPERM;
>> +        }
>> +    }
>> +
>>       /* Re-use LockStep-mode reset logic for Single-CPU mode */
>>       ret = (cluster->mode == CLUSTER_MODE_LOCKSTEP ||
>>              cluster->mode == CLUSTER_MODE_SINGLECPU) ?
>> @@ -554,6 +605,14 @@ static int k3_r5_rproc_unprepare(struct rproc *rproc)
>>       if (ret)
>>           dev_err(dev, "unable to disable cores, ret = %d\n", ret);
>>
>> +    /*
>> +     * Notify all threads in the wait queue when core1 state has changed so
>> +     * that threads waiting for this condition can be executed.
>> +     */
>> +    core->released_from_reset = false;
>> +    if (core == core1)
>> +        wake_up_interruptible(&cluster->core_transition);
>> +
>>       return ret;
>>   }
>>
>> @@ -577,7 +636,7 @@ static int k3_r5_rproc_unprepare(struct rproc *rproc)
>>   static int k3_r5_rproc_start(struct rproc *rproc)
>>   {
>>       struct k3_r5_rproc *kproc = rproc->priv;
>> -    struct k3_r5_core *core0, *core = kproc->priv;
>> +    struct k3_r5_core *core = kproc->priv;
>>       struct k3_r5_cluster *cluster = core->cluster;
>>       struct device *dev = kproc->dev;
>>       u32 boot_addr;
>> @@ -600,21 +659,9 @@ static int k3_r5_rproc_start(struct rproc *rproc)
>>                   goto unroll_core_run;
>>           }
>>       } else {
>> -        /* do not allow core 1 to start before core 0 */
>> -        core0 = list_first_entry(&cluster->cores, struct k3_r5_core,
>> -                     elem);
>> -        if (core != core0 && core0->kproc->rproc->state == RPROC_OFFLINE) {
>> -            dev_err(dev, "%s: can not start core 1 before core 0\n",
>> -                __func__);
>> -            return -EPERM;
>> -        }
>> -
>> -        ret = k3_r5_core_run(core->kproc);
>> +        ret = k3_r5_core_run(kproc);
>>           if (ret)
>>               return ret;
>> -
>> -        core->released_from_reset = true;
>> -        wake_up_interruptible(&cluster->core_transition);
>>       }
>>
>>       return 0;
>> @@ -654,9 +701,8 @@ static int k3_r5_rproc_start(struct rproc *rproc)
>>   static int k3_r5_rproc_stop(struct rproc *rproc)
>>   {
>>       struct k3_r5_rproc *kproc = rproc->priv;
>> -    struct k3_r5_core *core1, *core = kproc->priv;
>> +    struct k3_r5_core *core = kproc->priv;
>>       struct k3_r5_cluster *cluster = core->cluster;
>> -    struct device *dev = kproc->dev;
>>       int ret;
>>
>>       /* halt all applicable cores */
>> @@ -669,17 +715,7 @@ static int k3_r5_rproc_stop(struct rproc *rproc)
>>               }
>>           }
>>       } else {
>> -        /* do not allow core 0 to stop before core 1 */
>> -        core1 = list_last_entry(&cluster->cores, struct k3_r5_core,
>> -                    elem);
>> -        if (core != core1 && core1->kproc->rproc->state != RPROC_OFFLINE) {
>> -            dev_err(dev, "%s: can not stop core 0 before core 1\n",
>> -                __func__);
>> -            ret = -EPERM;
>> -            goto out;
>> -        }
>> -
>> -        ret = k3_r5_core_halt(core->kproc);
>> +        ret = k3_r5_core_halt(kproc);
>>           if (ret)
>>               goto out;
>>       }
>> @@ -1441,26 +1477,6 @@ static int k3_r5_cluster_rproc_init(struct platform_device *pdev)
>>               cluster->mode == CLUSTER_MODE_SINGLECPU ||
>>               cluster->mode == CLUSTER_MODE_SINGLECORE)
>>               break;
>> -
>> -        /*
>> -         * R5 cores require to be powered on sequentially, core0
>> -         * should be in higher power state than core1 in a cluster
>> -         * So, wait for current core to power up before proceeding
>> -         * to next core and put timeout of 2sec for each core.
>> -         *
>> -         * This waiting mechanism is necessary because
>> -         * rproc_auto_boot_callback() for core1 can be called before
>> -         * core0 due to thread execution order.
>> -         */
>> -        ret = wait_event_interruptible_timeout(cluster->core_transition,
>> -                               core->released_from_reset,
>> -                               msecs_to_jiffies(2000));
>> -        if (ret <= 0) {
>> -            dev_err(cdev,
>> -                "Timed out waiting for %s core to power up!\n",
>> -                rproc->name);
>> -            goto out;
>> -        }
>>       }
>>
>>       return 0;