lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <9ea91582-4745-4559-97a5-65b57ead7d70@ti.com> Date: Fri, 6 Sep 2024 23:47:31 +0530 From: Beleswar Prasad Padhi <b-padhi@...com> To: Mathieu Poirier <mathieu.poirier@...aro.org> CC: <andersson@...nel.org>, <afd@...com>, <hnagalla@...com>, <s-anna@...com>, <u-kumar1@...com>, <linux-remoteproc@...r.kernel.org>, <linux-kernel@...r.kernel.org> Subject: Re: [PATCH] remoteproc: k3-r5: Decouple firmware booting from probe routine Hi Mathieu, On 06-09-2024 22:17, Mathieu Poirier wrote: > On Fri, Sep 06, 2024 at 03:10:45PM +0530, Beleswar Padhi wrote: >> The current implementation of the waiting mechanism in probe() waits for >> the 'released_from_reset' flag to be set which is done in >> k3_r5_rproc_prepare() as part of rproc_fw_boot(). This causes unexpected > If you are looking at rproc-next, @released_from_reset is set in > k3_r5_rproc_start(). You are correct. Apologies, this flag is set in the start() function (still a part of rproc_fw_boot()), not prepare(). I wanted to point out @released_from_reset is set in rproc_fw_boot() routine, and is checked for in the probe() routine. > Moreover, the waiting mechanic happens in > k3_r5_cluster_rproc_init(), which makes reading your changelog highly confusing. Yes, the mechanism is in the k3_r5_cluster_rproc_init() function which is called from k3_r5_probe(), hence I referred to it being called in the probe routine. The point I wanted to make was, any error while booting firmware would end up in a probe failure. Apologies for not making it clearer. > >> failures in cases where the firmware is unavailable at boot time, >> resulting in probe failure and removal of the remoteproc handles in the >> sysfs paths. >> >> To address this, the waiting mechanism is refactored out of the probe >> routine into the appropriate k3_r5_rproc_prepare/unprepare() and >> k3_r5_rproc_start/stop() functions. This allows the probe routine to >> complete without depending on firmware booting, while still maintaining >> the required power-synchronization between cores. >> >> Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1") >> Signed-off-by: Beleswar Padhi <b-padhi@...com> >> --- >> Posted this as a Fix as this was breaking usecases where we wanted to load a >> firmware by writing to sysfs handles in userspace. >> >> drivers/remoteproc/ti_k3_r5_remoteproc.c | 170 ++++++++++++++++------- >> 1 file changed, 118 insertions(+), 52 deletions(-) >> >> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c >> index 747ee467da88..df8f124f4248 100644 >> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c >> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c >> @@ -131,6 +131,7 @@ struct k3_r5_cluster { >> * @btcm_enable: flag to control BTCM enablement >> * @loczrama: flag to dictate which TCM is at device address 0x0 >> * @released_from_reset: flag to signal when core is out of reset >> + * @unhalted: flag to signal when core is unhalted >> */ >> struct k3_r5_core { >> struct list_head elem; >> @@ -148,6 +149,7 @@ struct k3_r5_core { >> u32 btcm_enable; >> u32 loczrama; >> bool released_from_reset; >> + bool unhalted; > Yet another flag? @released_from_reset is not enough? So, technically @released_from_reset should maintain the sync between _prepare() of #core0 and #core1. But with commit 8fa052c29e50 ("remoteproc: k3-r5: Delay notification of wakeup event"), we are trying to maintain the sync of both _prepare() and _start() with just this one flag by pushing the notification from prepare() to start(). Having two flags is a cleanup attempt, where @released_from_reset handles _prepare() sync and @unhalted handles _start() sync. > And why does it need to > be "unhalted" rather than something like "running"? "running" sounds like a better name for this flag. Thank you! > I will not move forward > with this patch until I get an R-B and a T-B from two other people at TI. > > The above and the exchange with Jan Kiszka is furthering my belief that this > driver is up for a serious refactoring exercise. From hereon I will only > consider bug fixes. I understand the concern. I will do the refactor and possibly include this patch as part of that refactoring series. Thanks, Beleswar > > Thanks, > Mathieu > >> }; >> >> /** >> @@ -448,13 +450,33 @@ static int k3_r5_rproc_prepare(struct rproc *rproc) >> { >> struct k3_r5_rproc *kproc = rproc->priv; >> struct k3_r5_cluster *cluster = kproc->cluster; >> - struct k3_r5_core *core = kproc->core; >> + struct k3_r5_core *core0, *core1, *core = kproc->core; >> struct device *dev = kproc->dev; >> u32 ctrl = 0, cfg = 0, stat = 0; >> u64 boot_vec = 0; >> bool mem_init_dis; >> int ret; >> >> + /* >> + * R5 cores require to be powered on sequentially, core0 should be in >> + * higher power state than core1 in a cluster. So, wait for core0 to >> + * power up before proceeding to core1 and put timeout of 2sec. This >> + * waiting mechanism is necessary because rproc_auto_boot_callback() for >> + * core1 can be called before core0 due to thread execution order. >> + */ >> + core0 = list_first_entry(&cluster->cores, struct k3_r5_core, elem); >> + core1 = list_last_entry(&cluster->cores, struct k3_r5_core, elem); >> + if (cluster->mode == CLUSTER_MODE_SPLIT && core == core1 && >> + core0->released_from_reset == false) { >> + ret = wait_event_interruptible_timeout(cluster->core_transition, >> + core0->released_from_reset, >> + msecs_to_jiffies(2000)); >> + if (ret <= 0) { >> + dev_err(dev, "can not power up core1 before core0"); >> + return -EPERM; >> + } >> + } >> + >> ret = ti_sci_proc_get_status(core->tsp, &boot_vec, &cfg, &ctrl, &stat); >> if (ret < 0) >> return ret; >> @@ -470,6 +492,12 @@ static int k3_r5_rproc_prepare(struct rproc *rproc) >> return ret; >> } >> >> + /* Notify all threads in the wait queue when core state has changed so >> + * that threads waiting for this condition can be executed. >> + */ >> + core->released_from_reset = true; >> + wake_up_interruptible(&cluster->core_transition); >> + >> /* >> * Newer IP revisions like on J7200 SoCs support h/w auto-initialization >> * of TCMs, so there is no need to perform the s/w memzero. This bit is >> @@ -515,14 +543,46 @@ static int k3_r5_rproc_unprepare(struct rproc *rproc) >> { >> struct k3_r5_rproc *kproc = rproc->priv; >> struct k3_r5_cluster *cluster = kproc->cluster; >> - struct k3_r5_core *core = kproc->core; >> + struct k3_r5_core *core0, *core1, *core = kproc->core; >> struct device *dev = kproc->dev; >> int ret; >> >> /* Re-use LockStep-mode reset logic for Single-CPU mode */ >> - ret = (cluster->mode == CLUSTER_MODE_LOCKSTEP || >> - cluster->mode == CLUSTER_MODE_SINGLECPU) ? >> - k3_r5_lockstep_reset(cluster) : k3_r5_split_reset(core); >> + if (cluster->mode == CLUSTER_MODE_LOCKSTEP || >> + cluster->mode == CLUSTER_MODE_SINGLECPU) >> + ret = k3_r5_lockstep_reset(cluster); >> + else { >> + /* >> + * R5 cores require to be powered off sequentially, core0 should >> + * be in higher power state than core1 in a cluster. So, wait >> + * for core1 to powered off before proceeding to core0 and put >> + * timeout of 2sec. This waiting mechanism is necessary to >> + * prevent stopping core0 before core1 from sysfs. >> + */ >> + core0 = list_first_entry(&cluster->cores, struct k3_r5_core, elem); >> + core1 = list_last_entry(&cluster->cores, struct k3_r5_core, elem); >> + >> + if (core == core0 && core1->released_from_reset == true) { >> + ret = wait_event_interruptible_timeout(cluster->core_transition, >> + !core1->released_from_reset, >> + msecs_to_jiffies(2000)); >> + >> + if (ret <= 0) { >> + dev_err(dev, "can not power off core0 before core1"); >> + return -EPERM; >> + } >> + } >> + >> + ret = k3_r5_split_reset(core); >> + >> + /* Notify all threads in the wait queue when core state has >> + * changed so that threads waiting for this condition can be >> + * executed. >> + */ >> + core->released_from_reset = false; >> + wake_up_interruptible(&cluster->core_transition); >> + } >> + >> if (ret) >> dev_err(dev, "unable to disable cores, ret = %d\n", ret); >> >> @@ -551,16 +611,34 @@ static int k3_r5_rproc_start(struct rproc *rproc) >> struct k3_r5_rproc *kproc = rproc->priv; >> struct k3_r5_cluster *cluster = kproc->cluster; >> struct device *dev = kproc->dev; >> - struct k3_r5_core *core0, *core; >> + struct k3_r5_core *core0, *core1, *core = kproc->core; >> u32 boot_addr; >> int ret; >> >> + /* >> + * R5 cores require to be powered on sequentially, core0 should be in >> + * higher power state than core1 in a cluster. So, wait for core0 to >> + * power up before proceeding to core1 and put timeout of 2sec. This >> + * waiting mechanism is necessary because rproc_auto_boot_callback() for >> + * core1 can be called before core0 due to thread execution order. >> + */ >> + core0 = list_first_entry(&cluster->cores, struct k3_r5_core, elem); >> + core1 = list_last_entry(&cluster->cores, struct k3_r5_core, elem); >> + if (cluster->mode == CLUSTER_MODE_SPLIT && core == core1 && core0->unhalted == false) { >> + ret = wait_event_interruptible_timeout(cluster->core_transition, >> + core0->unhalted, >> + msecs_to_jiffies(2000)); >> + if (ret <= 0) { >> + dev_err(dev, "can not power up core1 before core0"); >> + return -EPERM; >> + } >> + } >> + >> boot_addr = rproc->bootaddr; >> /* TODO: add boot_addr sanity checking */ >> dev_dbg(dev, "booting R5F core using boot addr = 0x%x\n", boot_addr); >> >> /* boot vector need not be programmed for Core1 in LockStep mode */ >> - core = kproc->core; >> ret = ti_sci_proc_set_config(core->tsp, boot_addr, 0, 0); >> if (ret) >> return ret; >> @@ -573,20 +651,15 @@ static int k3_r5_rproc_start(struct rproc *rproc) >> goto unroll_core_run; >> } >> } else { >> - /* do not allow core 1 to start before core 0 */ >> - core0 = list_first_entry(&cluster->cores, struct k3_r5_core, >> - elem); >> - if (core != core0 && core0->rproc->state == RPROC_OFFLINE) { >> - dev_err(dev, "%s: can not start core 1 before core 0\n", >> - __func__); >> - return -EPERM; >> - } >> - >> ret = k3_r5_core_run(core); >> if (ret) >> return ret; >> >> - core->released_from_reset = true; >> + /* Notify all threads in the wait queue when core state has >> + * changed so that threads waiting for this condition can be >> + * executed. >> + */ >> + core->unhalted = true; >> wake_up_interruptible(&cluster->core_transition); >> } >> >> @@ -629,7 +702,7 @@ static int k3_r5_rproc_stop(struct rproc *rproc) >> struct k3_r5_rproc *kproc = rproc->priv; >> struct k3_r5_cluster *cluster = kproc->cluster; >> struct device *dev = kproc->dev; >> - struct k3_r5_core *core1, *core = kproc->core; >> + struct k3_r5_core *core0, *core1, *core = kproc->core; >> int ret; >> >> /* halt all applicable cores */ >> @@ -642,19 +715,38 @@ static int k3_r5_rproc_stop(struct rproc *rproc) >> } >> } >> } else { >> - /* do not allow core 0 to stop before core 1 */ >> - core1 = list_last_entry(&cluster->cores, struct k3_r5_core, >> - elem); >> - if (core != core1 && core1->rproc->state != RPROC_OFFLINE) { >> - dev_err(dev, "%s: can not stop core 0 before core 1\n", >> - __func__); >> - ret = -EPERM; >> - goto out; >> + /* >> + * R5 cores require to be powered off sequentially, core0 should >> + * be in higher power state than core1 in a cluster. So, wait >> + * for core1 to powered off before proceeding to core0 and put >> + * timeout of 2sec. This waiting mechanism is necessary to >> + * prevent stopping core0 before core1 from sysfs. >> + */ >> + core0 = list_first_entry(&cluster->cores, struct k3_r5_core, elem); >> + core1 = list_last_entry(&cluster->cores, struct k3_r5_core, elem); >> + >> + if (core == core0 && core1->unhalted == true) { >> + ret = wait_event_interruptible_timeout(cluster->core_transition, >> + !core1->unhalted, >> + msecs_to_jiffies(2000)); >> + >> + if (ret <= 0) { >> + dev_err(dev, "can not power off core0 before core1"); >> + ret = -EPERM; >> + goto out; >> + } >> } >> >> ret = k3_r5_core_halt(core); >> if (ret) >> goto out; >> + >> + /* Notify all threads in the wait queue when core state has >> + * changed so that threads waiting for this condition can be >> + * executed. >> + */ >> + core->unhalted = false; >> + wake_up_interruptible(&cluster->core_transition); >> } >> >> return 0; >> @@ -1145,12 +1237,6 @@ static int k3_r5_rproc_configure_mode(struct k3_r5_rproc *kproc) >> return reset_ctrl_status; >> } >> >> - /* >> - * Skip the waiting mechanism for sequential power-on of cores if the >> - * core has already been booted by another entity. >> - */ >> - core->released_from_reset = c_state; >> - >> ret = ti_sci_proc_get_status(core->tsp, &boot_vec, &cfg, &ctrl, >> &stat); >> if (ret < 0) { >> @@ -1296,25 +1382,6 @@ static int k3_r5_cluster_rproc_init(struct platform_device *pdev) >> cluster->mode == CLUSTER_MODE_SINGLECORE) >> break; >> >> - /* >> - * R5 cores require to be powered on sequentially, core0 >> - * should be in higher power state than core1 in a cluster >> - * So, wait for current core to power up before proceeding >> - * to next core and put timeout of 2sec for each core. >> - * >> - * This waiting mechanism is necessary because >> - * rproc_auto_boot_callback() for core1 can be called before >> - * core0 due to thread execution order. >> - */ >> - ret = wait_event_interruptible_timeout(cluster->core_transition, >> - core->released_from_reset, >> - msecs_to_jiffies(2000)); >> - if (ret <= 0) { >> - dev_err(dev, >> - "Timed out waiting for %s core to power up!\n", >> - rproc->name); >> - goto err_powerup; >> - } >> } >> >> return 0; >> @@ -1329,7 +1396,6 @@ static int k3_r5_cluster_rproc_init(struct platform_device *pdev) >> } >> } >> >> -err_powerup: >> rproc_del(rproc); >> err_add: >> k3_r5_reserved_mem_exit(kproc); >> -- >> 2.34.1 >>
Powered by blists - more mailing lists