Message-ID: <03bde95c-dfd3-cdf6-2b0f-afa6a0ec036d@quicinc.com>
Date:   Tue, 18 Jan 2022 14:57:08 +0530
From:   Mukesh Ojha <quic_mojha@...cinc.com>
To:     Bjorn Andersson <bjorn.andersson@...aro.org>
CC:     <linux-remoteproc@...r.kernel.org>,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: Query on moving Recovery remoteproc work to a separate wq instead
 of system freezable wq


On 1/18/2022 3:50 AM, Bjorn Andersson wrote:
> On Mon 17 Jan 09:09 CST 2022, Mukesh Ojha wrote:
>
>> Hi,
>>
>> There could be a situation where there is too much load (from tasks affined
> As in "it's theoretically possible" or "we run into this issue all the
> time"?

During recovery we notify all the remoteproc kernel clients about the
crash, and if one of the notifications gets stuck for more than 20-30s
we inject a panic. During analysis we saw that, just because of the load
(stress testing) on the core, we were not able to proceed. It would be
good to allow this work to run on a different CPU.
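
As a side note, here is a minimal illustrative sketch of why the queue
choice matters (this is not the patch itself; rproc_recovery_wq is
assumed to be the unbound workqueue proposed in the patch below):

	/*
	 * With the bound system_freezable_wq, queue_work() places the crash
	 * handler on the worker pool of the CPU that calls
	 * rproc_report_crash(), so a stress load affined to that CPU can
	 * delay the handler.
	 */
	queue_work(system_freezable_wq, &rproc->crash_handler);

	/*
	 * With a workqueue allocated with WQ_UNBOUND, the work item goes to
	 * an unbound pool and the scheduler may run it on any allowed CPU.
	 */
	queue_work(rproc_recovery_wq, &rproc->crash_handler);
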

>
>> to a particular core) on a core, such that the rproc recovery thread will
>> not get a chance to run for no reason other than the load. If we make this
>> queue unbound, then this work can run on any core.
>>
>> Kindly let me know if I can post a proper patch for this, like the one below.
>>
>> --- a/drivers/remoteproc/remoteproc_core.c
>> +++ b/drivers/remoteproc/remoteproc_core.c
>> @@ -59,6 +59,7 @@ static int rproc_release_carveout(struct rproc *rproc,
>>
>>   /* Unique indices for remoteproc devices */
>>   static DEFINE_IDA(rproc_dev_index);
>> +static struct workqueue_struct *rproc_recovery_wq;
>>
>>   static const char * const rproc_crash_names[] = {
>>          [RPROC_MMUFAULT]        = "mmufault",
>> @@ -2487,7 +2488,7 @@ void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type)
>>                  rproc->name, rproc_crash_to_string(type));
>>
>>          /* Have a worker handle the error; ensure system is not suspended */
>> -       queue_work(system_freezable_wq, &rproc->crash_handler);
>> +       queue_work(rproc_recovery_wq, &rproc->crash_handler);
>>   }
>>   EXPORT_SYMBOL(rproc_report_crash);
>>
>> @@ -2532,6 +2533,12 @@ static void __exit rproc_exit_panic(void)
>>
>>   static int __init remoteproc_init(void)
>>   {
>> +       rproc_recovery_wq = alloc_workqueue("rproc_recovery_wq", WQ_UNBOUND |
>> +                               WQ_HIGHPRI | WQ_FREEZABLE | WQ_CPU_INTENSIVE, 0);
> Afaict this is not only a separate work queue, but a high priority, "cpu
> intensive" work queue. Does that really represent the urgency of getting
> the recovery under way?

Adding WQ_CPU_INTENSIVE here (it serves no purpose) was a blunder on my
end; I will remove it.
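
For reference, the allocation with that flag dropped might look like the
sketch below (the exact flag set is still open for review, and the
-ENOMEM error path is an assumption of mine, not part of the posted patch):

	rproc_recovery_wq = alloc_workqueue("rproc_recovery_wq",
					    WQ_UNBOUND | WQ_HIGHPRI | WQ_FREEZABLE, 0);
	if (!rproc_recovery_wq) {
		pr_err("creation of rproc_recovery_wq failed\n");
		/* could instead fall back to system_freezable_wq */
		return -ENOMEM;
	}
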

Thanks,
-Mukesh

> Regards,
> Bjorn
>
>> +       if (!rproc_recovery_wq) {
>> +               pr_err("creation of rproc_recovery_wq failed\n");
>> +       }
>> +
>>
>> Thanks,
>> Mukesh
