lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fe8f6f37-2534-6481-cc34-e561b3b7a510@quicinc.com>
Date:   Tue, 14 Nov 2023 17:42:29 +0530
From:   Mukesh Ojha <quic_mojha@...cinc.com>
To:     huangzaiyang <huangzaiyang@...o.com>, <myungjoo.ham@...sung.com>,
        <kyungmin.park@...sung.com>, <cw00.choi@...sung.com>
CC:     <linux-pm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        huangzaiyang <joyyoung.wang@...il.com>
Subject: Re: [PATCH] Performance: devfreq: avoid devfreq delay work re-init
 before



On 11/10/2023 3:04 PM, huangzaiyang wrote:
> From: huangzaiyang <joyyoung.wang@...il.com>
> 
> There is a timer_list race condition when executing the following test shell script:
> '''
> while true
> do
>          echo "simple_ondemand" > /sys/class/devfreq/1d84000.ufshc/governor
>          echo "performance" > /sys/class/devfreq/1d84000.ufshc/governor
> done
> '''
> 
> [13511.214366][    C3] Unable to handle kernel paging request at virtual address dead00000000012a
> [13511.214393][    C3] Mem abort info:
> [13511.214398][    C3]   ESR = 0x96000044
> [13511.214404][    C3]   EC = 0x25: DABT (current EL), IL = 32 bits
> [13511.214409][    C3]   SET = 0, FnV = 0
> [13511.214414][    C3]   EA = 0, S1PTW = 0
> [13511.214417][    C3] Data abort info:
> [13511.214422][    C3]   ISV = 0, ISS = 0x00000044
> [13511.214427][    C3]   CM = 0, WnR = 1
> [13511.214432][    C3] [dead00000000012a] address between user and kernel address ranges
> [13511.214439][    C3] Internal error: Oops: 96000044 [#1] PREEMPT SMP
> [13511.215449][    C3] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G S      W  O      5.10.168-android12-9-o-g63cc297a7b34 #1
> [13511.215454][    C3] Hardware name: Qualcomm Technologies, Inc. Cape MTP, Whiteswan (DT)
> [13511.215460][    C3] pstate: 82400085 (Nzcv daIf +PAN -UAO +TCO BTYPE=--)
> [13511.215472][    C3] pc : expire_timers+0x9c/0x428
> [13511.215478][    C3] lr : __run_timers+0x1f0/0x328
> [13511.215483][    C3] sp : ffffffc00801bdd0
> [13511.215487][    C3] x29: ffffffc00801bdf0 x28: ffffffdb87b31698
> [13511.215493][    C3] x27: ffffffdb87999e58 x26: ffffffdb87966008
> [13511.215499][    C3] x25: 0000000000000001 x24: ffffff8001734a00
> [13511.215506][    C3] x23: 00000000000000e0 x22: dead000000000122
> [13511.215512][    C3] x21: 000000010032658e x20: ffffff89f7a9ae80
> [13511.215518][    C3] x19: ffffffc00801be50 x18: ffffffc00801d038
> [13511.215525][    C3] x17: 0000000000000240 x16: 0000000000000201
> [13511.215532][    C3] x15: ffffffffffffffff x14: ffffff89f7a9aef8
> [13511.215538][    C3] x13: 0000000000000240 x12: ffffff89f7a9aea8
> [13511.215544][    C3] x11: 0000000000000021 x10: 000000014032658e
> [13511.215550][    C3] x9 : ffffffc00801be50 x8 : dead000000000122
> [13511.215556][    C3] x7 : ffff71646c68735e x6 : ffffff89f7aaae58
> [13511.215563][    C3] x5 : 0000000000000000 x4 : 0000000000000101
> [13511.215569][    C3] x3 : ffffff89f7a9aef0 x2 : ffffff89f7a9aef0
> [13511.215575][    C3] x1 : ffffffc00801be50 x0 : ffffff8045804428
> [13511.215581][    C3] Call trace:
> [13511.215586][    C3]  expire_timers+0x9c/0x428
> [13511.215591][    C3]  __run_timers+0x1f0/0x328
> [13511.215596][    C3]  run_timer_softirq+0x28/0x58
> [13511.215602][    C3]  efi_header_end+0x168/0x5ec
> [13511.215610][    C3]  __irq_exit_rcu+0x108/0x124
> [13511.215617][    C3]  __handle_domain_irq+0x118/0x1e4
> [13511.215625][    C3]  gic_handle_irq.31230+0x6c/0x250
> [13511.215630][    C3]  el1_irq+0xe4/0x1c0
> [13511.215638][    C3]  cpuidle_enter_state+0x3a4/0xa04
> [13511.215644][    C3]  do_idle+0x308/0x574
> [13511.215649][    C3]  cpu_startup_entry+0x84/0x90
> [13511.215656][    C3]  secondary_start_kernel+0x204/0x274
> [13511.215664][    C3] Code: d503201f a9402408 f9000128 b4000048 (f9000509)
> [13511.215670][    C3] ---[ end trace 5100bad72a35d566 ]---
> [13511.215676][    C3] Kernel panic - not syncing: Oops: Fatal exception in interrupt
> 
> This is because when switching the governor through the sys node,
> the devfreq_monitor_start function will re-initialize the delayed work task,
> which will cause the delay work pending flag to become invalid, and the timer pending judgment contained in the delayed work will also become invalid,
> and then the pending interception will be executed when the queue is executed.
> 
> So we remove the delay work'initialization work to the devfreq_add_device and timer_store functions,
> and the delay work pending judgment is performed before the devfreq_monitor_start function performs the queue operation.
> 
> Signed-off-by: ZaiYang Huang <huangzaiyang@...o.com>
> ---
>   drivers/devfreq/devfreq.c | 36 ++++++++++++++++++++++++------------
>   1 file changed, 24 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
> index b3a68d5833bd..8ae6f853a21e 100644
> --- a/drivers/devfreq/devfreq.c
> +++ b/drivers/devfreq/devfreq.c
> @@ -483,18 +483,7 @@ void devfreq_monitor_start(struct devfreq *devfreq)
>          if (IS_SUPPORTED_FLAG(devfreq->governor->flags, IRQ_DRIVEN))
>                  return;
> 
> -       switch (devfreq->profile->timer) {
> -       case DEVFREQ_TIMER_DEFERRABLE:
> -               INIT_DEFERRABLE_WORK(&devfreq->work, devfreq_monitor);
> -               break;
> -       case DEVFREQ_TIMER_DELAYED:
> -               INIT_DELAYED_WORK(&devfreq->work, devfreq_monitor);
> -               break;
> -       default:
> -               return;
> -       }
> -
> -       if (devfreq->profile->polling_ms)
> +       if (devfreq->profile->polling_ms && !delayed_work_pending(&devfreq->work))
>                  queue_delayed_work(devfreq_wq, &devfreq->work,
>                          msecs_to_jiffies(devfreq->profile->polling_ms));
>   }
> @@ -830,6 +819,17 @@ struct devfreq *devfreq_add_device(struct device *dev,
>                  goto err_dev;
>          }
> 
> +       switch (devfreq->profile->timer) {
> +       case DEVFREQ_TIMER_DEFERRABLE:
> +               INIT_DEFERRABLE_WORK(&devfreq->work, devfreq_monitor);
> +               break;
> +       case DEVFREQ_TIMER_DELAYED:
> +               INIT_DELAYED_WORK(&devfreq->work, devfreq_monitor);
> +               break;
> +       default:
> +               dev_err(dev, "%s: Target devfreq(%s)'s profile timer has no settings \n", devfreq->governor_name,
> +                       __func__);
> +       }

[..]

>          if (!devfreq->profile->max_state || !devfreq->profile->freq_table) {
>                  mutex_unlock(&devfreq->lock);
>                  err = set_freq_table(devfreq);
> @@ -1860,6 +1860,18 @@ static ssize_t timer_store(struct device *dev, struct device_attribute *attr,
>          df->profile->timer = timer;
>          mutex_unlock(&df->lock);
> 
> +       switch (df->profile->timer) {
> +       case DEVFREQ_TIMER_DEFERRABLE:
> +               INIT_DEFERRABLE_WORK(&df->work, devfreq_monitor);
> +               break;
> +       case DEVFREQ_TIMER_DELAYED:
> +               INIT_DELAYED_WORK(&df->work, devfreq_monitor);
> +               break;
> +       default:
> +               dev_err(dev, "%s: Target devfreq(%s)'s profile timer has no settings \n", df->governor_name,
> +                       __func__);
> +       }
> +
Here, this can cause issue right, as it is modifying the delayed work
data even before stopping the current running instances..

Should the above thing be done after DEVFREQ_GOV_STOP ?
But again it will boil down to the same thing as it is currently
now .
>          ret = df->governor->event_handler(df, DEVFREQ_GOV_STOP, NULL);
>          if (ret) {
>                  dev_warn(dev, "%s: Governor %s not stopped(%d)\n",

I was thinking if i put mutex around cancel_delayed_work_sync() as well 
in devfreq_monitor_stop() on top of below patch, should cover the 
corruption of workdata.

https://lore.kernel.org/all/1699957648-31299-1-git-send-email-quic_mojha@quicinc.com/

-Mukesh

> --
> 2.17.1

Remove the entire below thing..>
> ________________________________
> OPPO
> 
> ±¾µç×ÓÓʼþ¼°Æ丽¼þº¬ÓÐOPPO¹«Ë¾µÄ±£ÃÜÐÅÏ¢£¬½öÏÞÓÚÓʼþÖ¸Ã÷µÄÊÕ¼þÈË£¨°üº¬¸öÈ˼°Èº×飩ʹÓ᣽ûÖ¹ÈκÎÈËÔÚδ¾­ÊÚȨµÄÇé¿öÏÂÒÔÈκÎÐÎʽʹÓá£Èç¹ûÄú´íÊÕÁ˱¾Óʼþ£¬ÇÐÎð´«²¥¡¢·Ö·¢¡¢¸´ÖÆ¡¢Ó¡Ë¢»òʹÓñ¾ÓʼþÖ®Èκβ¿·Ö»òÆäËùÔØÖ®ÈκÎÄÚÈÝ£¬²¢ÇëÁ¢¼´ÒÔµç×ÓÓʼþ֪ͨ·¢¼þÈ˲¢É¾³ý±¾Óʼþ¼°Æ丽¼þ¡£
> ÍøÂçͨѶ¹ÌÓÐȱÏÝ¿ÉÄܵ¼ÖÂÓʼþ±»½ØÁô¡¢Ð޸ġ¢¶ªÊ§¡¢ÆÆ»µ»ò°üº¬¼ÆËã»ú²¡¶¾µÈ²»°²È«Çé¿ö£¬OPPO¶Ô´ËÀà´íÎó»òÒÅ©¶øÒýÖÂÖ®ÈκÎËðʧ¸Å²»³Ðµ£ÔðÈβ¢±£ÁôÓë±¾ÓʼþÏà¹ØÖ®Ò»ÇÐȨÀû¡£
> ³ý·ÇÃ÷ȷ˵Ã÷£¬±¾Óʼþ¼°Æ丽¼þÎÞÒâ×÷ΪÔÚÈκιú¼Ò»òµØÇøÖ®ÒªÔ¼¡¢ÕÐÀ¿»ò³Ðŵ£¬ÒàÎÞÒâ×÷ΪÈκν»Ò×»òºÏ֮ͬÕýʽȷÈÏ¡£ ·¢¼þÈË¡¢ÆäËùÊô»ú¹¹»òËùÊô»ú¹¹Ö®¹ØÁª»ú¹¹»òÈκÎÉÏÊö»ú¹¹Ö®¹É¶«¡¢¶­Ê¡¢¸ß¼¶¹ÜÀíÈËÔ±¡¢Ô±¹¤»òÆäËûÈκÎÈË£¨ÒÔϳơ°·¢¼þÈË¡±»ò¡°OPPO¡±£©²»Òò±¾ÓʼþÖ®ÎóËͶø·ÅÆúÆäËùÏíÖ®ÈκÎȨÀû£¬Ò಻¶ÔÒò¹ÊÒâ»ò¹ýʧʹÓøõÈÐÅÏ¢¶øÒý·¢»ò¿ÉÄÜÒý·¢µÄËðʧ³Ðµ£ÈκÎÔðÈΡ£
> ÎÄ»¯²îÒìÅû¶£ºÒòÈ«ÇòÎÄ»¯²îÒìÓ°Ï죬µ¥´¿ÒÔYES\OK»òÆäËû¼òµ¥´Ê»ãµÄ»Ø¸´²¢²»¹¹³É·¢¼þÈ˶ÔÈκν»Ò×»òºÏ֮ͬÕýʽȷÈÏ»ò½ÓÊÜ£¬ÇëÓë·¢¼þÈËÔÙ´ÎÈ·ÈÏÒÔ»ñµÃÃ÷È·ÊéÃæÒâ¼û¡£·¢¼þÈ˲»¶ÔÈκÎÊÜÎÄ»¯²îÒìÓ°Ïì¶øµ¼Ö¹ÊÒâ»ò´íÎóʹÓøõÈÐÅÏ¢ËùÔì³ÉµÄÈκÎÖ±½Ó»ò¼ä½ÓË𺦳е£ÔðÈΡ£
> This e-mail and its attachments contain confidential information from OPPO, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you are not the intended recipient, please do not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.
> Electronic communications may contain computer viruses or other defects inherently, may not be accurately and/or timely transmitted to other systems, or may be intercepted, modified ,delayed, deleted or interfered. OPPO shall not be liable for any damages that arise or may arise from such matter and reserves all rights in connection with the email.
> Unless expressly stated, this e-mail and its attachments are provided without any warranty, acceptance or promise of any kind in any country or region, nor constitute a formal confirmation or acceptance of any transaction or contract. The sender, together with its affiliates or any shareholder, director, officer, employee or any other person of any such institution (hereinafter referred to as "sender" or "OPPO") does not waive any rights and shall not be liable for any damages that arise or may arise from the intentional or negligent use of such information.
> Cultural Differences Disclosure: Due to global cultural differences, any reply with only YES\OK or other simple words does not constitute any confirmation or acceptance of any transaction or contract, please confirm with the sender again to ensure clear opinion in written form. The sender shall not be responsible for any direct or indirect damages resulting from the intentional or misuse of such information.
> ________________________________
> OPPO
> 
> 本电子邮件及其附件含有OPPO公司的保密信息,仅限于邮件指明的收件人(包含个人及群组)使用。禁止任何人在未经授权的情况下以任何形式使用。如果您错收了本邮件,切勿传播、分发、复制、印刷或使用本邮件之任何部分或其所载之任何内容,并请立即以电子邮件通知发件人并删除本邮件及其附件。
> 网络通讯固有缺陷可能导致邮件被截留、修改、丢失、破坏或包含计算机病毒等不安全情况,OPPO对此类错误或遗漏而引致之任何损失概不承担责任并保留与本邮件相关之一切权利。
> 除非明确说明,本邮件及其附件无意作为在任何国家或地区之要约、招揽或承诺,亦无意作为任何交易或合同之正式确认。 发件人、其所属机构或所属机构之关联机构或任何上述机构之股东、董事、高级管理人员、员工或其他任何人(以下称“发件人”或“OPPO”)不因本邮件之误送而放弃其所享之任何权利,亦不对因故意或过失使用该等信息而引发或可能引发的损失承担任何责任。
> 文化差异披露:因全球文化差异影响,单纯以YES\OK或其他简单词汇的回复并不构成发件人对任何交易或合同之正式确认或接受,请与发件人再次确认以获得明确书面意见。发件人不对任何受文化差异影响而导致故意或错误使用该等信息所造成的任何直接或间接损害承担责任。
> This e-mail and its attachments contain confidential information from OPPO, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you are not the intended recipient, please do not read, copy, distribute, or use this information. If you have received this transmission in error, please notify the sender immediately by reply e-mail and then delete this message.
> Electronic communications may contain computer viruses or other defects inherently, may not be accurately and/or timely transmitted to other systems, or may be intercepted, modified ,delayed, deleted or interfered. OPPO shall not be liable for any damages that arise or may arise from such matter and reserves all rights in connection with the email.
> Unless expressly stated, this e-mail and its attachments are provided without any warranty, acceptance or promise of any kind in any country or region, nor constitute a formal confirmation or acceptance of any transaction or contract. The sender, together with its affiliates or any shareholder, director, officer, employee or any other person of any such institution (hereinafter referred to as "sender" or "OPPO") does not waive any rights and shall not be liable for any damages that arise or may arise from the intentional or negligent use of such information.
> Cultural Differences Disclosure: Due to global cultural differences, any reply with only YES\OK or other simple words does not constitute any confirmation or acceptance of any transaction or contract, please confirm with the sender again to ensure clear opinion in written form. The sender shall not be responsible for any direct or indirect damages resulting from the intentional or misuse of such information.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ