Message-ID: <2bf9a1c1950941aabc383fd196e5768a@BJMBX01.spreadtrum.com>
Date: Fri, 8 Nov 2019 02:16:18 +0000
From: 黄吕强 (Lvqiang Huang)
<lvqiang.huang@...soc.com>
To: Russell King - ARM Linux admin <linux@...linux.org.uk>
CC: "ebiederm@...ssion.com" <ebiederm@...ssion.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"anshuman.khandual@....com" <anshuman.khandual@....com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"f.fainelli@...il.com" <f.fainelli@...il.com>,
"will@...nel.org" <will@...nel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"'26332949@...com'" <26332949@...com>,
楚恩来 (Enlai Chu) <enlai.chu@...soc.com>
Subject: RE: [PATCH] ARM: check __ex_table in do_bad()
Sorry for not describing it clearly; let me add some more information.
The kernel log for the scenario:
[20461.271374] sysrq: SysRq : Show Blocked State
[20461.271405] task PC stack pid father
[20461.271436] mbox-send-threa D c08cfad8 0 38 2 0x00000000
/* some log lines from the backtrace dump of other TASK_UNINTERRUPTIBLE tasks are omitted here */
[20461.273387] fsck.exfat D c08cfad8 0 6221 2276 0x00000000
[20461.273408] Backtrace:
[20461.273430] [<c08cf5d0>] (__schedule) from [<c08cff84>] (schedule+0x90/0xa8)
[20461.273442] r10:ce009ef0 r9:ce009df4 r8:c0d0790c r7:00000082 r6:7fffffff r5:00000000
[20461.273477] r4:ce008000
[20461.273497] [<c08cfef4>] (schedule) from [<c08d2b90>] (schedule_timeout+0x2c/0x26c)
[20461.273509] r4:7fffffff r3:dc8ba693
[20461.273561] Unhandled fault: page domain fault (0x01b) at 0x32848c02
[20461.273576] pgd = d1854000
[20461.273587] [32848c02] *pgd=bb21e835
[20461.273607] Internal error: : 1b [#1] PREEMPT SMP ARM
[20461.278903] CPU: 2 PID: 5917 Comm: watchdog Tainted: G W O 4.4.147+ #1
[20461.278929] task: e9beecc0 task.stack: e30a4000
[20461.278949] PC is at for_each_frame+0x18/0x88
[20461.278965] LR is at vprintk_emit+0x470/0x4ec
Task A: the task that finally crashed (PID: 5917, Comm: watchdog), running on CPU 2 and dumping the backtraces of all UN (TASK_UNINTERRUPTIBLE) tasks.
Task B: fsck.exfat (PID 6221), which went from TASK_UNINTERRUPTIBLE to TASK_RUNNING while Task A was dumping its backtrace.
The first two frames dumped for Task B are fine:
[20461.273430] [<c08cf5d0>] (__schedule) from [<c08cff84>] (schedule+0x90/0xa8)
[20461.273497] [<c08cfef4>] (schedule) from [<c08d2b90>] (schedule_timeout+0x2c/0x26c)
Then Task A crashed:
[20461.273561] Unhandled fault: page domain fault (0x01b) at 0x32848c02
From the RAM dump taken after the kernel crash, we can see that Task B had been scheduled to run on CPU 0.
crash_arm> ps 6221
PID PPID CPU TASK ST %MEM VSZ RSS COMM
> 6221 2276 0 cde04880 RU 0.4 17784 13596 fsck.exfat
Its backtrace had therefore changed, which caused the crash of Task A:
crash_arm> bt 6221
PID: 6221 TASK: cde04880 CPU: 0 COMMAND: "fsck.exfat"
#0 [<c0117a5c>] (__kunmap_atomic) from [<c0413ae8>]
#1 [<c0413894>] (copy_page_to_iter) from [<c01f4788>]
#2 [<c01f439c>] (generic_file_read_iter) from [<c02725e8>]
#3 [<c027257c>] (blkdev_read_iter) from [<c023b5b0>]
#4 [<c023b4f8>] (__vfs_read) from [<c023bd04>]
#5 [<c023bc78>] (vfs_read) from [<c023c7e0>]
#6 [<c023c76c>] (sys_pread64) from [<c01079a0>]
This is the race condition: backtracing another task is not safe. We can't assume the task won't be scheduled to run during the backtrace dump, and its stack frames may change completely once it executes again.
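To illustrate the kind of guard that would keep a frame-pointer walk from following garbage, here is a minimal user-space sketch in C. It is not the code in arch/arm/lib/backtrace.S: the names walk_limits, frame_rec and walk_frames are hypothetical, and the two-word frame record is a simplification of the real frame layout. It only shows the idea of rejecting a frame pointer outside the task's stack, or a saved pc outside kernel text, before using either value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the memory the walker may touch. */
struct walk_limits {
	uintptr_t stack_lo;	/* lowest valid stack address (inclusive) */
	uintptr_t stack_hi;	/* one past the highest valid stack address */
	uintptr_t text_lo;	/* start of kernel text (inclusive) */
	uintptr_t text_hi;	/* end of kernel text (exclusive) */
};

/* Simplified frame record (assumption, not the real ARM frame layout). */
struct frame_rec {
	uintptr_t prev_fp;	/* caller's frame pointer */
	uintptr_t saved_pc;	/* saved program counter */
};

/* True if [addr, addr + len) lies inside [lo, hi). */
static bool in_range(uintptr_t addr, size_t len, uintptr_t lo, uintptr_t hi)
{
	return addr >= lo && addr <= hi && len <= hi - addr;
}

/*
 * Walk at most max frames starting at fp and store the saved pcs.
 * The walk stops quietly as soon as a value looks wrong, instead of
 * dereferencing it. Returns the number of pcs stored.
 */
size_t walk_frames(uintptr_t fp, const struct walk_limits *lim,
		   uintptr_t *pcs, size_t max)
{
	size_t n = 0;

	assert(lim != NULL && pcs != NULL);

	while (n < max) {
		const struct frame_rec *rec;
		uintptr_t pc, prev;

		/* Reject misaligned or out-of-stack frame pointers. */
		if (fp % _Alignof(struct frame_rec) != 0)
			break;
		if (!in_range(fp, sizeof(*rec), lim->stack_lo, lim->stack_hi))
			break;

		/* Read each field once; the owner may be rewriting it. */
		rec = (const struct frame_rec *)fp;
		pc = rec->saved_pc;
		prev = rec->prev_fp;

		/* A saved pc outside kernel text must never be dereferenced. */
		if (pc < lim->text_lo || pc >= lim->text_hi)
			break;
		pcs[n++] = pc;

		/* The stack grows down, so callers live at higher addresses. */
		if (prev <= fp)
			break;
		fp = prev;
	}
	return n;
}
```

Such checks do not remove the race, and the frames printed for a running task may still be meaningless. They only turn a wild dereference such as the one at 0x32848c02 into an early end of the dump.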
An __ex_table entry should be added in for_each_frame for this scenario. But with CONFIG_CPU_SW_DOMAIN_PAN=y, a page domain fault may be raised and go to do_bad() instead of do_page_fault().
The patch may not be an optimal solution; I just want to point out the problem. Is there any concern if we check __ex_table in do_bad()?
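To make the question concrete, here is a small user-space sketch in C of what "check __ex_table in a bad-fault handler" means. It is not the actual patch and not kernel code: ex_entry, ex_search, ex_fixup, fake_regs and bad_fault_handler are hypothetical names, and the table is an ordinary array sorted by instruction address. The idea is that a kernel-mode fault whose pc has a fixup entry is redirected to the fixup address instead of being reported as unhandled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry: address of an instruction that may fault, and where to resume. */
struct ex_entry {
	uintptr_t insn;
	uintptr_t fixup;
};

/* Minimal stand-in for the saved register state (assumption). */
struct fake_regs {
	uintptr_t pc;
};

/* Binary search over a table sorted by ascending insn address. */
const struct ex_entry *ex_search(const struct ex_entry *tbl, size_t n,
				 uintptr_t pc)
{
	size_t lo = 0, hi = n;

	assert(tbl != NULL || n == 0);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].insn == pc)
			return &tbl[mid];
		if (tbl[mid].insn < pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;
}

/* Redirect pc to the fixup if one exists. Returns 1 if fixed up, else 0. */
int ex_fixup(const struct ex_entry *tbl, size_t n, struct fake_regs *regs)
{
	const struct ex_entry *e = ex_search(tbl, n, regs->pc);

	if (e == NULL)
		return 0;
	regs->pc = e->fixup;
	return 1;
}

/*
 * Sketch of a "bad fault" handler that consults the table first.
 * Returns 0 if the fault was handled, 1 if the caller should report it.
 */
int bad_fault_handler(const struct ex_entry *tbl, size_t n,
		      struct fake_regs *regs, int user_mode)
{
	/* Only kernel-mode faults are candidates for a fixup. */
	if (!user_mode && ex_fixup(tbl, n, regs))
		return 0;
	return 1;
}
```

With a table such as { {0x100, 0x900}, {0x118, 0x980}, {0x200, 0xa00} } (made-up addresses), a kernel-mode fault at pc 0x118 resumes at 0x980, while a fault at 0x11c is still reported as unhandled.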
Our project has now enabled CONFIG_ARM_UNWIND=y. With it, the unwinder fails to find an unwind_idx when it gets a wrong sv_pc, so the unwind aborts without crashing the kernel.
-----Original Message-----
From: 黄吕强 (Lvqiang Huang)
Sent: Friday, November 08, 2019 1:23 AM
To: Russell King - ARM Linux admin
Cc: ebiederm@...ssion.com; dave.hansen@...ux.intel.com; anshuman.khandual@....com; akpm@...ux-foundation.org; f.fainelli@...il.com; will@...nel.org; tglx@...utronix.de; linux-arm-kernel@...ts.infradead.org; linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ARM: check __ex_table in do_bad()
> On Nov 7, 2019, at 17:24, Russell King - ARM Linux admin
> <linux@...linux.org.uk> wrote:
>
>> On Thu, Nov 07, 2019 at 03:45:13PM +0800, Lvqiang wrote:
>>
>> We got many crashs in for_each_frame+0x18 arch/arm/lib/backtrace.S
>> 1003: ldr r2, [sv_pc, #-4]
>>
>> The backtrace is
>> dump_backtrace
>> show_stack
>> sched_show_task
>> show_state_filter
>> sysrq_handle_showstate_blocked
>> __handle_sysrq
>> write_sysrq_trigger
>> proc_reg_write
>> __vfs_write
>> vfs_write
>> sys_write
>>
>> Related Kernel config
>> CONFIG_CPU_SW_DOMAIN_PAN=y
>> # CONFIG_ARM_UNWIND is not set
>> CONFIG_FRAME_POINTER=y
>>
>> The task A was dumping the stack of an UN task B. However, the task B
>
> What is "an UN task B"?
UN means TASK_UNINTERRUPTIBLE.
(Sorry for the typo in the last reply)
>> scheduled to run on another CPU, which cause it stack content changed.
>> Then, task A may hit a page domain fault and die().
>> [520.661314] Unhandled fault: page domain fault (0x01b) at
>> 0x32848c02
>
> So, the backtrace code is trying to access userspace. It isn't
> supposed to be accessing userspace - there are no guarantees that
> userspace will be using frame pointers. That is the bug.
>
There is a race condition when trying to get the backtrace of another task, whose frames may change completely while it executes.
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down
> 622kbps up According to speedtest.net: 11.9Mbps down 500kbps up