linux-kernel - Re: [RFC PATCHv2 1/2] coresight: tmc-etf: Fix NULL pointer dereference in tmc_enable_etf_sink

Message-ID: <c0e1f99a0a2480dfc8d788bb424d3f08@codeaurora.org>
Date:   Wed, 30 Sep 2020 15:54:48 +0530
From:   Sai Prakash Ranjan <saiprakash.ranjan@...eaurora.org>
To:     Mathieu Poirier <mathieu.poirier@...aro.org>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Mike Leach <mike.leach@...aro.org>, peterz@...radead.org
Cc:     alexander.shishkin@...ux.intel.com, linux-arm-msm@...r.kernel.org,
        coresight@...ts.linaro.org, linux-kernel@...r.kernel.org,
        Stephen Boyd <swboyd@...omium.org>, leo.yan@...aro.org,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [RFC PATCHv2 1/2] coresight: tmc-etf: Fix NULL pointer
 dereference in tmc_enable_etf_sink_perf()

On 2020-09-28 17:07, Sai Prakash Ranjan wrote:
> There was a report of NULL pointer dereference in ETF enable
> path for perf CS mode with PID. It is almost 100% reproducible
> when the process to monitor is something very active such as
> chrome and only with ETF as the sink. Currently in a bid to
> find the pid, the owner is dereferenced via task_pid_nr() call
> in tmc_enable_etf_sink_perf(). With owner being NULL, we get a
> NULL pointer dereference, so check the owner before dereferencing
> it to prevent the system crash.
> 
>  perf record -e cs_etm/@..._etf0/ -N -p <pid>
> 
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000548
> Mem abort info:
>   ESR = 0x96000006
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
> Data abort info:
>   ISV = 0, ISS = 0x00000006
>   CM = 0, WnR = 0
> 
> Call trace:
>  tmc_enable_etf_sink+0xe4/0x280
>  coresight_enable_path+0x168/0x1fc
>  etm_event_start+0x8c/0xf8
>  etm_event_add+0x38/0x54
>  event_sched_in+0x194/0x2ac
>  group_sched_in+0x54/0x12c
>  flexible_sched_in+0xd8/0x120
>  visit_groups_merge+0x100/0x16c
>  ctx_flexible_sched_in+0x50/0x74
>  ctx_sched_in+0xa4/0xa8
>  perf_event_sched_in+0x60/0x6c
>  perf_event_context_sched_in+0x98/0xe0
>  __perf_event_task_sched_in+0x5c/0xd8
>  finish_task_switch+0x184/0x1cc
>  schedule_tail+0x20/0xec
>  ret_from_fork+0x4/0x18
> 

+Peter,

I could reproduce this 100% of the time (without my band-aid patch), even
on the latest coresight-next tip, which is based on 5.9-rc5, with Debian
installed on an SDM845-based board.

Hi Peter, sorry to bother you. We observe that the NULL pointer is
propagated from the events core code (see the call trace below). Is it
even valid for the owner (task) to be NULL?
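
For illustration only, the band-aid amounts to checking the owner before
asking for its pid. Below is a minimal userspace model of that guard, not
the actual patch: fake_task, fake_event, fake_task_pid_nr() and
owner_pid_checked() are hypothetical stand-ins for the kernel types and
helpers, and the -1 return value is an assumption.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct fake_task {
	int pid;
};

struct fake_event {
	struct fake_task *owner;	/* may be NULL, as seen in the oops */
};

/* Models a helper that reads ->pid directly, so a NULL task would fault. */
static int fake_task_pid_nr(const struct fake_task *tsk)
{
	return tsk->pid;
}

/*
 * Guarded lookup: returns 0 and stores the pid in *pid, or returns -1
 * without touching *pid when the event has no owner. The -1 is an
 * assumption; a real driver would choose its own error code.
 */
static int owner_pid_checked(const struct fake_event *event, int *pid)
{
	if (!event || !event->owner)
		return -1;

	*pid = fake_task_pid_nr(event->owner);
	return 0;
}
```

With this shape, an event whose owner is NULL is rejected up front instead
of being dereferenced. Whether a NULL owner should reach the driver at all
is the open question above.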

Reproduction is as simple as below:

perf record -e cs_etm/@..._etf0/ -N -p 1

[   16.411231] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000468
[   16.420080] Mem abort info:
[   16.422903]   ESR = 0x96000004
[   16.425988]   EC = 0x25: DABT (current EL), IL = 32 bits
[   16.431345]   SET = 0, FnV = 0
[   16.434429]   EA = 0, S1PTW = 0
[   16.437602] Data abort info:
[   16.440506]   ISV = 0, ISS = 0x00000004
[   16.444377]   CM = 0, WnR = 0
[   16.447372] user pgtable: 4k pages, 48-bit VAs, pgdp=00000001f078c000
[   16.453858] [0000000000000468] pgd=0000000000000000, p4d=0000000000000000
[   16.460704] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   16.466323] Modules linked in:
[   16.469409] CPU: 5 PID: 2795 Comm: systemd Not tainted 5.9.0-rc5-g1aeb4770c2f1-dirty #6
[   16.484046] pstate: 80400085 (Nzcv daIf +PAN -UAO BTYPE=--)
[   16.489668] pc : tmc_enable_etf_sink+0x74/0x2e8
[   16.494237] lr : tmc_enable_etf_sink+0x50/0x2e8
[   16.498807] sp : ffff800010c73b20
[   16.502149] x29: ffff800010c73b20 x28: ffff0001712b0008
[   16.507510] x27: ffff00017c76b308 x26: ffffa1e8a227dc80
[   16.512860] x25: 0000000000000002 x24: ffff00017c766768
[   16.518217] x23: 0000000000000080 x22: ffff000171c192e0
[   16.523575] x21: ffff000173868000 x20: ffff000171c19280
[   16.528934] x19: 0000000000000002 x18: ffffffffffffffff
[   16.534293] x17: 0000000000000000 x16: 0000000000000000
[   16.539652] x15: ffffa1e8a1ec9948 x14: ffff800090c738a7
[   16.545011] x13: ffff800010c738b5 x12: 0000000000000028
[   16.550369] x11: ffffa1e8a1eea000 x10: 0000000000000000
[   16.555728] x9 : 0000000000000000 x8 : 00000aeb00000aeb
[   16.561088] x7 : 003000000000000c x6 : 0000000000000001
[   16.566447] x5 : 0000000000000002 x4 : 0000000000000001
[   16.571805] x3 : 0000000000000000 x2 : 0000000000000001
[   16.577163] x1 : 0000000000000000 x0 : 00000000ffffffff
[   16.582523] Call trace:
[   16.584998]  tmc_enable_etf_sink+0x74/0x2e8
[   16.589219]  coresight_enable_path+0xd8/0x208
[   16.593608]  etm_event_start+0xe8/0x128
[   16.597481]  etm_event_add+0x44/0x60
[   16.601094]  event_sched_in.isra.139+0xd0/0x218
[   16.605664]  merge_sched_in+0x148/0x370
[   16.609536]  visit_groups_merge.constprop.147+0x124/0x490
[   16.614973]  ctx_sched_in+0xc4/0x168
[   16.618575]  perf_event_sched_in+0x6c/0xa8
[   16.622706]  __perf_event_task_sched_in+0x1a0/0x1b0
[   16.627623]  finish_task_switch+0x19c/0x248
[   16.631843]  schedule_tail+0x20/0x120
[   16.635535]  ret_from_fork+0x4/0x1c
[   16.639060] Code: 54000f20 f9400301 b9406680 f9414821 (b9446839)
[   16.645215] ---[ end trace bf238834e81d5892 ]---
[   16.649877] Kernel panic - not syncing: Fatal exception
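
As a cross-check, the instruction words in the "Code:" line can be decoded
by hand. The sketch below assumes the standard A64 encoding of LDR
(immediate, unsigned offset); struct ldr_uimm and decode_ldr_uimm() are
names made up for this example. The faulting word b9446839 decodes to a
32-bit load into w25 from [x1, #0x468], and the word before it, f9414821,
to a 64-bit load into x1 from [x1, #0x290]. With x1 = 0 in the register
dump, that matches the fault address 0x468, and it would be consistent
with reading a field at offset 0x468 through a NULL owner pointer.

```c
#include <stdbool.h>
#include <stdint.h>

struct ldr_uimm {
	unsigned int rt;	/* destination register number */
	unsigned int rn;	/* base register number */
	unsigned int offset;	/* byte offset, already scaled */
	bool is_64bit;		/* true for an Xt load, false for Wt */
};

/*
 * Decode A64 "LDR (immediate, unsigned offset)" for 32- and 64-bit
 * general-purpose registers. Returns false for any other instruction.
 */
static bool decode_ldr_uimm(uint32_t insn, struct ldr_uimm *out)
{
	uint32_t size = insn >> 30;	/* 2 = 32-bit, 3 = 64-bit */

	/* Bits [29:22] must be 0b11100101 for this instruction form. */
	if (((insn >> 22) & 0xff) != 0xe5 || size < 2)
		return false;

	out->rt = insn & 0x1f;
	out->rn = (insn >> 5) & 0x1f;
	/* imm12 is scaled by the access size in bytes (4 or 8). */
	out->offset = ((insn >> 10) & 0xfff) << size;
	out->is_64bit = (size == 3);
	return true;
}
```

Calling decode_ldr_uimm(0xb9446839, &d) fills d with rt = 25, rn = 1 and
offset = 0x468; the branch word 54000f20 at the start of the line is not a
load and is rejected.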


Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation