Message-ID: <12a344e25b31ec00fe8b57814d43fcb166e71be5.camel@mediatek.com>
Date: Fri, 27 Sep 2024 07:00:13 +0000
From: Chris Lu (陸稚泓) <Chris.Lu@...iatek.com>
To: "luiz.dentz@...il.com" <luiz.dentz@...il.com>,
	"regressions@...ts.linux.dev" <regressions@...ts.linux.dev>
CC: "marcel@...tmann.org" <marcel@...tmann.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-bluetooth@...r.kernel.org" <linux-bluetooth@...r.kernel.org>,
	"casteyde.christian@...e.fr" <casteyde.christian@...e.fr>,
	Hao Qin (秦浩) <Hao.Qin@...iatek.com>,
	Aaron Hou (侯俊仰) <Aaron.Hou@...iatek.com>
Subject: Re: [regression] NULL dereference pointer in Bluetooth at boot

Hi Luiz,

On Thu, 2024-09-26 at 10:53 -0400, Luiz Augusto von Dentz wrote:
>  Hi,
> 
> On Thu, Sep 26, 2024 at 5:16 AM Linux regression tracking (Thorsten
> Leemhuis) <regressions@...mhuis.info> wrote:
> >
> > Hi, Thorsten here, the Linux kernel's regression tracker.
> >
> > I noticed a report about a regression in bugzilla.kernel.org apparently
> > related to the bluetooth code. As many (most?) kernel developers don't
> > keep an eye on the bug tracker, I decided to write this mail. To quote
> > from https://bugzilla.kernel.org/show_bug.cgi?id=219294 :
> >
> > > Since Kernel 6.11 compiled from vanilla source, I occasionally get an Oops at boot on my Lenovo Slim 5.
> > > This is a regression.
> > >
> > > Kernel 6.11 / Slackware 64 (Slackware 15 + recent Mesa).
> > > AMD 7840HS, 16 GB
> > > When the problem occurs, the boot doesn't finish, but I got the following in syslog:
> > > Sep 19 19:57:15 latile dnsmasq[924]: no servers found in /etc/dnsmasq.d/dnsmasq-resolv.conf, will retry
> > > Sep 20 22:22:29 latile kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.GP18.SATA], AE_NOT_FOUND (20240322/dswload2-162)
> > > Sep 20 22:22:29 latile kernel: ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20240322/psobject-220)
> > > Sep 20 22:22:29 latile kernel: ACPI BIOS Error (bug): Failure creating named object [\_SB.PCI0.GPP6.WLAN._S0W], AE_ALREADY_EXISTS (20240322/dswload2-326)
> > > Sep 20 22:22:29 latile kernel: ACPI Error: AE_ALREADY_EXISTS, During name lookup/catalog (20240322/psobject-220)
> > > Sep 20 22:22:31 latile kernel: i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp
> > > Sep 20 22:22:34 latile kernel: Bluetooth: hci0: HCI Enhanced Setup Synchronous Connection command is advertised, but not supported.
> > > Sep 20 22:22:37 latile kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
> > > Sep 20 22:22:37 latile kernel: #PF: supervisor read access in kernel mode
> > > Sep 20 22:22:37 latile kernel: #PF: error_code(0x0000) - not-present page
> > > Sep 20 22:22:37 latile kernel: Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > Sep 20 22:22:37 latile kernel: CPU: 2 UID: 0 PID: 153 Comm: kworker/2:1 Not tainted 6.11.0 #1
> > > Sep 20 22:22:37 latile kernel: Hardware name: LENOVO 82Y9/LNVNB161216, BIOS M3CN42WW 01/11/2024
> > > Sep 20 22:22:37 latile kernel: Workqueue: pm pm_runtime_work
> > > Sep 20 22:22:37 latile kernel: RIP: 0010:btusb_suspend+0x14/0x1b0
> > > Sep 20 22:22:37 latile kernel: Code: e4 10 00 83 80 d4 0a 00 00 01 eb db 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 41 54 55 53 48 8b 9f c8 00 00 00 <48> 8b 13 8b 82 bc 09 00 00 03 82 b8 09 00 00 03 82 c4 09 00 00 03
> > > Sep 20 22:22:37 latile kernel: RSP: 0018:ffffbf1280b67ca0 EFLAGS: 00010206
> > > Sep 20 22:22:37 latile kernel: RAX: ffffffffa62de3b0 RBX: 0000000000000000 RCX: 0000000000000002
> > > Sep 20 22:22:37 latile kernel: RDX: 0000000000000003 RSI: 0000000000000402 RDI: ffff9bcc85e17000
> > > Sep 20 22:22:37 latile kernel: RBP: ffff9bcc85e17000 R08: ffff9bcc8930e800 R09: ffff9bcc85e174b0
> > > Sep 20 22:22:37 latile kernel: R10: 0000000000000003 R11: 0000000000000063 R12: 0000000000000402
> > > Sep 20 22:22:37 latile kernel: R13: 0000000000000003 R14: 0000000000000000 R15: ffff9bcc8930e800
> > > Sep 20 22:22:37 latile kernel: FS:  0000000000000000(0000) GS:ffff9bcfae480000(0000) knlGS:0000000000000000
> > > Sep 20 22:22:37 latile kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > Sep 20 22:22:37 latile kernel: CR2: 0000000000000000 CR3: 000000035f82a000 CR4: 0000000000750ef0
> > > Sep 20 22:22:37 latile kernel: PKRU: 55555554
> > > Sep 20 22:22:37 latile kernel: Call Trace:
> > > Sep 20 22:22:37 latile kernel:  <TASK>
> > > Sep 20 22:22:37 latile kernel:  ? __die+0x23/0x70
> > > Sep 20 22:22:37 latile kernel:  ? page_fault_oops+0x159/0x520
> > > Sep 20 22:22:37 latile kernel:  ? exc_page_fault+0x404/0x740
> > > Sep 20 22:22:37 latile kernel:  ? asm_exc_page_fault+0x26/0x30
> > > Sep 20 22:22:37 latile kernel:  ? btusb_isoc_tx_complete+0x60/0x60
> > > Sep 20 22:22:37 latile kernel:  ? btusb_suspend+0x14/0x1b0
> > > Sep 20 22:22:37 latile kernel:  usb_suspend_both+0x94/0x280
> > > Sep 20 22:22:37 latile kernel:  usb_runtime_suspend+0x2e/0x70
> > > Sep 20 22:22:37 latile kernel:  ? usb_autoresume_device+0x50/0x50
> > > Sep 20 22:22:37 latile kernel:  __rpm_callback+0x41/0x170
> > > Sep 20 22:22:37 latile kernel:  ? usb_autoresume_device+0x50/0x50
> > > Sep 20 22:22:37 latile kernel:  rpm_callback+0x55/0x60
> > > Sep 20 22:22:37 latile kernel:  ? usb_autoresume_device+0x50/0x50
> > > Sep 20 22:22:37 latile kernel:  rpm_suspend+0xe8/0x5e0
> > > Sep 20 22:22:37 latile kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
> > > Sep 20 22:22:37 latile last message buffered 1 times
> > > Sep 20 22:22:37 latile kernel:  ? finish_task_switch.isra.0+0x96/0x2a0
> > > Sep 20 22:22:37 latile kernel:  __pm_runtime_suspend+0x3c/0xd0
> > > Sep 20 22:22:37 latile kernel:  ? usb_runtime_resume+0x20/0x20
> > > Sep 20 22:22:37 latile kernel:  usb_runtime_idle+0x35/0x40
> > > Sep 20 22:22:37 latile kernel:  rpm_idle+0xbd/0x270
> > > Sep 20 22:22:37 latile kernel:  pm_runtime_work+0x84/0xb0
> > > Sep 20 22:22:37 latile kernel:  process_one_work+0x16d/0x380
> > > Sep 20 22:22:37 latile kernel:  worker_thread+0x2cb/0x3e0
> > > Sep 20 22:22:37 latile kernel:  ? _raw_spin_lock_irqsave+0x1b/0x50
> > > Sep 20 22:22:37 latile kernel:  ? cancel_delayed_work_sync+0x80/0x80
> > > Sep 20 22:22:37 latile kernel:  kthread+0xde/0x110
> > > Sep 20 22:22:37 latile kernel:  ? kthread_park+0x90/0x90
> > > Sep 20 22:22:37 latile kernel:  ret_from_fork+0x31/0x50
> > > Sep 20 22:22:37 latile kernel:  ? kthread_park+0x90/0x90
> > > Sep 20 22:22:37 latile kernel:  ret_from_fork_asm+0x11/0x20
> > > Sep 20 22:22:37 latile kernel:  </TASK>
> > > Sep 20 22:22:37 latile kernel: Modules linked in:
> > > Sep 20 22:22:37 latile kernel: CR2: 0000000000000000
> > > Sep 20 22:22:37 latile kernel: ---[ end trace 0000000000000000 ]---
> > > Sep 20 22:22:37 latile kernel: RIP: 0010:btusb_suspend+0x14/0x1b0
> > > Sep 20 22:22:37 latile kernel: Code: e4 10 00 83 80 d4 0a 00 00 01 eb db 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 41 54 55 53 48 8b 9f c8 00 00 00 <48> 8b 13 8b 82 bc 09 00 00 03 82 b8 09 00 00 03 82 c4 09 00 00 03
> > > Sep 20 22:22:37 latile kernel: RSP: 0018:ffffbf1280b67ca0 EFLAGS: 00010206
> > > Sep 20 22:22:37 latile kernel: RAX: ffffffffa62de3b0 RBX: 0000000000000000 RCX: 0000000000000002
> > > Sep 20 22:22:37 latile kernel: RDX: 0000000000000003 RSI: 0000000000000402 RDI: ffff9bcc85e17000
> > > Sep 20 22:22:37 latile kernel: RBP: ffff9bcc85e17000 R08: ffff9bcc8930e800 R09: ffff9bcc85e174b0
> > > Sep 20 22:22:37 latile kernel: R10: 0000000000000003 R11: 0000000000000063 R12: 0000000000000402
> > > Sep 20 22:22:37 latile kernel: R13: 0000000000000003 R14: 0000000000000000 R15: ffff9bcc8930e800
> > > Sep 20 22:22:37 latile kernel: FS:  0000000000000000(0000) GS:ffff9bcfae480000(0000) knlGS:0000000000000000
> > > Sep 20 22:22:37 latile kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > Sep 20 22:22:37 latile kernel: CR2: 0000000000000000 CR3: 000000035f82a000 CR4: 0000000000750ef0
> > > Sep 20 22:22:37 latile kernel: PKRU: 55555554
> > > [...]
> 
> I suspect this has been fixed recently:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git/commit/?id=6f3f7e9414834fc4210a2d11ff6172031e98d9ff
> 
> @Chris Lu perhaps you want to check, since it seems to be hitting
> something else related to the btmtk code:
> 
> Sep 17 21:53:23 latile kernel: Bluetooth: hci0: Execution of wmt command timed out
> Sep 17 21:53:23 latile kernel: Bluetooth: hci0: Failed to send wmt func ctrl (-110)
> 

From the log, I think this issue is unlikely to be related to the changes
I submitted a few days ago. Instead, I believe it is caused by the
following patch:
https://lore.kernel.org/all/3c3dfe8efc70af04794035537c7c40a52f2266d5.1715109394.git.sean.wang@kernel.org/
MediaTek also ran into the problem after applying it. The patch
occasionally causes setup to fail on specific platforms with MT7921.
(It was accepted between v6.10 and v6.11.)
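
As a reading aid for the quoted trace, the sketch below pulls the faulting
symbol and any all-zero general-purpose registers out of an oops dump. It
is a minimal illustration, not an existing kernel tool: summarize_oops and
SAMPLE are hypothetical names, the regexes assume the x86-64 register
layout shown in the quote, and the syslog prefixes are assumed to have
been stripped already.

```python
import re

# "RIP: 0010:btusb_suspend+0x14/0x1b0" -> symbol, offset, function size
RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")
# "RBX: 0000000000000000" -> register name and 64-bit value.
# RIP/RSP lines do not match because their value starts with "0010:".
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def summarize_oops(text):
    """Return the faulting symbol and the registers that hold zero."""
    rip = RIP_RE.search(text)
    regs = {name: int(value, 16) for name, value in REG_RE.findall(text)}
    return {
        "function": rip.group(1) if rip else None,
        "offset": int(rip.group(2), 16) if rip else None,
        "size": int(rip.group(3), 16) if rip else None,
        "zero_regs": sorted(n for n, v in regs.items() if v == 0),
    }


# Register dump copied from the quoted report, syslog prefixes removed.
SAMPLE = """\
RIP: 0010:btusb_suspend+0x14/0x1b0
RSP: 0018:ffffbf1280b67ca0 EFLAGS: 00010206
RAX: ffffffffa62de3b0 RBX: 0000000000000000 RCX: 0000000000000002
RDX: 0000000000000003 RSI: 0000000000000402 RDI: ffff9bcc85e17000
RBP: ffff9bcc85e17000 R08: ffff9bcc8930e800 R09: ffff9bcc85e174b0
R10: 0000000000000003 R11: 0000000000000063 R12: 0000000000000402
R13: 0000000000000003 R14: 0000000000000000 R15: ffff9bcc8930e800
"""

if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

On the quoted dump this reports btusb_suspend+0x14 with RBX and R14 zero.
The Code: bytes just before the marked instruction, 48 8b 9f c8 00 00 00,
decode to mov 0xc8(%rdi),%rbx, and the faulting <48> 8b 13 decodes to
mov (%rbx),%rdx, so the NULL value was loaded from the structure passed as
the first argument. Which btusb pointer that corresponds to is an
inference, not something the log itself states.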

In August, our colleague Hao submitted another change to fix it, which
hasn't been accepted yet. Hao's change is necessary, and he has verified
that it works.

https://lore.kernel.org/all/20240822052310.25220-1-hao.qin@mediatek.com/
Could you help review it? Please tell us if MediaTek needs to resend
or rebase this change.

Thanks a lot!
Chris Lu

> 
> > See the ticket for more details and another oops. Reporter is CCed.
> >
> > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> > --
> > Everything you wanna know about Linux kernel regression tracking:
> > https://linux-regtracking.leemhuis.info/about/#tldr
> > If I did something stupid, please tell me, as explained on that page.
> >
> > P.S.: let me use this mail to also add the report to the list of tracked
> > regressions to ensure it doesn't fall through the cracks:
> >
> > #regzbot introduced: v6.10..v6.11
> > #regzbot from: Christian Casteyde <casteyde.christian@...e.fr>
> > #regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=219294
> > #regzbot title: net: bluetooth: NULL dereference pointer in Bluetooth at boot
> > #regzbot ignore-activity
> 
> 
> 
