Message-Id: <6afb936f-17c7-43fa-90e0-b9e780866097@app.fastmail.com>
Date: Wed, 13 Nov 2024 19:17:45 +0100
From: "Arnd Bergmann" <arnd@...db.de>
To: "John Stultz" <jstultz@...gle.com>, LKML <linux-kernel@...r.kernel.org>
Cc: "Peter Zijlstra" <peterz@...radead.org>,
 "Joel Fernandes" <joelaf@...gle.com>,
 "Qais Yousef" <qyousef@...alina.io>, "Ingo Molnar" <mingo@...hat.com>,
 "Juri Lelli" <juri.lelli@...hat.com>,
 "Vincent Guittot" <vincent.guittot@...aro.org>,
 "Dietmar Eggemann" <dietmar.eggemann@....com>,
 "Valentin Schneider" <vschneid@...hat.com>,
 "Steven Rostedt" <rostedt@...dmis.org>,
 "Benjamin Segall" <bsegall@...gle.com>,
 "Zimuzo Ezeozue" <zezeozue@...gle.com>, "Mel Gorman" <mgorman@...e.de>,
 "Will Deacon" <will@...nel.org>, "Waiman Long" <longman@...hat.com>,
 "Boqun Feng" <boqun.feng@...il.com>,
 "Paul E. McKenney" <paulmck@...nel.org>,
 "Metin Kaya" <Metin.Kaya@....com>, "Xuewen Yan" <xuewen.yan94@...il.com>,
 "K Prateek Nayak" <kprateek.nayak@....com>,
 "Thomas Gleixner" <tglx@...utronix.de>,
 "Daniel Lezcano" <daniel.lezcano@...aro.org>, kernel-team@...roid.com,
 "Davidlohr Bueso" <dave@...olabs.net>, regressions@...ts.linux.dev,
 "Thorsten Leemhuis" <linux@...mhuis.info>,
 "Anders Roxell" <anders.roxell@...aro.org>
Subject: Re: [PATCH v13 1/7] locking/mutex: Remove wakeups from under mutex::wait_lock

On Sat, Oct 12, 2024, at 01:25, John Stultz wrote:
> From: Peter Zijlstra <peterz@...radead.org>
>
> In preparation to nest mutex::wait_lock under rq::lock we need
> to remove wakeups from under it.
>
> Do this by utilizing wake_qs to defer the wakeup until after the
> lock is dropped.

To follow up from IRC, this patch is the one that caused a
boot time regression in linux-next in the regulator framework.

Anders Roxell found this during testing on the Rock Pi 4 board
(Rockchip RK3399 based).

The boot log with the NULL pointer dereference is at
https://lkft.validation.linaro.org/scheduler/job/7979980#L741

The interesting bit is this:

[ 0.957586] rk_gmac-dwmac fe300000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[ 0.969402] hub 6-0:1.0: USB hub found
[ 0.969450] hub 6-0:1.0: 1 port detected
[ 0.988163] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 0.988172] Mem abort info:
[ 0.988174]   ESR = 0x0000000096000004
[ 0.988176]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 0.988180]   SET = 0, FnV = 0
[ 0.988183]   EA = 0, S1PTW = 0
[ 0.988185]   FSC = 0x04: level 0 translation fault
[ 0.988187] Data abort info:
[ 0.988189]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 0.988191]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 0.988194]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 0.988197] [0000000000000000] user address but active_mm is swapper
[ 0.988201] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 0.988205] Modules linked in:
[ 0.988217] Hardware name: Radxa ROCK Pi 4B (DT)
[ 0.988225] pc : wake_up_q (kernel/sched/core.c:1059)
[ 0.988238] lr : wake_up_q (kernel/sched/core.c:1054)
[ 0.988243] sp : ffff800083433a00
[ 0.988245] x29: ffff800083433a00 x28: 0000000000000000 x27: ffff0000053b6080
[ 0.988253] x26: ffff800083433b90 x25: ffff0000053b6000 x24: ffff800080098000
[ 0.988259] x23: 00000000ffffffff x22: 0000000000000001 x21: 0000000000000000
[ 0.988265] x20: fffffffffffff850 x19: 0000000000000000 x18: 0000000000000001
[ 0.988272] x17: ffff800075678000 x16: ffff800082728000 x15: 019ee6ab98006e30
[ 0.988278] x14: 000002ce459acd0c x13: 000b52b4cf08772c x12: 000000000000000f
[ 0.988284] x11: 0000000000000000 x10: 0000000000000a50 x9 : ffff800083433870
[ 0.988291] x8 : ffff0000050fceb0 x7 : ffff0000f76ab9c0 x6 : 00000000000b52b4
[ 0.988297] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 0.988303] x2 : 0000000000002710 x1 : 0000000000000001 x0 : 0000000000002710
[ 0.988310] Call trace:
[ 0.988313] wake_up_q+0x50/0xf0 P)
[ 0.988319] wake_up_q+0xa0/0xf0 L)
[ 0.988325] __ww_rt_mutex_lock.isra.0 (arch/arm64/include/asm/preempt.h:62 (discriminator 2) kernel/locking/rtmutex.c:1794 kernel/locking/ww_rt_mutex.c:71)
[ 0.988333] ww_mutex_lock (kernel/locking/ww_rt_mutex.c:82)
[ 0.988338] regulator_lock_recursive (drivers/regulator/core.c:161 drivers/regulator/core.c:333)
[ 0.988347] regulator_lock_recursive (drivers/regulator/core.c:348)
[ 0.988354] regulator_lock_dependent (drivers/regulator/core.c:409)
[ 0.988360] regulator_set_voltage (drivers/regulator/core.c:4173)
[ 0.988366] _opp_config_regulator_single (include/linux/regulator/consumer.h:707 (discriminator 1) drivers/opp/core.c:933 drivers/opp/core.c:1019)
[ 0.988375] _set_opp (drivers/opp/core.c:1253)
[ 0.988379] dev_pm_opp_set_rate (drivers/opp/core.c:1357)
[ 0.988384] set_target (drivers/cpufreq/cpufreq-dt.c:63)
[ 0.988392] __cpufreq_driver_target (drivers/cpufreq/cpufreq.c:2292 drivers/cpufreq/cpufreq.c:2355)
[ 0.988398] sugov_work (kernel/sched/cpufreq_schedutil.c:537)
[ 0.988406] kthread_worker_fn (arch/arm64/include/asm/jump_label.h:32 include/linux/freezer.h:36 include/linux/freezer.h:54 kernel/kthread.c:861)
[ 0.988414] kthread (kernel/kthread.c:389)
[ 0.988420] ret_from_fork (arch/arm64/kernel/entry.S:863)
[ 0.988430] Code: f100067f 54000320 d11ec274 aa1303f5 (f9400273)


> @@ -1776,8 +1785,11 @@ static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
>  	 * irqsave/restore variants.
>  	 */
>  	raw_spin_lock_irqsave(&lock->wait_lock, flags);
> -	ret = __rt_mutex_slowlock_locked(lock, ww_ctx, state);
> +	ret = __rt_mutex_slowlock_locked(lock, ww_ctx, state, &wake_q);
> +	preempt_disable();
>  	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
> +	wake_up_q(&wake_q);
> +	preempt_enable();
>  	rt_mutex_post_schedule();
> 
>  	return ret;

This is apparently where things went wrong, but it's possible that
the actual root cause is in the regulator framework instead.

The NULL pointer itself happens when chasing wake_q->first->next,
so it would seem that one of these got reinitialized at
the wrong time, perhaps with a task_struct getting freed
or getting put on more than one wake_q_head list at the
same time.
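
To illustrate how a NULL can turn up in that walk, here is a simplified
userspace model of the wake_q list. It is only a sketch, not the kernel
code: the struct layout is reduced to the list pointers, and
wake_up_q_checked() with its return values is made up for illustration.
What it relies on is that the kernel terminates the list with the
WAKE_Q_TAIL sentinel rather than NULL, and that wake_up_q() clears each
node's next pointer as it goes. A head that is walked a second time
without being reinitialized, or a node that is reset while still queued,
therefore ends up producing a NULL node.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced model of the kernel structures; only the list pointers are kept. */
struct wake_q_node { struct wake_q_node *next; };
struct wake_q_head { struct wake_q_node *first; struct wake_q_node **lastp; };

/* The list ends in a non-NULL sentinel; NULL means "not on any queue". */
#define WAKE_Q_TAIL ((struct wake_q_node *)0x01)

static void wake_q_init(struct wake_q_head *head)
{
	head->first = WAKE_Q_TAIL;
	head->lastp = &head->first;
}

/* Returns 0 if queued, -1 if the node is already on some queue. */
static int wake_q_add(struct wake_q_head *head, struct wake_q_node *node)
{
	assert(head && node);
	if (node->next != NULL)
		return -1;
	node->next = WAKE_Q_TAIL;
	*head->lastp = node;
	head->lastp = &node->next;
	return 0;
}

/*
 * Walks the list the way wake_up_q() does, but reports a NULL node
 * instead of dereferencing it (hypothetical helper for this sketch).
 * Returns the number of nodes visited, or -1 on hitting NULL.
 */
static int wake_up_q_checked(struct wake_q_head *head)
{
	struct wake_q_node *node = head->first;
	int woken = 0;

	while (node != WAKE_Q_TAIL) {
		struct wake_q_node *cur = node;

		if (cur == NULL)
			return -1;	/* the kernel would fault on node->next here */
		node = cur->next;
		cur->next = NULL;	/* node is no longer queued */
		woken++;
	}
	return woken;
}
```

In this model, queueing two nodes and walking the head once visits both.
Walking the same head again without calling wake_q_init() first visits
the first node, finds its next pointer already cleared, and reports the
NULL that the real wake_up_q() would dereference.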

       Arnd
