[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20191101232828.17023-1-ivan.khoronzhuk@linaro.org>
Date: Sat, 2 Nov 2019 01:28:28 +0200
From: Ivan Khoronzhuk <ivan.khoronzhuk@...aro.org>
To: davem@...emloft.net, vinicius.gomes@...el.com
Cc: jhs@...atatu.com, xiyou.wangcong@...il.com, jiri@...nulli.us,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
Ivan Khoronzhuk <ivan.khoronzhuk@...aro.org>
Subject: [PATCH net] taprio: fix panic while hw offload sched list swap
Don't swap oper and admin schedules too early, it's not correct and
causes crash.
Steps to reproduce:
1)
tc qdisc replace dev eth0 parent root handle 100 taprio \
num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
queues 1@0 1@1 1@2 \
base-time $SOME_BASE_TIME \
sched-entry S 01 80000 \
sched-entry S 02 15000 \
sched-entry S 04 40000 \
flags 2
2)
tc qdisc replace dev eth0 parent root handle 100 taprio \
base-time $SOME_BASE_TIME \
sched-entry S 01 90000 \
sched-entry S 02 20000 \
sched-entry S 04 40000 \
flags 2
3)
tc qdisc replace dev eth0 parent root handle 100 taprio \
base-time $SOME_BASE_TIME \
sched-entry S 01 150000 \
sched-entry S 02 200000 \
sched-entry S 04 40000 \
flags 2
Do 2 3 2 .. steps more times if not happens and observe:
[ 305.832319] Unable to handle kernel write to read-only memory at
virtual address ffff0000087ce7f0
[ 305.910887] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
[ 305.919306] Hardware name: Texas Instruments AM654 Base Board (DT)
[...]
[ 306.017119] x1 : ffff800848031d88 x0 : ffff800848031d80
[ 306.022422] Call trace:
[ 306.024866] taprio_free_sched_cb+0x4c/0x98
[ 306.029040] rcu_process_callbacks+0x25c/0x410
[ 306.033476] __do_softirq+0x10c/0x208
[ 306.037132] irq_exit+0xb8/0xc8
[ 306.040267] __handle_domain_irq+0x64/0xb8
[ 306.044352] gic_handle_irq+0x7c/0x178
[ 306.048092] el1_irq+0xb0/0x128
[ 306.051227] arch_cpu_idle+0x10/0x18
[ 306.054795] do_idle+0x120/0x138
[ 306.058015] cpu_startup_entry+0x20/0x28
[ 306.061931] rest_init+0xcc/0xd8
[ 306.065154] start_kernel+0x3bc/0x3e4
[ 306.068810] Code: f2fbd5b7 f2fbd5b6 d503201f f9400422 (f9000662)
[ 306.074900] ---[ end trace 96c8e2284a9d9d6e ]---
[ 306.079507] Kernel panic - not syncing: Fatal exception in interrupt
[ 306.085847] SMP: stopping secondary CPUs
[ 306.089765] Kernel Offset: disabled
Try to explain one of the possible crash cases:
The "real" admin list is assigned when admin_sched is set to
new_admin, it happens after "swap", that assigns to oper_sched NULL.
Thus if call qdisc show it can crash.
Farther, next second time, when sched list is updated, the admin_sched
is not NULL and becomes the oper_sched, previous oper_sched was NULL so
just skipped. But then admin_sched is assigned new_admin, but schedules
to free previous assigned admin_sched (that already became oper_sched).
Farther, next third time, when sched list is updated,
while one more swap, oper_sched is not null, but it was happy to be
freed already (while prev. admin update), so while try to free
oper_sched the kernel panic happens at taprio_free_sched_cb().
So, move the "swap emulation" where it should be according to function
comment from code.
Fixes: 9c66d15646760e ("taprio: Add support for hardware offloading")
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@...aro.org>
---
Based on net/master
net/sched/sch_taprio.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index 6719a65169d4..286ad4bb50f1 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1224,8 +1224,6 @@ static int taprio_enable_offload(struct net_device *dev,
goto done;
}
- taprio_offload_config_changed(q);
-
done:
taprio_offload_free(offload);
@@ -1505,6 +1503,9 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
call_rcu(&admin->rcu, taprio_free_sched_cb);
spin_unlock_irqrestore(&q->current_entry_lock, flags);
+
+ if (FULL_OFFLOAD_IS_ENABLED(taprio_flags))
+ taprio_offload_config_changed(q);
}
new_admin = NULL;
--
2.20.1
Powered by blists - more mailing lists