Message-ID: <CAMArcTV-Qvfd7xA0huCh_dbtr7P4LA+cQ7CpnaBBhdq-tq5fZQ@mail.gmail.com>
Date: Thu, 12 Sep 2019 12:56:19 +0900
From: Taehee Yoo <ap420073@...il.com>
To: David Miller <davem@...emloft.net>
Cc: Netdev <netdev@...r.kernel.org>, j.vosburgh@...il.com,
vfalico@...il.com, Andy Gospodarek <andy@...yhouse.net>,
Jiří Pírko <jiri@...nulli.us>,
sd@...asysnail.net, Roopa Prabhu <roopa@...ulusnetworks.com>,
saeedm@...lanox.com, manishc@...vell.com, rahulv@...vell.com,
kys@...rosoft.com, haiyangz@...rosoft.com, sthemmin@...rosoft.com,
sashal@...nel.org, hare@...e.de, varun@...lsio.com,
ubraun@...ux.ibm.com, kgraul@...ux.ibm.com,
Jay Vosburgh <jay.vosburgh@...onical.com>
Subject: Re: [PATCH net v2 01/11] net: core: limit nested device depth
On Thu, 12 Sep 2019 at 07:32, David Miller <davem@...emloft.net> wrote:
>
Hi David
Thank you for the review!
> From: Taehee Yoo <ap420073@...il.com>
> Date: Sat, 7 Sep 2019 22:45:32 +0900
>
> > Current code doesn't limit the number of nested devices.
> > Nested devices would be handled recursively and this needs huge stack
> > memory. So, unlimited nested devices could make stack overflow.
> ...
> > Splat looks like:
> > [ 140.483124] BUG: looking up invalid subclass: 8
> > [ 140.483505] turning off the locking correctness validator.
>
> The limit here is not stack memory, but a limit in the lockdep
> validator, which can probably be fixed by other means.
>
> This was the feedback I saw given for the previous version of
> this series as well.
I just realized that the commit message does not describe the problems well enough.
It reads as if the "invalid subclass" message causes the panic, but it does not.
The panic is not related to the "invalid subclass" lockdep problem.
There are two splats.
1. [ 140.483124] BUG: looking up invalid subclass: 8
2. [ 168.446539] BUG: KASAN: slab-out-of-bounds in __unwind_start+0x71/0x850
[ 168.794493] Rebooting in 5 seconds..
The first message is just a lockdep warning caused by too many nested
lockdep subclasses; it does not cause any problem or panic.
This message can be ignored for now because other patches in
this patchset avoid it by using dynamic lockdep keys.
The second message is a panic message.
It is a stack overflow, and it occurs because the devices are nested too deeply.
The goal of this patch is to fix this stack overflow problem.
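
To illustrate the idea of limiting nesting depth, here is a minimal userspace
sketch in C. It is not the code from this patch. The names toy_dev, toy_link
and toy_build_chain are hypothetical, and the limit of 8 is an assumption
chosen only for the example. The sketch refuses to stack a new device once
the chain below it has reached the limit, so a later walk over the chain
stays bounded.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical limit, chosen for illustration only. */
#define MAX_NEST_DEV 8

/* Toy stand-in for a stacked device: each one sits on at most one lower device. */
struct toy_dev {
	struct toy_dev *lower;
};

/* Count the devices below dev with a loop, so stack use stays constant. */
static int toy_lower_depth(const struct toy_dev *dev)
{
	int depth = 0;

	while (dev->lower) {
		depth++;
		dev = dev->lower;
	}
	return depth;
}

/* Stack upper on lower; return -1 if the chain would exceed MAX_NEST_DEV. */
static int toy_link(struct toy_dev *upper, struct toy_dev *lower)
{
	if (toy_lower_depth(lower) + 1 > MAX_NEST_DEV)
		return -1;
	upper->lower = lower;
	return 0;
}

/* Try to stack n devices on a base device; return how many links were accepted. */
int toy_build_chain(int n)
{
	struct toy_dev *devs;
	int accepted = 0;

	assert(n >= 0);
	devs = calloc((size_t)n + 1, sizeof(*devs));
	if (!devs)
		return -1;

	for (int i = 1; i <= n; i++) {
		if (toy_link(&devs[i], &devs[i - 1]) != 0)
			break;
		accepted++;
	}
	free(devs);
	return accepted;
}
```

With this model, toy_build_chain(200) accepts 8 links and rejects the 9th,
so a 200-deep chain like the one in the reproducer can never be built.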
I tested with the following reproducer commands, without lockdep.
    ip link add dummy0 type dummy
    ip link add link dummy0 name vlan1 type vlan id 1
    ip link set vlan1 up
    for i in {2..200}
    do
        let A=$i-1
        ip link add name vlan$i link vlan$A type vlan id $i
    done
    ip link del vlan1    # <-- this command is added.
Splat looks like:
[ 181.594092] Thread overran stack, or stack corrupted
[ 181.594726] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 181.595417] CPU: 1 PID: 1402 Comm: ip Not tainted 5.3.0-rc7+ #173
[ 181.596193] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 181.605986] RIP: 0010:stack_depot_fetch+0x10/0x30
[ 181.606588] Code: 00 75 10 48 8b 73 18 48 89 ef 5b 5d e9 59 bf 89
ff 0f 0b e8 02 3f 9d ff eb e9 89 f8 c1 ef 110
[ 181.609820] RSP: 0018:ffff8880cbebedd8 EFLAGS: 00010006
[ 181.610485] RAX: 00000000001fffff RBX: ffff8880cbebfc00 RCX: 0000000000000000
[ 181.611394] RDX: 000000000000001d RSI: ffff8880cbebede0 RDI: 0000000000003ff0
[ 181.612297] RBP: ffffea00032fae00 R08: ffffed101b5a3eab R09: ffffed101b5a3eab
[ 181.613222] R10: 0000000000000001 R11: ffffed101b5a3eaa R12: ffff8880d6115e80
[ 181.614148] R13: ffff8880cbebeac0 R14: ffff8880cbebfc00 R15: ffff8880cbebef80
[ 181.615053] FS: 00007f46140510c0(0000) GS:ffff8880dad00000(0000)
knlGS:0000000000000000
[ 181.616085] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 181.616841] CR2: ffffffffb9a7a0d8 CR3: 00000000bc356003 CR4: 00000000000606e0
[ 181.635748] Call Trace:
[ 181.635996] Modules linked in: 8021q garp stp mrp llc dummy veth
openvswitch nsh nf_conncount nf_nat nf_conntrs
[ 181.637360] CR2: ffffffffb9a7a0d8
[ 181.637670] ---[ end trace f890ce3e5c51ceb4 ]---
[ 181.638096] RIP: 0010:stack_depot_fetch+0x10/0x30
[ 181.638524] Code: 00 75 10 48 8b 73 18 48 89 ef 5b 5d e9 59 bf 89
ff 0f 0b e8 02 3f 9d ff eb e9 89 f8 c1 ef 110
[ 181.805441] RSP: 0018:ffff8880cbebedd8 EFLAGS: 00010006
[ 181.900192] RAX: 00000000001fffff RBX: ffff8880cbebfc00 RCX: 0000000000000000
[ 181.901119] RDX: 000000000000001d RSI: ffff8880cbebede0 RDI: 0000000000003ff0
[ 181.902038] RBP: ffffea00032fae00 R08: ffffed101b5a3eab R09: ffffed101b5a3eab
[ 181.902960] R10: 0000000000000001 R11: ffffed101b5a3eaa R12: ffff8880d6115e80
[ 181.903885] R13: ffff8880cbebeac0 R14: ffff8880cbebfc00 R15: ffff8880cbebef80
[ 181.904825] FS: 00007f46140510c0(0000) GS:ffff8880dad00000(0000) knlGS:0000000000000000
[ 181.905862] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 181.906604] CR2: ffffffffb9a7a0d8 CR3: 00000000bc356003 CR4: 00000000000606e0
[ 181.907525] Kernel panic - not syncing: Fatal exception
[ 181.908179] Kernel Offset: 0x34000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff)
[ 181.909176] Rebooting in 5 seconds..