[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zxg52Ccb8FF8MCt0@mini-arch>
Date: Tue, 22 Oct 2024 16:48:40 -0700
From: Stanislav Fomichev <stfomichev@...il.com>
To: Gilad Naaman <gnaaman@...venets.com>
Cc: netdev <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Kuniyuki Iwashima <kuniyu@...zon.com>
Subject: Re: [PATCH net-next v7 0/6] neighbour: Improve neigh_flush_dev
performance
On 10/22, Gilad Naaman wrote:
> This patchsets improves the performance of neigh_flush_dev.
>
> Currently, the only way to implement it requires traversing
> all neighbours known to the kernel, across all network-namespaces.
>
> This means that some flows are slowed down as a function of neighbour-scale,
> even if the specific link they're handling has little to no neighbours.
>
> In order to solve this, this patchset adds a netdev->neighbours list,
> as well as making the original linked-list doubly-, so that it is
> possible to unlink neighbours without traversing the hash-bucket to
> obtain the previous neighbour.
>
> The original use-case we encountered was mass-deletion of links (12K
> VLANs) while there are 50K ARPs and 50K NDPs in the system; though the
> slowdowns would also appear when the links are set down.
>
> Changes in v7:
>
> - Fix crash due to use of poisoned hlist_node
> - Apply samx-tree formatting
>
> Gilad Naaman (6):
> neighbour: Add hlist_node to struct neighbour
> neighbour: Define neigh_for_each_in_bucket
> neighbour: Convert seq_file functions to use hlist
> neighbour: Convert iteration to use hlist+macro
> neighbour: Remove bare neighbour::next pointer
> neighbour: Create netdev->neighbour association
>
> .../networking/net_cachelines/net_device.rst | 1 +
> include/linux/netdevice.h | 7 +
> include/net/neighbour.h | 24 +-
> include/net/neighbour_tables.h | 12 +
> net/core/neighbour.c | 337 ++++++++----------
> net/ipv4/arp.c | 2 +-
> 6 files changed, 174 insertions(+), 209 deletions(-)
> create mode 100644 include/net/neighbour_tables.h
Looks like the test is still unhappy. Can you try to run it on your side
before reposting? Or does it look good?
[ 110.442590][ C2] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x5191
[ 110.443219][ C2] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 110.443498][ C2] flags: 0x80000000000040(head|node=0|zone=1)
[ 110.443752][ C2] page_type: f5(slab)
[ 110.443897][ C2] raw: 0080000000000003 ffffea0000146401 ffffffffffffffff 0000000000000000
[ 110.444236][ C2] raw: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
[ 110.444546][ C2] head: 0080000000000040 ffff8880010433c0 ffffea0000256410 ffff8880010410e8
[ 110.444862][ C2] head: 0000000000000000 0000000000020002 00000001f5000000 0000000000000000
[ 110.445175][ C2] head: 0080000000000003 ffffea0000146401 ffffffffffffffff 0000000000000000
[ 110.445890][ C2] head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
[ 110.446197][ C2] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
[ 110.446558][ C2] ------------[ cut here ]------------
[ 110.446754][ C2] kernel BUG at include/linux/mm.h:1140!
[ 110.446972][ C2] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 110.447210][ C2] CPU: 2 UID: 0 PID: 29 Comm: ksoftirqd/2 Not tainted 6.12.0-rc3-virtme #1
[ 110.447528][ C2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
[ 110.447928][ C2] RIP: 0010:__free_pages+0x1e4/0x220
[ 110.448128][ C2] Code: 0f 94 c3 e9 ba fe ff ff 48 c7 c6 a0 18 58 92 4c 89 e7 e8 df bf f4 ff 90 0f 0b 48 c7 c6 40 29 58 92 4c 89 e7 e8 cd bf f4 ff 90 <0f> 0b 48 89 ef e8 72 03 09 00 e9 c5 fe ff ff e8 98 03 09 00 e9 35
[ 110.448803][ C2] RSP: 0018:ffffc90000217cb0 EFLAGS: 00010246
[ 110.449040][ C2] RAX: 000000000000003e RBX: 0000000000000000 RCX: 1ffffffff263b43c
[ 110.449304][ C2] RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001
[ 110.449565][ C2] RBP: ffffea0000146474 R08: 0000000000000000 R09: fffffbfff263b43c
[ 110.449840][ C2] R10: 0000000000000003 R11: 205d324320202020 R12: ffffea0000146440
[ 110.450101][ C2] R13: ffffc90000217d78 R14: 0000000000000000 R15: 0000000000000008
[ 110.450399][ C2] FS: 0000000000000000(0000) GS:ffff888036100000(0000) knlGS:0000000000000000
[ 110.450787][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 110.451008][ C2] CR2: 00007f9289e72270 CR3: 000000000a73a005 CR4: 0000000000772ef0
[ 110.451274][ C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 110.451564][ C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 110.451860][ C2] PKRU: 55555554
[ 110.452015][ C2] Call Trace:
[ 110.452167][ C2] <TASK>
[ 110.452282][ C2] ? die+0x37/0x90
[ 110.452440][ C2] ? do_trap+0x1a3/0x260
[ 110.452589][ C2] ? __free_pages+0x1e4/0x220
[ 110.452786][ C2] ? do_error_trap+0xbe/0x180
[ 110.452970][ C2] ? __free_pages+0x1e4/0x220
[ 110.453152][ C2] ? __free_pages+0x1e4/0x220
[ 110.453342][ C2] ? handle_invalid_op+0x2c/0x40
[ 110.453527][ C2] ? __free_pages+0x1e4/0x220
[ 110.453709][ C2] ? exc_invalid_op+0x30/0x50
[ 110.453934][ C2] ? asm_exc_invalid_op+0x1a/0x20
[ 110.454126][ C2] ? __free_pages+0x1e4/0x220
[ 110.454321][ C2] ? rcu_do_batch+0x34d/0xf20
[ 110.454528][ C2] neigh_hash_free_rcu+0xb7/0xe0
[ 110.454728][ C2] rcu_do_batch+0x34f/0xf20
[ 110.454913][ C2] ? __pfx___lock_release+0x10/0x10
[ 110.455108][ C2] ? __pfx_rcu_do_batch+0x10/0x10
[ 110.455350][ C2] ? lockdep_hardirqs_on_prepare+0x12b/0x410
[ 110.455604][ C2] rcu_core+0x2bd/0x4f0
[ 110.455773][ C2] handle_softirqs+0x1f6/0x5c0
[ 110.455965][ C2] ? __pfx_run_ksoftirqd+0x10/0x10
[ 110.456152][ C2] run_ksoftirqd+0x33/0x60
[ 110.456342][ C2] smpboot_thread_fn+0x306/0x850
[ 110.456533][ C2] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 110.456719][ C2] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 110.456903][ C2] kthread+0x28a/0x350
[ 110.457041][ C2] ? __pfx_kthread+0x10/0x10
[ 110.457369][ C2] ret_from_fork+0x31/0x70
[ 110.457566][ C2] ? __pfx_kthread+0x10/0x10
[ 110.457748][ C2] ret_from_fork_asm+0x1a/0x30
[ 110.457944][ C2] </TASK>
---
pw-bot: cr
Powered by blists - more mailing lists