Message-ID: <20190411111342.GA29053@splinter>
Date: Thu, 11 Apr 2019 14:13:42 +0300
From: Ido Schimmel <idosch@...sch.org>
To: Vlad Buslov <vladbu@...lanox.com>
Cc: netdev@...r.kernel.org, jhs@...atatu.com, xiyou.wangcong@...il.com,
jiri@...nulli.us, davem@...emloft.net, john.hurley@...ronome.com
Subject: Re: [PATCH net-next] net: sched: flower: insert filter to ht before
offloading it to hw
On Fri, Apr 05, 2019 at 08:56:26PM +0300, Vlad Buslov wrote:
> John reports:
>
> Recent refactoring of fl_change aims to use the classifier spinlock to
> avoid the need for rtnl lock. In doing so, the fl_hw_replace_filter()
> function was moved to before the lock is taken. This can create problems
> for drivers if duplicate filters are created (common in ovs tc offload
> due to filters being triggered by user-space matches).
>
> Drivers registered for such filters will now receive multiple copies of
> the same rule, each with a different cookie value. This means that the
> drivers would need to do a full match field lookup to determine
> duplicates, repeating work that will happen in flower __fl_lookup().
> Currently, drivers do not expect to receive duplicate filters.
>
> To fix this, verify that a filter with the same key is not present in
> the flower classifier hash table and insert the new filter into the
> hash table before offloading it to hardware. Implement the helper
> function fl_ht_insert_unique() to atomically verify/insert a filter.
>
> This change makes the filter visible to the fast path at the beginning
> of the fl_change() function, which means it can no longer be freed
> directly in case of error. Refactor the fl_change() error handling
> code to deallocate the filter after an RCU grace period.
>
> Fixes: 620da4860827 ("net: sched: flower: refactor fl_change")
> Reported-by: John Hurley <john.hurley@...ronome.com>
> Signed-off-by: Vlad Buslov <vladbu@...lanox.com>
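
[Editor's note: for readers following the patch, here is a minimal
sketch of the fl_ht_insert_unique() helper described in the quoted
message, assuming the cls_flower internals (fl_flow_mask, ht_node,
filter_ht_params) match net/sched/cls_flower.c upstream.
rhashtable_insert_fast() returns -EEXIST when an entry with the same
key is already present, so the verify and insert steps collapse into
one atomic operation:

static int fl_ht_insert_unique(struct cls_fl_filter *fnew,
			       struct cls_fl_filter *fold,
			       bool *in_ht)
{
	struct fl_flow_mask *mask = fnew->mask;
	int err;

	/* Fails with -EEXIST if a filter with the same key is already
	 * in this mask's hash table, so no separate lookup is needed.
	 */
	err = rhashtable_insert_fast(&mask->ht, &fnew->ht_node,
				     mask->filter_ht_params);
	if (err) {
		*in_ht = false;
		/* A filter with the same key is expected to exist when
		 * overwriting (fold != NULL), so -EEXIST is not an
		 * error in that case.
		 */
		return fold && err == -EEXIST ? 0 : err;
	}

	*in_ht = true;
	return 0;
}

Because the new filter is visible in the hash table before any later
step can fail, the error paths must defer freeing it past an RCU grace
period (e.g. via tcf_queue_work()) instead of calling kfree()
directly.]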
Vlad,
Our regression machines all hit a NULL pointer dereference [1], which I
bisected to this patch. I created this reproducer that you can use:
ip netns add ns1
ip -n ns1 link add dev dummy1 type dummy
tc -n ns1 qdisc add dev dummy1 clsact
tc -n ns1 filter add dev dummy1 ingress pref 1 proto ip \
flower skip_hw src_ip 192.0.2.1 action drop
ip netns del ns1
Can you please look into this? Thanks
[1]
[ 5.332176] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[ 5.334372] #PF error: [normal kernel read fault]
[ 5.335619] PGD 0 P4D 0
[ 5.336360] Oops: 0000 [#1] SMP
[ 5.337249] CPU: 0 PID: 7 Comm: kworker/u2:0 Not tainted 5.1.0-rc4-custom-01473-g526bb57a6ad6 #1374
[ 5.339232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
[ 5.341982] Workqueue: netns cleanup_net
[ 5.342843] RIP: 0010:__fl_put+0x24/0xb0
[ 5.343808] Code: 84 00 00 00 00 00 3e ff 8f f0 03 00 00 0f 88 da 7b 14 00 74 01 c3 80 bf f4 03 00 00 00 0f 84 83 00 00 00 4c 8b 87 c8 01 00 00 <41> 8b 50 04 49 8d 70 04 85 d2 74 60 8d 4a 01 39 ca 7f 52 81 fa fe
[ 5.348099] RSP: 0018:ffffabe300663be0 EFLAGS: 00010202
[ 5.349223] RAX: ffff9ea4ba1aff00 RBX: ffff9ea4b99af400 RCX: ffffabe300663c67
[ 5.350572] RDX: 00000000000004c5 RSI: 0000000000000000 RDI: ffff9ea4b99af400
[ 5.351919] RBP: ffff9ea4ba28e900 R08: 0000000000000000 R09: ffffffff9d1b0075
[ 5.353272] R10: ffffeb2884e66b80 R11: ffffffff9dc4dcd8 R12: ffff9ea4b99af408
[ 5.354635] R13: ffff9ea4b99ae400 R14: ffff9ea4b9a47800 R15: ffff9ea4b99ae000
[ 5.355988] FS: 0000000000000000(0000) GS:ffff9ea4bba00000(0000) knlGS:0000000000000000
[ 5.357436] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.358530] CR2: 0000000000000004 CR3: 00000001398fa004 CR4: 0000000000160ef0
[ 5.359876] Call Trace:
[ 5.360360] __fl_delete+0x223/0x3b0
[ 5.361008] fl_destroy+0xb4/0x130
[ 5.361641] tcf_proto_destroy+0x15/0x40
[ 5.362429] tcf_chain_flush+0x4e/0x60
[ 5.363125] __tcf_block_put+0xb4/0x150
[ 5.363805] clsact_destroy+0x30/0x40
[ 5.364507] qdisc_destroy+0x44/0x110
[ 5.365218] dev_shutdown+0x6e/0xa0
[ 5.365821] rollback_registered_many+0x25d/0x510
[ 5.366724] ? netdev_run_todo+0x221/0x280
[ 5.367485] unregister_netdevice_many+0x15/0xa0
[ 5.368355] default_device_exit_batch+0x13f/0x170
[ 5.369268] ? wait_woken+0x80/0x80
[ 5.369910] cleanup_net+0x19a/0x280
[ 5.370558] process_one_work+0x1f5/0x3f0
[ 5.371326] worker_thread+0x28/0x3c0
[ 5.372038] ? process_one_work+0x3f0/0x3f0
[ 5.372755] kthread+0x10d/0x130
[ 5.373358] ? __kthread_create_on_node+0x180/0x180
[ 5.374298] ret_from_fork+0x35/0x40
[ 5.374934] CR2: 0000000000000004
[ 5.375454] ---[ end trace c20e7f74127772e5 ]---
[ 5.376284] RIP: 0010:__fl_put+0x24/0xb0
[ 5.377003] Code: 84 00 00 00 00 00 3e ff 8f f0 03 00 00 0f 88 da 7b 14 00 74 01 c3 80 bf f4 03 00 00 00 0f 84 83 00 00 00 4c 8b 87 c8 01 00 00 <41> 8b 50 04 49 8d 70 04 85 d2 74 60 8d 4a 01 39 ca 7f 52 81 fa fe
[ 5.380269] RSP: 0018:ffffabe300663be0 EFLAGS: 00010202
[ 5.381237] RAX: ffff9ea4ba1aff00 RBX: ffff9ea4b99af400 RCX: ffffabe300663c67
[ 5.382551] RDX: 00000000000004c5 RSI: 0000000000000000 RDI: ffff9ea4b99af400
[ 5.383972] RBP: ffff9ea4ba28e900 R08: 0000000000000000 R09: ffffffff9d1b0075
[ 5.385314] R10: ffffeb2884e66b80 R11: ffffffff9dc4dcd8 R12: ffff9ea4b99af408
[ 5.386616] R13: ffff9ea4b99ae400 R14: ffff9ea4b9a47800 R15: ffff9ea4b99ae000
[ 5.387986] FS: 0000000000000000(0000) GS:ffff9ea4bba00000(0000) knlGS:0000000000000000
[ 5.389512] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.390546] CR2: 0000000000000004 CR3: 00000001398fa004 CR4: 0000000000160ef0