lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <vbfh8d5jroi.fsf@mellanox.com>
Date:   Fri, 15 Feb 2019 10:02:11 +0000
From:   Vlad Buslov <vladbu@...lanox.com>
To:     Ido Schimmel <idosch@...sch.org>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "jhs@...atatu.com" <jhs@...atatu.com>,
        "xiyou.wangcong@...il.com" <xiyou.wangcong@...il.com>,
        "jiri@...nulli.us" <jiri@...nulli.us>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "ast@...nel.org" <ast@...nel.org>,
        "daniel@...earbox.net" <daniel@...earbox.net>
Subject: Re: [PATCH net-next v4 07/17] net: sched: protect filter_chain list
 with filter_chain_lock mutex


On Thu 14 Feb 2019 at 18:24, Ido Schimmel <idosch@...sch.org> wrote:
> On Mon, Feb 11, 2019 at 10:55:38AM +0200, Vlad Buslov wrote:
>> Extend tcf_chain with new filter_chain_lock mutex. Always lock the chain
>> when accessing filter_chain list, instead of relying on rtnl lock.
>> Dereference filter_chain with tcf_chain_dereference() lockdep macro to
>> verify that all users of chain_list have the lock taken.
>>
>> Rearrange tp insert/remove code in tc_new_tfilter/tc_del_tfilter to execute
>> all necessary code while holding chain lock in order to prevent
>> invalidation of chain_info structure by potential concurrent change. This
>> also serializes calls to tcf_chain0_head_change(), which allows head change
>> callbacks to rely on filter_chain_lock for synchronization instead of rtnl
>> mutex.
>>
>> Signed-off-by: Vlad Buslov <vladbu@...lanox.com>
>> Acked-by: Jiri Pirko <jiri@...lanox.com>
>
> With this sequence [1] I get the following trace [2]. Bisected it to
> this patch. Note that second filter is rejected by the device driver
> (that's the intention). I guess it is not properly removed from the
> filter chain in the error path?
>
> Thanks
>
> [1]
> # tc qdisc add dev swp3 clsact
> # tc filter add dev swp3 ingress pref 1 matchall skip_sw \
> 	action mirred egress mirror dev swp7
> # tc filter add dev swp3 ingress pref 2 matchall skip_sw \
> 	action mirred egress mirror dev swp7
> RTNETLINK answers: File exists
> We have an error talking to the kernel, -1
> # tc qdisc del dev swp3 clsact
>
> [2]
> [   70.545131] kasan: GPF could be caused by NULL-ptr deref or user memory access
> [   70.553394] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
> [   70.560618] CPU: 2 PID: 2268 Comm: tc Not tainted 5.0.0-rc5-custom-02103-g415d39427317 #1304
> [   70.570065] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
> [   70.580204] RIP: 0010:mall_reoffload+0x14a/0x760
> [   70.585382] Code: c1 0f 85 ba 05 00 00 31 c0 4d 8d 6c 24 34 b9 06 00 00 00 4c 89 ff f3 48 ab 4c 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 14 02 4c 89 e8 83
> e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 bd
> [   70.606382] RSP: 0018:ffff888231faefc0 EFLAGS: 00010207
> [   70.612235] RAX: dffffc0000000000 RBX: 1ffff110463f5dfe RCX: 0000000000000000
> [   70.620220] RDX: 0000000000000006 RSI: 1ffff110463f5e01 RDI: ffff888231faf040
> [   70.628206] RBP: ffff8881ef151a00 R08: 0000000000000000 R09: ffffed10463f5dfa
> [   70.636192] R10: ffffed10463f5dfa R11: 0000000000000003 R12: 0000000000000000
> [   70.644177] R13: 0000000000000034 R14: 0000000000000000 R15: ffff888231faf010
> [   70.652163] FS:  00007f46b5bf0800(0000) GS:ffff888236c00000(0000) knlGS:0000000000000000
> [   70.661218] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   70.667649] CR2: 0000000001d590a8 CR3: 0000000231c3c000 CR4: 00000000001006e0
> [   70.675633] Call Trace:
> [   70.693046]  tcf_block_playback_offloads+0x94/0x230
> [   70.710617]  __tcf_block_cb_unregister+0xf7/0x2d0
> [   70.734143]  mlxsw_sp_setup_tc+0x20f/0x660
> [   70.738739]  tcf_block_offload_unbind+0x22b/0x350
> [   70.748898]  __tcf_block_put+0x203/0x630
> [   70.769700]  tcf_block_put_ext+0x2f/0x40
> [   70.774098]  clsact_destroy+0x7a/0xb0
> [   70.782604]  qdisc_destroy+0x11a/0x5c0
> [   70.786810]  qdisc_put+0x5a/0x70
> [   70.790435]  notify_and_destroy+0x8e/0xa0
> [   70.794933]  qdisc_graft+0xbb7/0xef0
> [   70.809009]  tc_get_qdisc+0x518/0xa30
> [   70.821530]  rtnetlink_rcv_msg+0x397/0xa20
> [   70.835510]  netlink_rcv_skb+0x132/0x380
> [   70.848419]  netlink_unicast+0x4c0/0x690
> [   70.866019]  netlink_sendmsg+0x929/0xe10
> [   70.882134]  sock_sendmsg+0xc8/0x110
> [   70.886144]  ___sys_sendmsg+0x77a/0x8f0
> [   70.924064]  __sys_sendmsg+0xf7/0x250
> [   70.955727]  do_syscall_64+0x14d/0x610

Hi Ido,

Thanks for reporting this.

I looked at the code and problem seems to be matchall classifier
specific. My implementation of unlocked cls API assumes that concurrent
insertions are possible and checks for it when deleting "empty" tp.
Since classifiers don't expose number of elements, the only way to test
this is to do tp->walk() on them and assume that walk callback is called
once per filter on every classifier. In your example new tp is created
for second filter, filter insertion fails, number of elements on newly
created tp is checked with tp->walk() before deleting it. However,
matchall classifier always calls the tp->walk() callback once, even when
it doesn't have a valid filter (in this case with NULL filter pointer).

Possible ways to fix this:

1) Check for NULL filter pointer in tp->walk() callback and ignore it
when counting filters on tp - will work but I don't like it because I
don't think it is ever correct to call tp->walk() callback with NULL
filter pointer.

2) Fix matchall to not call tp->walk() callback with NULL filter
pointers - my preferred simple solution.

3) Extend tp API to have direct way to count filters by implementing
tp->count - requires change to every classifier. Or maybe add it as
optional API that only unlocked classifiers are required to implement
and just delete rtnl lock dependent tp's without checking for concurrent
insertions.

What do you think?

Regards,
Vlad

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ