linux-kernel - Re: [PATCH 0/9] staging: octeon: multi rx group (queue) support

Message-ID: <17cd0837-03aa-fa05-bcdb-f2fd64c121ad@skyportsystems.com>
Date:   Wed, 31 Aug 2016 18:52:24 -0700
From:   Ed Swierk <eswierk@...portsystems.com>
To:     Aaro Koskinen <aaro.koskinen@....fi>,
        David Daney <ddaney@...iumnetworks.com>
Cc:     driverdev-devel <devel@...verdev.osuosl.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 0/9] staging: octeon: multi rx group (queue) support

On 8/31/16 14:20, Aaro Koskinen wrote:
> On Wed, Aug 31, 2016 at 09:20:07AM -0700, Ed Swierk wrote:
>> Here's my workaround:
> 
> [...]
> 
>> -static int cvm_oct_poll(struct oct_rx_group *rx_group, int budget)
>> +static int cvm_oct_poll(int group, int budget)
>>  {
>>  	const int	coreid = cvmx_get_core_num();
>>  	u64	old_group_mask;
>> @@ -181,13 +181,13 @@ static int cvm_oct_poll(struct oct_rx_group *rx_group, int budget)
>>  	if (OCTEON_IS_MODEL(OCTEON_CN68XX)) {
>>  		old_group_mask = cvmx_read_csr(CVMX_SSO_PPX_GRP_MSK(coreid));
>>  		cvmx_write_csr(CVMX_SSO_PPX_GRP_MSK(coreid),
>> -			       BIT(rx_group->group));
>> +			       BIT(group));
>> @@ -447,7 +447,7 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, int budget)
>>  						     napi);
>>  	int rx_count;
>>
>> -	rx_count = cvm_oct_poll(rx_group, budget);
>> +	rx_count = cvm_oct_poll(rx_group->group, budget);
> 
> I'm confused - there should be no difference?!

I can't figure out the difference either. Without the workaround I get a
crash within the first couple of packets, while with it I can't get it to
crash at all. It always bombs in netif_receive_skb(), which isn't very
close to any rx_group pointer dereference.

# ping 172.16.100.253
PING 172.16.100.253 (172.16.100.253): 56 data bytes
Data bus error, epc == ffffffff803fd4ac, ra == ffffffff801943d8
Oops[#1]:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.19+ #94
task: ffffffff80863e80 ti: ffffffff80840000 task.ti: ffffffff80840000
$ 0   : 0000000000000000 ffffffff80126078 ef7bdef7bdef7bdf ffffffff815d3860
$ 4   : ffffffff80e045c8 ffffffff81aae950 ffffffff81aae950 0000000000000000
$ 8   : ffffffff81aae950 0000000000000038 0000000000000070 0000000003bf0000
$12   : 0000000054000000 0000000003bd0000 0000000000000000 0000000000000000
$16   : ffffffff81aae950 ffffffff81aae950 ffffffff80e045c8 0000000000000000
$20   : 00000000000000fa 0000000000000001 000000005c02e0fa 0000000000000000
$24   : 0000000000000062 ffffffff80548468                                  
$28   : ffffffff80840000 ffffffff808436d0 ffffffff80feba38 ffffffff801943d8
Hi    : 0000000000000000
Lo    : 05198e3760c00000
epc   : ffffffff803fd4ac __list_add_rcu+0x7c/0xa0
ra    : ffffffff801943d8 __lock_acquire+0xd94/0x1bf0
Status: 10008ce2	KX SX UX KERNEL EXL 
Cause : 40808c1c (ExcCode 07)
PrId  : 000d910a (Cavium Octeon II)
Modules linked in:
Process swapper/0 (pid: 0, threadinfo=ffffffff80840000, task=ffffffff80863e80, tls=0000000000000000)
Stack : ffffffff80863e80 ffffffff808646c8 ffffffff81aae950 ffffffff801943d8
	  00000000000000fa ffffffff808646c0 0000000000000000 0000000000000002
	  0000000000000000 ffffffff8057ab90 ffffffff80864690 ffffffff80870990
	  0000000000000001 0000000000000000 0000000000000000 0000000000000017
	  0000000000000000 ffffffff80193e08 0000000000000017 ffffffff80864688
	  0000000000000001 ffffffff8057ab90 ffffffff808a7d28 800000007f4b7500
	  800000007a0b52e8 0000000000000001 ffffffff807f0000 800000007f768068
	  ffffffff8085fac8 ffffffff8019568c 0000000000000000 0000000000000000
	  ffffffff808a7d10 ffffffff80645e60 800000007f4a8600 0000000000000254
	  ffffffff808a7d58 ffffffff8057ab90 0000000000000008 800000007f7680a0
	  ...
Call Trace:
[<__list_add_rcu at list_debug.c:97 (discriminator 2)>] __list_add_rcu+0x7c/0xa0
[<inc_chains at lockdep.c:1683
 (inlined by) lookup_chain_cache at lockdep.c:2096
 (inlined by) validate_chain at lockdep.c:2115
 (inlined by) __lock_acquire at lockdep.c:3206>] __lock_acquire+0xd94/0x1bf0
[<lock_acquire at lockdep.c:3587>] lock_acquire+0x50/0x78
[<__raw_read_lock at rwlock_api_smp.h:150
 (inlined by) _raw_read_lock at spinlock.c:223>] _raw_read_lock+0x4c/0x90
[<hlist_empty at list.h:611
 (inlined by) raw_v4_input at raw.c:177
 (inlined by) raw_local_deliver at raw.c:216>] raw_local_deliver+0x58/0x1e8
[<ip_local_deliver_finish at ip_input.c:205>] ip_local_deliver_finish+0x118/0x4a8
[<NF_HOOK_THRESH at netfilter.h:226
 (inlined by) NF_HOOK at netfilter.h:249
 (inlined by) ip_local_deliver at ip_input.c:257>] ip_local_deliver+0x68/0xe0
[<NF_HOOK_THRESH at ip_input.c:467
 (inlined by) NF_HOOK at netfilter.h:249
 (inlined by) ip_rcv at ip_input.c:455>] ip_rcv+0x398/0x478
[<__netif_receive_skb_core at dev.c:3948>] __netif_receive_skb_core+0x764/0x818
[<rcu_read_unlock at rcupdate.h:913
 (inlined by) netif_receive_skb_internal at dev.c:4012>] netif_receive_skb_internal+0x148/0x214
[<cvm_oct_poll at ethernet-rx.c:379
 (inlined by) cvm_oct_napi_poll at ethernet-rx.c:452>] cvm_oct_napi_poll+0x790/0xa2c
[<napi_poll at dev.c:4804
 (inlined by) net_rx_action at dev.c:4869>] net_rx_action+0x130/0x2e0
[<preempt_count at preempt.h:10
 (inlined by) __do_softirq at softirq.c:275>] __do_softirq+0x1f0/0x318
[<do_softirq_own_stack at interrupt.h:449
 (inlined by) invoke_softirq at softirq.c:357
 (inlined by) irq_exit at softirq.c:391>] irq_exit+0x64/0xcc
[<octeon_irq_ciu2 at octeon-irq.c:1951>] octeon_irq_ciu2+0x154/0x1c4
[<plat_irq_dispatch at octeon-irq.c:2319>] plat_irq_dispatch+0x70/0x108
[<?? at entry.S:35>] ret_from_irq+0x0/0x4
[<?? at genex.S:132>] __r4k_wait+0x20/0x40
[<arch_local_save_flags at irqflags.h:149
 (inlined by) cpuidle_idle_call at idle.c:196
 (inlined by) cpu_idle_loop at idle.c:251
 (inlined by) cpu_startup_entry at idle.c:299>] cpu_startup_entry+0x154/0x1d0
[<start_kernel at main.c:684>] start_kernel+0x538/0x554

Presumably there's some sort of race condition that my change doesn't
really fix but happens to avoid by dereferencing rx_group just once early
on?
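
To make that guess concrete, here is a user-space sketch. It is not driver
code and does not reproduce the crash. The struct, the function names and the
clobber() callback are all made up for illustration. It only shows why a
value read once up front cannot be affected by a later write to the
structure, whereas a second read through the pointer can.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-group state. */
struct rx_group_sketch {
	int group;
};

typedef void (*work_fn)(struct rx_group_sketch *);

/* Simulates something else overwriting the structure during the poll. */
static void clobber(struct rx_group_sketch *rxg)
{
	rxg->group = 5;
}

/* Pointer variant: reads rxg->group both before and after the work. */
static uint64_t poll_by_pointer(struct rx_group_sketch *rxg, work_fn work)
{
	assert(rxg != NULL);
	uint64_t entry_mask = UINT64_C(1) << rxg->group;	/* first read */
	work(rxg);
	uint64_t exit_mask = UINT64_C(1) << rxg->group;	/* second read */
	return entry_mask ^ exit_mask;	/* nonzero if the reads disagree */
}

/* Value variant: the caller read group once; later writes are invisible. */
static uint64_t poll_by_value(int group, struct rx_group_sketch *rxg,
			      work_fn work)
{
	uint64_t entry_mask = UINT64_C(1) << group;
	work(rxg);
	uint64_t exit_mask = UINT64_C(1) << group;
	return entry_mask ^ exit_mask;	/* always zero */
}
```

If something along these lines is going on, the workaround would only hide
whatever is writing to the structure (or to the pointer), not remove it.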

--Ed