Date:	Mon, 26 Jan 2015 13:39:33 +0100
From:	Vlastimil Šetka <setka@...s.cz>
To:	vbridger@...nsource.altera.com, netdev@...r.kernel.org
Subject: Altera TSE (altera_tse) net_rx_action WARNING - polling bug in altera_tse_main.c?

Hello,

I am using the Altera TSE kernel driver (altera_tse module) on the Altera 
socfpga platform (Cyclone V SoC with ARM Cortex-A9), and I think I have 
found a bug in it. I have two TSE controllers instantiated in the FPGA; 
my FPGA HW design is based on this tutorial: 
http://www.rocketboards.org/foswiki/Projects/AlteraSoCTripleSpeedEthernetDesignExample

The kernel version is 3.10.37-ltsi with the RT patch, from 
http://rocketboards.org/gitweb/?p=linux-socfpga.git;a=commit;h=7ea94617cfae6a62ee963adc1ae340196dbe2b34
with some altera_tse fixes backported from the current 3.19-rc5.

I was not able to get the TSE Ethernet interfaces working on vanilla 
3.19-rc5, probably because of some changes around interrupts and the 
device tree, but that is another story.

After some time (minutes to hours) of heavy traffic generated by iperf 
through an altera_tse Ethernet interface, I see a kernel warning on the 
console like this:

------------[ cut here ]------------
WARNING: at net/core/dev.c:4255 net_rx_action+0x268/0x28c()
Modules linked in: gpio_altera altera_sysid altera_tse
CPU: 0 PID: 5885 Comm: irq/75-eth2 Not tainted 3.10.37-ltsi-rt37-vs-2-1-00062-g861955e #1
[<800166c4>] (unwind_backtrace+0x0/0x100) from [<80012edc>] (show_stack+0x20/0x24)
[<80012edc>] (show_stack+0x20/0x24) from [<80503404>] (dump_stack+0x24/0x28)
[<80503404>] (dump_stack+0x24/0x28) from [<8002303c>] (warn_slowpath_common+0x64/0x7c)
[<8002303c>] (warn_slowpath_common+0x64/0x7c) from [<80023110>] (warn_slowpath_null+0x2c/0x34)
[<80023110>] (warn_slowpath_null+0x2c/0x34) from [<80404d48>] (net_rx_action+0x268/0x28c)
[<80404d48>] (net_rx_action+0x268/0x28c) from [<8002bd18>] (do_current_softirqs+0x1e4/0x388)
[<8002bd18>] (do_current_softirqs+0x1e4/0x388) from [<8002bf34>] (local_bh_enable+0x78/0x90)
[<8002bf34>] (local_bh_enable+0x78/0x90) from [<80086c9c>] (irq_forced_thread_fn+0x50/0x74)
[<80086c9c>] (irq_forced_thread_fn+0x50/0x74) from [<80086fbc>] (irq_thread+0x16c/0x1c8)
[<80086fbc>] (irq_thread+0x16c/0x1c8) from [<80048104>] (kthread+0xb4/0xb8)
[<80048104>] (kthread+0xb4/0xb8) from [<8000e718>] (ret_from_fork+0x14/0x20)
---[ end trace 0000000000000002 ]---

The warning is raised by this check:

     WARN_ON_ONCE(work > weight);

at 
http://rocketboards.org/gitweb/?p=linux-socfpga.git;a=blob;f=net/core/dev.c;h=2193b5dc276ad6aa54adb1ee15ef3de625915fcd;hb=7ea94617cfae6a62ee963adc1ae340196dbe2b34#l4255

After the warning, the interface keeps working without problems.
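
To illustrate what that check guards against, here is a small user-space 
sketch, not kernel code. The names `poll_fn`, `poll_unbounded`, 
`poll_bounded` and `would_warn` are hypothetical and exist only for this 
example; the only thing taken from the kernel is the comparison 
`work > weight` itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical poll callback: returns the number of frames it processed. */
typedef int (*poll_fn)(int pending, int budget);

/* Ignores the budget and processes everything that is pending. */
static int poll_unbounded(int pending, int budget)
{
    (void)budget;
    return pending;
}

/* Honours the budget: never reports more than 'budget' frames. */
static int poll_bounded(int pending, int budget)
{
    return pending < budget ? pending : budget;
}

/* Same comparison as WARN_ON_ONCE(work > weight): true if it would fire. */
static bool would_warn(poll_fn poll, int pending, int weight)
{
    int work;

    assert(weight > 0);
    work = poll(pending, weight);
    return work > weight;
}
```

With a weight of 64 (chosen here only for illustration), 
`would_warn(poll_unbounded, 100, 64)` is true, while 
`would_warn(poll_bounded, 100, 64)` is false.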

I am not very familiar with the Linux network stack and device drivers, 
but I think I have found the root cause in:

    # drivers/net/ethernet/altera/altera_tse_main.c
    # http://rocketboards.org/gitweb/?p=linux-socfpga.git;a=blob;f=drivers/net/ethernet/altera/altera_tse_main.c;h=07c0b193c55722d18ff2723f0a7e137671746ba1;hb=7ea94617cfae6a62ee963adc1ae340196dbe2b34#l368

    static int tse_rx(struct altera_tse_private *priv, int limit)

The `limit` parameter is not used anywhere in the function! When 
`tse_rx` is called from `tse_poll`, it can process and return more 
frames than `limit`, which I think is what ultimately triggers the 
kernel warning:

   # drivers/net/ethernet/altera/altera_tse_main.c
   # http://rocketboards.org/gitweb/?p=linux-socfpga.git;a=blob;f=drivers/net/ethernet/altera/altera_tse_main.c;h=07c0b193c55722d18ff2723f0a7e137671746ba1;hb=7ea94617cfae6a62ee963adc1ae340196dbe2b34#l488

   static int tse_poll(struct napi_struct *napi, int budget)
   {
            ...
            txcomplete = tse_tx_complete(priv);
            rxcomplete = tse_rx(priv, budget);
            if (rxcomplete >= budget || txcomplete > 0)
                    return rxcomplete;

The condition `if (rxcomplete >= budget || txcomplete > 0) return 
rxcomplete;` also looks very odd to me. I am not sure whether it is 
buggy, but I think it should at least have a comment explaining how it 
works.
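
For discussion, here is a minimal user-space sketch of a receive loop 
that honours its limit. It is not a patch and not the driver's code: 
`struct fake_rx_queue`, `fake_rx` and `fake_poll` are hypothetical 
stand-ins, and a real fix would have to do the equivalent inside 
`tse_rx` itself.

```c
#include <assert.h>

/* Hypothetical stand-in for the hardware RX queue; not a driver structure. */
struct fake_rx_queue {
    int pending;    /* frames waiting to be received */
};

/* Receive at most 'limit' frames and return how many were processed. */
static int fake_rx(struct fake_rx_queue *q, int limit)
{
    int count = 0;

    assert(limit >= 0);
    /* Stop as soon as the limit is reached, even if frames remain. */
    while (count < limit && q->pending > 0) {
        q->pending--;   /* a real driver would hand the frame to the stack here */
        count++;
    }
    return count;
}

/* Poll routine in the spirit of tse_poll(): never reports more than budget. */
static int fake_poll(struct fake_rx_queue *q, int budget)
{
    int done = fake_rx(q, budget);

    /* done < budget means the queue was drained within this call. */
    return done;
}
```

With 100 frames pending and a budget of 64, the first `fake_poll` call 
returns 64 and leaves 36 frames for the next call, so the returned work 
never exceeds the budget.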

Vlastimil Setka