linux-kernel - spinlock BUG in qdisc

Message-ID: <89cb5ede0803230023h49563614h57cad4a25d7753a4@mail.gmail.com>
Date:	Sun, 23 Mar 2008 10:23:20 +0300
From:	"Khaled Al-Hamwi" <khaled.linux@...il.com>
To:	linux-kernel@...r.kernel.org
Subject: spinlock BUG in qdisc_restart

Hi list,

I am doing experimental work on the bcm5700 network driver. My machine
has two 3Com NICs and simply forwards incoming packets from one NIC to
the other.
I want to process all incoming packets at interrupt level rather than
in softirqs, so I changed the call sequence in the bcm5700 driver
accordingly.

I got the following bug and the system hung (I got this from
/var/log/messages):

BUG: spinlock trylock failure on UP on CPU#0, swapper/0
lock: d11f014c, .magic: dead4ead, .owner: swapper/0, .owner_cpu: 0
[<c01bec44>] _raw_spin_trylock+0x37/0x3b
[<c02d6343>] _spin_trylock+0x5/0xe
[<c0291117>] qdisc_restart+0x3c/0x1b5
[<c0284038>] dev_queue_xmit+0xf2/0x207
[<c029fb25>] ip_output+0x1cc/0x224
[<c029e1a7>] ip_forward+0x383/0x3e2
[<c029cf2f>] ip_rcv+0x38e/0x3ea
[<c02845af>] netif_receive_skb+0x1fc/0x23e
[<e033b560>] MM_IndicateRxPackets+0x2be/0x379 [bcm5700]
[<e0342c26>] LM_ServiceInterrupts+0xac/0xc5 [bcm5700]
[<e0337768>] bcm5700_interrupt+0x13c/0x2bd [bcm5700]
[<c0135ce2>] handle_IRQ_event+0x23/0x4c
[<c0135d85>] __do_IRQ+0x7a/0xcd
[<c0103eea>] do_IRQ+0x5c/0x77
=======================
[<c0102d6a>] common_interrupt+0x1a/0x20
[<c02d007b>] unix_sock_destructor+0x4a/0xb3
[<c02d6402>] _spin_unlock_irqrestore+0xa/0xc

I tried different things to solve this issue like using tasklets for
transmitting packets instead of transmitting them directly. Every time
I get a different bug or kernel panic :(
I also tried one suggestion from the mailing list: using spin_trylock_irqsave
and spin_unlock_irqrestore. But that did not solve the problem either.

The problem is that function qdisc_restart is being preempted and
called again on the same CPU. I have only one CPU core and I am using
kernel 2.6.15.
It seems to me that the interrupts initiated by the NIC driver are the
reason for this. But from the driver code, I can see that interrupts
are disabled before MM_IndicateRxPackets is called. This function is
responsible for processing the packets and calling netif_receive_skb
for further processing (see the trace above). I also got this error:
Dead loop on netdevice eth0, fix it urgently!
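
The suspected re-entry can be modeled in user space. The following is a
minimal sketch, not kernel code: sim_lock, sim_trylock, sim_unlock, sim_xmit
and run_simulation are hypothetical names made up for illustration. It
assumes, as the trace suggests, that the transmit path takes its lock with
a trylock, and that a receive interrupt on the same CPU forwards a packet
while the lock is still held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a spinlock that records its owner. */
struct sim_lock {
    bool held;
    int owner_cpu;              /* -1 while unlocked */
};

static bool sim_trylock(struct sim_lock *l, int cpu)
{
    if (l->held)
        return false;           /* with one CPU, the holder cannot run to release it */
    l->held = true;
    l->owner_cpu = cpu;
    return true;
}

static void sim_unlock(struct sim_lock *l)
{
    l->held = false;
    l->owner_cpu = -1;
}

static int reentry_failures;

/*
 * Simulated transmit path. depth 0 is the original caller; depth 1 models
 * a receive interrupt that arrives mid-transmit and forwards a packet,
 * which enters the transmit path again on the same CPU.
 */
static void sim_xmit(struct sim_lock *l, int cpu, int depth)
{
    if (!sim_trylock(l, cpu)) {
        if (l->owner_cpu == cpu) {
            reentry_failures++;
            printf("re-entered on CPU#%d while the same CPU holds the lock\n", cpu);
        }
        return;
    }
    if (depth == 0)
        sim_xmit(l, cpu, depth + 1);    /* simulated nested interrupt */
    sim_unlock(l);
}

/* Runs one nested transmit and returns how many trylock attempts failed. */
static int run_simulation(void)
{
    struct sim_lock lock = { false, -1 };

    reentry_failures = 0;
    sim_xmit(&lock, 0, 0);
    return reentry_failures;
}
```

Calling run_simulation() from a small main() reports one failed attempt:
the nested call finds the lock already owned by the CPU it is running on,
which is the situation the messages above describe.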

Is there anything that can preempt qdisc_restart other than hardware
interrupts from the NIC? Is it possible that having two NICs in this
setup is causing the problem? Any ideas or suggestions are
appreciated.

Please CC me when replying to this message.

Thanks,
Khaled