Message-ID: <Pine.LNX.4.64.0811202035590.19962@ask.diku.dk>
Date: Thu, 20 Nov 2008 20:48:42 +0100 (CET)
From: Jesper Dangaard Brouer <hawk@...u.dk>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: David Miller <davem@...emloft.net>,
Jesper Dangaard Brouer <jdb@...x.dk>,
netdev <netdev@...r.kernel.org>, linux-kernel@...r.kernel.org,
Robert Olsson <Robert.Olsson@...a.slu.se>
Subject: Regression: Bisected, IRQ and MSI allocations screwed without sparse
irq
Hi Thomas Gleixner,
I have bisected a regression to your commit
3235e936c0cc3589309280b6f59e5096779adae3,
"x86: remove sparse irq from Kconfig".
It's actually not necessarily your fault, as your commit simply removes
the config option HAVE_SPARSE_IRQ. This reveals the bug / regression
I'm exposed to.
Guess I should bisect again to find the exact faulty commit, but I'm
rather sick of bisecting at the moment, and thought you might have a
better idea of what's going wrong. I would rather spend my time
performance tuning the multiqueue routing code...
[The regression]:
During my testing of the Sun Neptune based NICs, I get really good
performance on kernel 2.6.27 (900-1200kpps) compared to 2.6.28 (davem
git net-2.6).
The cause of this problem (tracked down together with Robert Olsson)
is that on 2.6.28 I have far fewer IRQs available. It seems max 34
IRQs. Due to the reduced number of IRQs, the NIU driver cannot get
enough IRQs for the interfaces, and starts to use "IO-APIC" based
IRQs.
On kernel 2.6.28: My eth2 is using 10 IRQs, all "PCI-MSI-edge". BUT
my eth3 is using a single IRQ of type "IO-APIC-fasteoi", shared with
the usb driver. That is my performance problem on 2.6.28.
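For anyone who wants to check the same thing on their own box, here is a
rough sketch that groups the IRQs of one interface by interrupt type as
listed in /proc/interrupts. It assumes the 2.6.2x column layout (IRQ
number, one counter per CPU, chip type, then the action names). The
function name summarize_irqs and the SAMPLE text are made up for
illustration; the sample is not output from my machine.

```python
from collections import defaultdict

# Hypothetical sample in the 2.6.2x /proc/interrupts layout (illustration
# only, not real output from the system described in this mail).
SAMPLE = """\
           CPU0       CPU1
  0:        123          0   IO-APIC-edge      timer
 16:        500          0   IO-APIC-fasteoi   uhci_hcd:usb1, eth3
 50:       1000          0   PCI-MSI-edge      eth2-rx-0
 51:        900          0   PCI-MSI-edge      eth2-tx-0
NMI:          0          0   Non-maskable interrupts
"""


def summarize_irqs(text, prefix):
    """Return {chip_type: [irq, ...]} for actions whose name starts with prefix."""
    lines = text.splitlines()
    ncpus = len(lines[0].split())  # header row holds one CPUn label per CPU
    result = defaultdict(list)
    for line in lines[1:]:
        fields = line.split()
        # Only numbered IRQ rows are of interest; skip NMI, LOC, ERR, ...
        if not fields or not fields[0].rstrip(":").isdigit():
            continue
        irq = int(fields[0].rstrip(":"))
        rest = fields[1 + ncpus:]  # chip type followed by the action names
        if len(rest) < 2:
            continue
        chip = rest[0]
        # A shared IRQ lists several comma-separated actions on one row.
        actions = " ".join(rest[1:]).split(",")
        if any(a.strip().startswith(prefix) for a in actions):
            result[chip].append(irq)
    return dict(result)


if __name__ == "__main__":
    for dev in ("eth2", "eth3"):
        print(dev, summarize_irqs(SAMPLE, dev))
```

To run it against a live system, pass the contents of /proc/interrupts
instead of SAMPLE. An interface that only shows up under
"IO-APIC-fasteoi" on a row shared with another driver is in the same
situation as my eth3.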
[Other related bugs]:
Unloading the "niu" driver gives a kernel BUG during deallocation of
MSI interrupts. (See dmesg output below if interested)
(I have attached full bisect history)
Cheers,
Jesper Brouer
--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------
On Wed, 19 Nov 2008, David Miller wrote:
> From: Jesper Dangaard Brouer <hawk@...u.dk>
> Date: Wed, 19 Nov 2008 23:58:12 +0100 (CET)
>
>> Well that was not the real cause of the performance loss. Because
>> on kernel 2.6.27 I get really good performance (900-1200kpps)
>> compared to 2.6.28 (git net-2.6).
>>
>> The cause of this problem (tracked down together with Robert Olsson)
>> is that on 2.6.28 I have a lot less IRQs available. It seems max 34
>> IRQs.
>>
>> Due the reduced number of IRQs the NIU driver cannot get enough IRQs
>> to the interfaces, and starts to use "IO-APIC" based IRQs.
>
> This is almost certainly related to the driver unload bug.
>
> I know you ran into unbuildable/unbootable kernels during a bisect,
> but you really need to track down this regression.
------------[ cut here ]------------
kernel BUG at drivers/pci/msi.c:632!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: ehci_hcd bnx2 uhci_hcd zlib_inflate serio_raw hpilo
niu(-)
Pid: 3036, comm: rmmod Not tainted (2.6.27-bisect #5) ProLiant DL380 G5
EIP: 0060:[<c021ecac>] EFLAGS: 00010286 CPU: 2
EIP is at msi_free_irqs+0xdc/0xe0
EAX: f6b8f860 EBX: 00000030 ECX: f7156ba8 EDX: c0420500
ESI: f7156800 EDI: f7156ba8 EBP: f6431eb4 ESP: f6431ea8
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rmmod (pid: 3036, ti=f6430000 task=f70f9b20 task.ti=f6430000)
Stack:
f7156800 f670c400 f7156800 f6431ebc c021ecb8 f6431ec8 c021ef41 f670c000
f6431edc f809d3f8 f7156800 f80a1ed4 f80a1ed4 f6431ee8 c0219c29 f7156858
f6431ef8 c026b0d4 f7156858 f7156914 f6431f0c c026b197 f80a1ea0 f80a1ed4
Call Trace:
[<c021ecb8>] ? msix_free_all_irqs+0x8/0x10
[<c021ef41>] ? pci_disable_msix+0x31/0x40
[<f809d3f8>] ? niu_pci_remove_one+0x88/0x8a [niu]
[<c0219c29>] ? pci_device_remove+0x19/0x40
[<c026b0d4>] ? __device_release_driver+0x54/0x80
[<c026b197>] ? driver_detach+0x97/0xa0
[<c026a475>] ? bus_remove_driver+0x75/0xa0
[<c026b609>] ? driver_unregister+0x39/0x40
[<c0219e51>] ? pci_unregister_driver+0x21/0x80
[<f809a0ad>] ? niu_exit+0xd/0x10 [niu]
[<c0145d74>] ? sys_delete_module+0x114/0x1d0
[<c016810a>] ? remove_vma+0x3a/0x50
[<c0168c29>] ? do_munmap+0x189/0x1e0
[<c0103229>] ? sysenter_do_call+0x12/0x21
[<c0330000>] ? quirk_disable_msi+0x30/0x50
Code: b7 43 08 8b 53 1c c1 e0 04 01 d0 ba 01 00 00 00 83 c0 0c 89 10 3b 7b
14 75 aa 8b 43 1c e8 3d 92 ef ff eb a0 5b 31 c0 5e 5f 5d c3 <0f> 0b eb fe
55 89 e5 e8 18 ff ff ff 5d c3 8d b6 00 00 00 00 55
EIP: [<c021ecac>] msi_free_irqs+0xdc/0xe0 SS:ESP 0068:f6431ea8
---[ end trace f72de2e283920207 ]---
View attachment "bisect_IO-APIC.txt" of type "TEXT/plain" (32509 bytes)