Message-ID: <aa784d0c-85eb-4e5d-968b-c8f74fa86be6@gin.de>
Date: Fri, 6 Oct 2023 16:03:19 +0200
From: Daniel Klauer <daniel.klauer@....de>
To: Ioana Ciornei <ioana.ciornei@....com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [bug] dpaa2-eth: "Wrong SWA type" and null deref in
 dpaa2_eth_free_tx_fd()

On 04.10.23 17:50, Ioana Ciornei wrote:
> On Wed, Aug 30, 2023 at 07:10:05PM +0200, Daniel Klauer wrote:
>> Hi,
>>
> 
> Hi Daniel,
> 
>> while doing Ethernet tests with raw packet sockets on our custom
>> LX2160A board with Linux v6.1.50 (plus some patches for board support,
>> but none for dpaa2-eth), I noticed the following crash:
>>
> 
> Did you happen to test with any other newer kernel?

Today I tested with v6.5.5, it also shows the same crash:

[  160.013619] fsl_dpaa2_eth dpni.0 eth6: entered promiscuous mode
[  163.100294] fsl_dpaa2_eth dpni.0 eth6: Link is Up - 1Gbps/Full - flow control off
[  163.544566] ------------[ cut here ]------------
[  163.545163] Wrong SWA type
[  163.545188] WARNING: CPU: 12 PID: 0 at drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:1148 dpaa2_eth_free_tx_fd+0x3bc/0x3c8 [fsl_dpaa2_eth]
[  163.547145] Modules linked in: marvell tag_dsa xhci_plat_hcd xhci_hcd aes_ce_blk usbcore aes_ce_cipher crct10dif_ce caam_jr ghash_ce gf128mul dwc3 fsl_dpaa2_eth caamhash_desc libaes caamalg_desc sha2_ce crypto_engine sha256_arm64 pcs_lynx mv88e6xxx libdes udc_core sha1_ce ahci_qoriq roles fsl_mc_dpio dp83867 libahci_platform ahci usb_common sha1_generic dpaa2_console sfp xgmac_mdio caam libahci at24 error libata mdio_i2c lm90 qoriq_thermal nvmem_layerscape_sfp
[  163.552482] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G           O       6.5.5-00118-g8e92933b7fa0 #1
[  163.553643] Hardware name: mpxlx2160a (DT)
[  163.554204] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  163.555082] pc : dpaa2_eth_free_tx_fd+0x3bc/0x3c8 [fsl_dpaa2_eth]
[  163.555858] lr : dpaa2_eth_free_tx_fd+0x3bc/0x3c8 [fsl_dpaa2_eth]
[  163.556635] sp : ffff800080d7bd70
[  163.557054] x29: ffff800080d7bd70 x28: 00000020874fbfc2 x27: ffff00200ba1b000
[  163.557958] x26: ffff00200ba60880 x25: 0000000000000001 x24: ffff00200320d800
[  163.558860] x23: ffff00200ba43d70 x22: 0000000000002328 x21: ffff00200ba40880
[  163.559762] x20: 0000000000000000 x19: ffff0020074fbfc2 x18: 0000000000000018
[  163.560665] x17: ffff80263bae3000 x16: ffff800080d78000 x15: fffffffffffee2d0
[  163.561567] x14: ffff800080bfb388 x13: ffff800080bfb3e0 x12: 0000000000000a38
[  163.562469] x11: 0000000000000368 x10: ffff800080c585a0 x9 : ffff800080bfb3e0
[  163.563372] x8 : 00000000ffffefff x7 : ffff800080c533e0 x6 : 00000000000051c0
[  163.564274] x5 : ffff0026bc5b98c8 x4 : 0000000000000000 x3 : 0000000000000027
[  163.565176] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff00200080e040
[  163.566078] Call trace:
[  163.566389]  dpaa2_eth_free_tx_fd+0x3bc/0x3c8 [fsl_dpaa2_eth]
[  163.567121]  dpaa2_eth_tx_conf+0x74/0xac [fsl_dpaa2_eth]
[  163.567798]  dpaa2_eth_poll+0xec/0x3e0 [fsl_dpaa2_eth]
[  163.568454]  __napi_poll+0x34/0x184
[  163.568902]  net_rx_action+0x11c/0x258
[  163.569377]  __do_softirq+0x11c/0x284
[  163.569843]  ____do_softirq+0xc/0x14
[  163.570296]  call_on_irq_stack+0x24/0x34
[  163.570793]  do_softirq_own_stack+0x18/0x20
[  163.571322]  irq_exit_rcu+0xd0/0xe8
[  163.571765]  el1_interrupt+0x34/0x60
[  163.572222]  el1h_64_irq_handler+0x14/0x1c
[  163.572742]  el1h_64_irq+0x64/0x68
[  163.573172]  cpuidle_enter_state+0x130/0x2fc
[  163.573712]  cpuidle_enter+0x34/0x48
[  163.574166]  do_idle+0x1c8/0x230
[  163.574578]  cpu_startup_entry+0x20/0x28
[  163.575076]  secondary_start_kernel+0x128/0x148
[  163.575650]  __secondary_switched+0x6c/0x70
[  163.576183] ---[ end trace 0000000000000000 ]---
[  163.576778] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
[  163.577884] Mem abort info:
[  163.578244]   ESR = 0x0000000096000004
[  163.578718]   EC = 0x25: DABT (current EL), IL = 32 bits
[  163.579388]   SET = 0, FnV = 0
[  163.579774]   EA = 0, S1PTW = 0
[  163.580171]   FSC = 0x04: level 0 translation fault
[  163.580787] Data abort info:
[  163.581151]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[  163.581842]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  163.582485]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  163.583156] user pgtable: 4k pages, 48-bit VAs, pgdp=00000020874aa000
[  163.583968] [0000000000000028] pgd=0000000000000000, p4d=0000000000000000
[  163.584825] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[  163.585616] Modules linked in: marvell tag_dsa xhci_plat_hcd xhci_hcd aes_ce_blk usbcore aes_ce_cipher crct10dif_ce caam_jr ghash_ce gf128mul dwc3 fsl_dpaa2_eth caamhash_desc libaes caamalg_desc sha2_ce crypto_engine sha256_arm64 pcs_lynx mv88e6xxx libdes udc_core sha1_ce ahci_qoriq roles fsl_mc_dpio dp83867 libahci_platform ahci usb_common sha1_generic dpaa2_console sfp xgmac_mdio caam libahci at24 error libata mdio_i2c lm90 qoriq_thermal nvmem_layerscape_sfp
[  163.590941] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G        W  O       6.5.5-00118-g8e92933b7fa0 #1
[  163.592102] Hardware name: mpxlx2160a (DT)
[  163.592662] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  163.593540] pc : dpaa2_eth_free_tx_fd+0xe8/0x3c8 [fsl_dpaa2_eth]
[  163.594305] lr : dpaa2_eth_free_tx_fd+0xc0/0x3c8 [fsl_dpaa2_eth]
[  163.595069] sp : ffff800080d7bd70
[  163.595487] x29: ffff800080d7bd70 x28: 00000020874fbfc2 x27: 0000000000000000
[  163.596390] x26: 0000000000000001 x25: 0000000000000001 x24: ffff00200320d800
[  163.597292] x23: ffff00200ba43d70 x22: 0000000000002328 x21: ffff00200ba40880
[  163.598194] x20: 0000000000000000 x19: ffff0020074fbfc2 x18: 0000000000000018
[  163.599096] x17: ffff80263bae3000 x16: ffff800080d78000 x15: fffffffffffee2d0
[  163.599998] x14: ffff800080bfb388 x13: ffff800080bfb3e0 x12: 0000000000000a38
[  163.600900] x11: 0000000000000368 x10: ffff800080c585a0 x9 : ffff800080bfb3e0
[  163.601803] x8 : 0001000000000000 x7 : ffff002002dae480 x6 : 00000020874fbfc2
[  163.602705] x5 : ffff002002dae480 x4 : 0000000000000000 x3 : 0000000000000000
[  163.603607] x2 : 00000000e7e00000 x1 : 0000000000000001 x0 : 0000000001010101
[  163.604509] Call trace:
[  163.604819]  dpaa2_eth_free_tx_fd+0xe8/0x3c8 [fsl_dpaa2_eth]
[  163.605539]  dpaa2_eth_tx_conf+0x74/0xac [fsl_dpaa2_eth]
[  163.606216]  dpaa2_eth_poll+0xec/0x3e0 [fsl_dpaa2_eth]
[  163.606871]  __napi_poll+0x34/0x184
[  163.607314]  net_rx_action+0x11c/0x258
[  163.607789]  __do_softirq+0x11c/0x284
[  163.608252]  ____do_softirq+0xc/0x14
[  163.608705]  call_on_irq_stack+0x24/0x34
[  163.609201]  do_softirq_own_stack+0x18/0x20
[  163.609730]  irq_exit_rcu+0xd0/0xe8
[  163.610172]  el1_interrupt+0x34/0x60
[  163.610627]  el1h_64_irq_handler+0x14/0x1c
[  163.611147]  el1h_64_irq+0x64/0x68
[  163.611578]  cpuidle_enter_state+0x130/0x2fc
[  163.612117]  cpuidle_enter+0x34/0x48
[  163.612570]  do_idle+0x1c8/0x230
[  163.612981]  cpu_startup_entry+0x20/0x28
[  163.613479]  secondary_start_kernel+0x128/0x148
[  163.614052]  __secondary_switched+0x6c/0x70
[  163.614585] Code: 7100081f 54000de0 7100101f 540000c0 (3940a360) 
[  163.615355] ---[ end trace 0000000000000000 ]---
[  163.615938] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[  163.616803] SMP: stopping secondary CPUs
[  163.617309] Kernel Offset: disabled
[  163.617750] CPU features: 0x40000000,12010000,0800420b
[  163.618399] Memory Limit: none
[  163.618787] ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---
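
The fault address can be cross-checked against the bracketed word in the "Code:" line, which is the instruction that faulted. The following is a minimal sketch, assuming the standard A64 encoding of the "load/store register (unsigned immediate)" class; the struct and function names are hypothetical helpers for illustration and are not part of the driver or of any kernel tool.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of an A64 "load/store register (unsigned immediate)"
 * instruction word. Hypothetical helper type, for illustration only. */
struct ldst_uimm {
    unsigned size;   /* log2 of the access size in bytes (0 = byte) */
    bool is_load;    /* simplified: any opc other than 00 counts as a load */
    unsigned rn;     /* base register number */
    unsigned rt;     /* transfer register number */
    uint32_t offset; /* byte offset, i.e. imm12 scaled by the access size */
};

/* Returns true and fills *out if insn belongs to the integer
 * load/store (unsigned immediate) class, false otherwise. */
bool decode_ldst_uimm(uint32_t insn, struct ldst_uimm *out)
{
    /* Class check: bits 29:27 = 111, bit 26 (V) = 0, bits 25:24 = 01. */
    if ((insn & 0x3f000000u) != 0x39000000u)
        return false;

    out->size = insn >> 30;
    out->is_load = ((insn >> 22) & 0x3u) != 0;
    out->offset = ((insn >> 10) & 0xfffu) << out->size;
    out->rn = (insn >> 5) & 0x1fu;
    out->rt = insn & 0x1fu;
    return true;
}
```

Applied to 0x3940a360, this gives a byte load into w0 from [x27 + 0x28]. With x27 = 0 in the second register dump, that matches the reported NULL pointer dereference at virtual address 0x28.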

> 
>> [   26.290737] Wrong SWA type
>> [   26.290760] WARNING: CPU: 7 PID: 0 at drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:1117 dpaa2_eth_free_tx_fd.isra.0+0x36c/0x380 [fsl_dpaa2_eth]
>>
>> followed by
>>
>> [   26.323016] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
>> [   26.324122] Mem abort info:
>> [   26.324475]   ESR = 0x0000000096000004
>> [   26.324948]   EC = 0x25: DABT (current EL), IL = 32 bits
>> [   26.325618]   SET = 0, FnV = 0
>> [   26.326004]   EA = 0, S1PTW = 0
>> [   26.326406]   FSC = 0x04: level 0 translation fault
>> [   26.327021] Data abort info:
>> [   26.327385]   ISV = 0, ISS = 0x00000004
>> [   26.327869]   CM = 0, WnR = 0
>> [   26.328244] user pgtable: 4k pages, 48-bit VAs, pgdp=00000020861cf000
>> [   26.329055] [0000000000000028] pgd=0000000000000000, p4d=0000000000000000
>> [   26.329912] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
>> [   26.330702] Modules linked in: tag_dsa marvell mv88e6xxx aes_ce_blk caam_jr aes_ce_cipher caamhash_desc crct10dif_ce ghash_ce fsl_dpaa2_eth caamalg_desc xhci_plat_hcd sha256_generic gf128mul libsha256 libaes xhci_hcd crypto_engine pcs_lynx sha2_ce sha1_ce usbcore libdes sha256_arm64 cfg80211 dp83867 sha1_generic fsl_mc_dpio xgmac_mdio dpaa2_console dwc3 ahci ahci_qoriq udc_core caam libahci_platform roles error libahci usb_common libata at24 lm90 qoriq_thermal nvmem_layerscape_sfp sfp mdio_i2c
>> [   26.336237] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G        W          6.1.50-00121-g10168a070f4d #11
>> [   26.337396] Hardware name: mpxlx2160a (DT)
>> [   26.337956] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>> [   26.338833] pc : dpaa2_eth_free_tx_fd.isra.0+0xd4/0x380 [fsl_dpaa2_eth]
>> [   26.339673] lr : dpaa2_eth_free_tx_fd.isra.0+0xb4/0x380 [fsl_dpaa2_eth]
>> [   26.340512] sp : ffff800008cf3d70
>> [   26.340931] x29: ffff800008cf3d70 x28: ffff002002900000 x27: 0000000000000000
>> [   26.341832] x26: 0000000000000001 x25: 0000000000000001 x24: 0000000000000000
>> [   26.342732] x23: 0000000000002328 x22: ffff002009742728 x21: 00000020884fffc2
>> [   26.343633] x20: ffff002009740840 x19: ffff0020084fffc2 x18: 0000000000000018
>> [   26.344534] x17: ffff8026b3a9a000 x16: ffff800008cf0000 x15: fffffffffffed3f8
>> [   26.345435] x14: 0000000000000000 x13: ffff800008bad028 x12: 0000000000000966
>> [   26.346335] x11: 0000000000000322 x10: ffff800008c09b58 x9 : ffff800008bad028
>> [   26.347236] x8 : 0001000000000000 x7 : ffff0020095e6480 x6 : 00000020884fffc2
>> [   26.348137] x5 : ffff0020095e6480 x4 : 0000000000000000 x3 : 0000000000000000
>> [   26.349037] x2 : 00000000e7e00000 x1 : 0000000000000001 x0 : 0000000049759e0c
>> [   26.349938] Call trace:
>> [   26.350247]  dpaa2_eth_free_tx_fd.isra.0+0xd4/0x380 [fsl_dpaa2_eth]
>> [   26.351044]  dpaa2_eth_tx_conf+0x84/0xc0 [fsl_dpaa2_eth]
>> [   26.351720]  dpaa2_eth_poll+0xec/0x3a4 [fsl_dpaa2_eth]
>> [   26.352375]  __napi_poll+0x34/0x180
>> [   26.352816]  net_rx_action+0x128/0x2b4
>> [   26.353290]  _stext+0x124/0x2a0
>> [   26.353687]  ____do_softirq+0xc/0x14
>> [   26.354139]  call_on_irq_stack+0x24/0x40
>> [   26.354635]  do_softirq_own_stack+0x18/0x2c
>> [   26.355164]  __irq_exit_rcu+0xc4/0xf0
>> [   26.355628]  irq_exit_rcu+0xc/0x14
>> [   26.356059]  el1_interrupt+0x34/0x60
>> [   26.356511]  el1h_64_irq_handler+0x14/0x20
>> [   26.357028]  el1h_64_irq+0x64/0x68
>> [   26.357458]  cpuidle_enter_state+0x12c/0x314
>> [   26.357997]  cpuidle_enter+0x34/0x4c
>> [   26.358450]  do_idle+0x208/0x270
>> [   26.358860]  cpu_startup_entry+0x24/0x30
>> [   26.359356]  secondary_start_kernel+0x128/0x14c
>> [   26.359928]  __secondary_switched+0x64/0x68
>> [   26.360460] Code: 7100081f 54000d00 71000c1f 540000c0 (3940a360)
>> [   26.361228] ---[ end trace 0000000000000000 ]---
>>
>> It happens when receiving big Ethernet frames on an AF_PACKET +
>> SOCK_RAW socket, for example MTU 9000. It does not happen with the
>> standard MTU 1500. It does not happen when just sending.
>>
> 
> Are the transmitted frames also big?

Yes, size 9000 for both sendto() and recvfrom().

> 
>> It's 100% reproducible here, however it seems to depend on the data
>> rate/load: Once it happened after receiving the first 80 frames,
>> another time after the first 300 frames, etc., and if I only send 5
>> frames per second, it does not happen at all.
>>
>> Please let me know if I should provide more info or do more tests. I
>> can provide a test program if needed.
>>
> 
> If you can provide a test program, that would be great. It would help in
> reproducing and debugging the issue on my side.

OK, I've attached a test program, send_and_recv.c, reduced as far as I could get it. If I run it:

ip link set up dev eth6
ip link set mtu 9000 dev eth6
./send_and_recv eth6

it triggers the crash, sometimes very quickly, sometimes only after a few seconds. In my case eth6 is DPAA2 MAC7, configured for MAC_LINK_TYPE_PHY + SGMII. However, the same issue happens with MAC17 with MAC_LINK_TYPE_FIXED + RGMII and with MAC3/MAC4 XFI ports. The link must be up for the test; I just connect it to an external switch with nothing else attached.

Other things I've tried: sending only (and receiving on another machine), receiving only (while sending from another machine), and sending and receiving on separate interfaces on the LX2160A machine, but these cases did not seem to trigger the crash. So it looks like the crash occurs only when both sending and receiving happen on the same interface.
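
For readers without the attachment, the triggering pattern (one AF_PACKET + SOCK_RAW socket bound to a single interface, used for both sendto() and recvfrom() with jumbo frames) can be sketched roughly as below. This is not the attached send_and_recv.c. The function names, the fill byte, and the choice of 0x88B5 (an IEEE local experimental ethertype) are assumptions made for illustration, and opening the socket needs CAP_NET_RAW.

```c
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define TEST_FRAME_LEN 9000   /* frame size mentioned in this thread */
#define TEST_ETHERTYPE 0x88B5 /* assumption: IEEE local experimental ethertype */

/* Fill buf with an Ethernet header followed by a constant payload byte.
 * Returns the frame length, or 0 if len cannot hold the 14-byte header. */
size_t build_frame(uint8_t *buf, size_t len, const uint8_t dst[6],
                   const uint8_t src[6], uint16_t ethertype, uint8_t fill)
{
    if (len < ETH_HLEN)
        return 0;
    memcpy(buf, dst, 6);
    memcpy(buf + 6, src, 6);
    buf[12] = (uint8_t)(ethertype >> 8); /* ethertype is big-endian on the wire */
    buf[13] = (uint8_t)(ethertype & 0xff);
    memset(buf + ETH_HLEN, fill, len - ETH_HLEN);
    return len;
}

/* Open a raw packet socket bound to ifname; *addr is filled in so it can be
 * reused as the sendto() destination. Returns the fd, or -1 on error. */
int open_raw_socket(const char *ifname, struct sockaddr_ll *addr)
{
    unsigned int ifindex = if_nametoindex(ifname);
    if (ifindex == 0)
        return -1;

    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;

    memset(addr, 0, sizeof(*addr));
    addr->sll_family = AF_PACKET;
    addr->sll_protocol = htons(ETH_P_ALL);
    addr->sll_ifindex = (int)ifindex;
    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send one frame, then poll once for a received frame on the same socket.
 * Returns bytes received, 0 if nothing was pending, or -1 if sending failed. */
int send_and_receive_once(int fd, const struct sockaddr_ll *addr,
                          const uint8_t *frame, size_t len,
                          uint8_t *rxbuf, size_t rxlen)
{
    if (sendto(fd, frame, len, 0, (const struct sockaddr *)addr,
               sizeof(*addr)) < 0)
        return -1;

    ssize_t n = recvfrom(fd, rxbuf, rxlen, MSG_DONTWAIT, NULL, NULL);
    return n < 0 ? 0 : (int)n;
}
```

A caller would build one TEST_FRAME_LEN frame, open the socket on the interface under test, and call send_and_receive_once() in a tight loop. The low-rate case described earlier in the thread (5 frames per second) did not reproduce the crash, so the loop should not sleep between iterations.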

Another detail I noticed: While the null pointer crash always happens, the "Wrong SWA type" warning does not always appear. It shows up only if certain byte values are written into the Ethernet frame payload (see comments in the test program). So perhaps that is a separate issue (and maybe my own fault for using a custom/invalid ethertype value).

> 
> Ioana
> 
> 
View attachment "send_and_recv.c" of type "text/x-csrc" (7200 bytes)
