Message-ID:
 <PA4P194MB10059D2195A387ACD32CA27E86562@PA4P194MB1005.EURP194.PROD.OUTLOOK.COM>
Date: Fri, 1 Nov 2024 12:37:41 +0000
From: Alasdair McWilliam <alasdair.mcwilliam@...look.com>
To: Thorsten Leemhuis <linux@...mhuis.info>,
 Maciej Fijalkowski <maciej.fijalkowski@...el.com>
Cc: Magnus Karlsson <magnus.karlsson@...il.com>,
 "xdp-newbies@...r.kernel.org" <xdp-newbies@...r.kernel.org>,
 Linux kernel regressions list <regressions@...ts.linux.dev>,
 Larysa Zaremba <larysa.zaremba@...el.com>,
 Jacob Keller <jacob.e.keller@...el.com>, netdev <netdev@...r.kernel.org>
Subject: Re: ICE + XSK ZC - page faults on 6.1 LTS when process exits?

Good day,

On 27/09/2024 12:32, Thorsten Leemhuis wrote:

> [CCing a few people that were involved in mainlining the culprit
> (8adbf5a42341f6e ("ice: remove af_xdp_zc_qps bitmap") in case they want
> to provide advice]
> 
> On 13.09.24 17:54, Alasdair McWilliam wrote:
>> On 05/09/2024 13:50, Alasdair McWilliam wrote:
>>
>>>> We've been working recently on somewhat related issues and it looks like
>>>> not every commit from [0] has been backported.
>>>>
>>>> $ git log --oneline v6.1.103..v6.1.104 drivers/net/ethernet/intel/ice/
>>>> 5a80b682e3e1 ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
>>>> 8782f0fcb19d ice: replace synchronize_rcu with synchronize_net
>>>> 15115033f056 ice: don't busy wait for Rx queue disable in ice_qp_dis()
>>>> 3dbc58774e58 ice: respect netif readiness in AF_XDP ZC related ndo's
>>>>
>>>> can you apply the rest of it on top of 6.1.107 and see the result?
>>
>>> The first one I've attempted doesn't apply cleanly to 6.1.107.
>>>
>>> Eg: d59227179949 ("ice: modify error handling when setting XSK pool in
>>> ndo_bpf"). The above looks to have been based on code from around 6.8 or
>>> 6.9 where the makeup of routines like ice_qp_ena() has changed. Looks
>>> like this happened around a292ba981324 ("ice: make ice_vsi_cfg_txq()
>>> static").
>>>
>>> Should I try and apply a292ba981324 as well?
>>
>> I just wondered if there was perhaps any further feedback on the above.
> 
> Hmmm. No reply afaics -- but that's how it is sometimes with
> stable/longterm kernels series, as mainline developers are not required
> to participate in their development.
> 
> Still it would be good to fix the problem. So unless the developers come
> up with plan, it might be best to just revert a62c50545b4d in 6.1.y;
> guess asking Greg to do so might be best way ahead unless some solutions
> comes into sight within a few days.
>

It's been a while since I last looked at this due to other commitments,
but I accidentally hit the fault again while testing the latest 6.6 LTS
for a new feature of our software. (I forgot to revert the commit
"ice: remove af_xdp_zc_qps bitmap" in our build system.)

This led me to wonder about the current version, and I can trigger the
same crash on 6.11.5 [3].

Reverting "ice: remove af_xdp_zc_qps bitmap" [1] in the current mainline
is a little more complicated, as commit ebc33a3f8d0a ("ice: improve
updating ice_{t,r}x_ring::xsk_pool") also changes the same code, so the
revert doesn't apply cleanly.

I have adjusted the revert so that the patch below [2] applies cleanly
to 6.11.5 and 6.12-rc5, and it seems to fix the fault.

Thought I'd bubble this up as it's definitely still an issue in the
mainline kernel as of now.

Thanks
Alasdair

[1] Commit adbf5a42341f6ea038d3626cd4437d9f0ad0b2dd

[2]
https://github.com/OpenSource-THG/kernel-patches/tree/main/2024-11-ice-xskzc-page-fault

[3] 6.11.5 oops

[  565.069120] BUG: unable to handle page fault for address: ffffa566707380c4
[  565.069144] #PF: supervisor read access in kernel mode
[  565.069155] #PF: error_code(0x0000) - not-present page
[  565.069167] PGD 100000067 P4D 100000067 PUD 20ef17067 PMD 0
[  565.069183] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
[  565.069195] CPU: 7 UID: 0 PID: 6967 Comm: tlndd.bin Kdump: loaded Tainted: G            E 6.11.5-1.thg.836e8867d7.241031.135507.el9.x86_64 #1
[  565.069220] Tainted: [E]=UNSIGNED_MODULE
[  565.069228] Hardware name: Supermicro SYS-1028R-TDW/X10DDW-i, BIOS 3.2 12/16/2019
[  565.069241] RIP: 0010:ice_xsk_clean_rx_ring+0x37/0x110 [ice]
[  565.069338] Code: 55 53 48 83 ec 08 44 0f b7 af a4 00 00 00 0f b7 af a2 00 00 00 66 41 39 ed 74 33 48 89 fb 48 8b 4b 38 41 0f b7 c5 4c 8b 34 c1 <41> f6 46 34 01 75 30 4c 89 f7 41 83 c5 01 e8 f6 0c 7e ce 31 c0 66
[  565.069365] RSP: 0018:ffffa5660f8f36d8 EFLAGS: 00010293
[  565.069375] RAX: 0000000000000000 RBX: ffff8bb105d38600 RCX: ffff8bb184930000
[  565.069387] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8bb105d38600
[  565.069400] RBP: 00000000000007ff R08: 000000000000050b R09: 0000000000000000
[  565.069411] R10: ffff8bb10f910000 R11: 0000000000000020 R12: 0000000000000004
[  565.069422] R13: 0000000000000000 R14: ffffa56670738090 R15: ffff8bb1116b5740
[  565.069434] FS:  00007f677a5d1dc0(0000) GS:ffff8bb85fd80000(0000) knlGS:0000000000000000
[  565.069447] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  565.069457] CR2: ffffa566707380c4 CR3: 0000000120164005 CR4: 00000000001706f0
[  565.069470] Call Trace:
[  565.069480]  <TASK>
[  565.069489]  ? __die+0x20/0x70
[  565.069504]  ? page_fault_oops+0x80/0x150
[  565.069517]  ? exc_page_fault+0xcd/0x170
[  565.069531]  ? asm_exc_page_fault+0x22/0x30
[  565.069546]  ? ice_xsk_clean_rx_ring+0x37/0x110 [ice]
[  565.069598]  ice_clean_rx_ring+0x16e/0x190 [ice]
[  565.069650]  ice_down+0x2f8/0x3c0 [ice]
[  565.069692]  ice_xdp_setup_prog+0x193/0x460 [ice]
[  565.069734]  ice_xdp+0x7a/0xb0 [ice]
[  565.069774]  ? __pfx_ice_xdp+0x10/0x10 [ice]
[  565.069813]  dev_xdp_install+0xc7/0x100
[  565.069829]  dev_xdp_attach+0x205/0x5d0
[  565.069841]  do_setlink+0x7d3/0xc20
[  565.069853]  ? dequeue_skb+0x80/0x4f0
[  565.069866]  ? __nla_validate_parse+0x125/0x1d0
[  565.069880]  __rtnl_newlink+0x4f7/0x630
[  565.069892]  ? __kmalloc_cache_noprof+0x225/0x2b0
[  565.069905]  rtnl_newlink+0x44/0x70
[  565.069915]  rtnetlink_rcv_msg+0x15c/0x410
[  565.069928]  ? avc_has_perm_noaudit+0x67/0xf0
[  565.069943]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[  565.069956]  netlink_rcv_skb+0x57/0x100
[  565.069969]  netlink_unicast+0x246/0x370
[  565.069980]  netlink_sendmsg+0x1f6/0x430
[  565.069991]  ____sys_sendmsg+0x3be/0x3f0
[  565.070003]  ? import_iovec+0x16/0x20
[  565.070015]  ? copy_msghdr_from_user+0x6d/0xa0
[  565.070028]  ___sys_sendmsg+0x88/0xd0
[  565.070038]  ? __memcg_slab_free_hook+0xd5/0x120
[  565.070050]  ? __inode_wait_for_writeback+0x7d/0xf0
[  565.070065]  ? mod_objcg_state+0xc9/0x2f0
[  565.070076]  __sys_sendmsg+0x59/0xa0
[  565.070086]  ? syscall_trace_enter+0xfb/0x190
[  565.070098]  do_syscall_64+0x60/0x180
[  565.070111]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[  565.070126] RIP: 0033:0x7f677ab0f94d
[  565.070136] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 0a 67 f7 ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 5e 67 f7 ff 48
[  565.070164] RSP: 002b:00007ffd1e4f7a60 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[  565.070178] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f677ab0f94d
[  565.070191] RDX: 0000000000000000 RSI: 000000001d698848 RDI: 000000000000000a
[  565.070203] RBP: 000000001d5350e0 R08: 0000000000000000 R09: 0000000000465f98
[  565.070215] R10: 0000000000200000 R11: 0000000000000293 R12: 000000001d535110
[  565.070227] R13: 000000000051d798 R14: 000000001d698830 R15: 000000001d5384b0
[  565.070240]  </TASK>
[  565.070248] Modules linked in: bonding(E) tls(E) nft_fib_inet(E)
nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E)
nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E)
nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E)
nf_defrag_ipv4(E) rfkill(E) ip_set(E) nf_tables(E) libcrc32c(E)
nfnetlink(E) vfat(E) fat(E) intel_rapl_msr(E) intel_rapl_common(E)
sb_edac(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E)
kvm_intel(E) ipmi_ssif(E) kvm(E) iTCO_wdt(E) intel_pmc_bxt(E)
iTCO_vendor_support(E) rapl(E) intel_cstate(E) intel_uncore(E) ast(E)
i2c_i801(E) pcspkr(E) mei_me(E) drm_shmem_helper(E) mxm_wmi(E)
drm_kms_helper(E) i2c_mux(E) mei(E) i2c_smbus(E) lpc_ich(E) ioatdma(E)
acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) joydev(E) ipmi_devintf(E)
ipmi_msghandler(E) acpi_pad(E) drm(E) fuse(E) ext4(E) mbcache(E)
jbd2(E) sd_mod(E) sg(E) ice(E) ahci(E) crct10dif_pclmul(E)
crc32_pclmul(E) crc32c_intel(E) libahci(E) polyval_clmulni(E) igb(E)
polyval_generic(E) libata(E) ghash_clmulni_intel(E)
[  565.070304]  i2c_algo_bit(E) dca(E) libie(E) wmi(E) dm_mirror(E)
dm_region_hash(E) dm_log(E) dm_mod(E)
[  565.071430] CR2: ffffa566707380c4
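
For anyone reading the trace above, the following is a minimal sketch (not part of the original report) that cross-checks the faulting address against the registers. It assumes the standard x86-64 encoding of the bytes marked at RIP in the first "Code:" line (<41> f6 46 34 01) and uses only the values printed in the oops. The helper name decode_test_disp8 is hypothetical and handles only this one instruction form.

```python
# Values copied from the 6.11.5 oops above.
CR2 = 0xFFFFA566707380C4  # faulting address reported by the kernel
R14 = 0xFFFFA56670738090  # R14 at the time of the fault

# Bytes starting at the marked RIP position: <41> f6 46 34 01
faulting_insn = bytes.fromhex("41f6463401")


def decode_test_disp8(insn: bytes):
    """Decode 'REX.B f6 /0 ib' with a disp8 memory operand (assumed form).

    Returns (base register number, displacement, immediate mask).
    """
    rex, opcode, modrm, disp8, imm8 = insn
    assert rex & 0xF0 == 0x40, "expected a REX prefix"
    assert opcode == 0xF6, "expected opcode f6 (group 3, 8-bit operand)"
    mod = modrm >> 6
    reg = (modrm >> 3) & 7
    rm = (modrm & 7) | ((rex & 1) << 3)  # REX.B extends the base register
    assert mod == 1 and reg == 0, "expected TEST r/m8, imm8 with disp8"
    return rm, disp8, imm8


base_reg, disp, mask = decode_test_disp8(faulting_insn)
print(f"testb ${mask:#x}, {disp:#x}(%r{base_reg})")
print(f"CR2 - R14 = {CR2 - R14:#x}")
```

Under that assumed decoding, the faulting instruction tests bit 0 of the byte at offset 0x34 from R14, and CR2 equals R14 + 0x34, so the pointer held in R14 is the one that no longer maps to a present page. R14 also shares its upper bits with RSP rather than with the ffff8bb1... pointers in the other registers, which may suggest a pointer into a mapping that has since been torn down, although the oops alone does not prove this.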

