Message-ID: <ZyhznlGIjio3saic@lzaremba-mobl.ger.corp.intel.com>
Date: Mon, 4 Nov 2024 08:11:26 +0100
From: Larysa Zaremba <larysa.zaremba@...el.com>
To: Alasdair McWilliam <alasdair.mcwilliam@...look.com>
CC: Thorsten Leemhuis <linux@...mhuis.info>, Maciej Fijalkowski
	<maciej.fijalkowski@...el.com>, Magnus Karlsson <magnus.karlsson@...il.com>,
	"xdp-newbies@...r.kernel.org" <xdp-newbies@...r.kernel.org>, "Linux kernel
 regressions list" <regressions@...ts.linux.dev>, Jacob Keller
	<jacob.e.keller@...el.com>, netdev <netdev@...r.kernel.org>
Subject: Re: ICE + XSK ZC - page faults on 6.1 LTS when process exits?

On Fri, Nov 01, 2024 at 12:37:41PM +0000, Alasdair McWilliam wrote:
> Good day,
> 
> On 27/09/2024 12:32, Thorsten Leemhuis wrote:
> 
> > [CCing a few people that were involved in mainlining the culprit
> > (adbf5a42341f6e ("ice: remove af_xdp_zc_qps bitmap")) in case they want
> > to provide advice]
> > 
> > On 13.09.24 17:54, Alasdair McWilliam wrote:
> >> On 05/09/2024 13:50, Alasdair McWilliam wrote:
> >>
> >>>> We've been working recently on somewhat related issues and it looks like
> >>>> not every commit from [0] has been backported.
> >>>>
> >>>> $ git log --oneline v6.1.103..v6.1.104 drivers/net/ethernet/intel/ice/
> >>>> 5a80b682e3e1 ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
> >>>> 8782f0fcb19d ice: replace synchronize_rcu with synchronize_net
> >>>> 15115033f056 ice: don't busy wait for Rx queue disable in ice_qp_dis()
> >>>> 3dbc58774e58 ice: respect netif readiness in AF_XDP ZC related ndo's
> >>>>
> >>>> can you apply the rest of it on top of 6.1.107 and see the result?
> >>
> >>> The first one I've attempted doesn't apply cleanly to 6.1.107.
> >>>
> >>> Eg: d59227179949 ("ice: modify error handling when setting XSK pool in
> >>> ndo_bpf"). The above looks to have been based on code from around 6.8 or
> >>> 6.9 where the makeup of routines like ice_qp_ena() has changed. Looks
> >>> like this happened around a292ba981324 ("ice: make ice_vsi_cfg_txq()
> >>> static").
> >>>
> >>> Should I try and apply a292ba981324 as well?
> >>
> >> I just wondered if there was perhaps any further feedback on the above.
> > 
> > Hmmm. No reply afaics -- but that's how it is sometimes with
> > stable/longterm kernels series, as mainline developers are not required
> > to participate in their development.
> > 
> > Still it would be good to fix the problem. So unless the developers come
> > up with a plan, it might be best to just revert a62c50545b4d in 6.1.y;
> > guess asking Greg to do so might be the best way ahead unless some
> > solution comes into sight within a few days.
> >
> 
> It's been a minute since I've looked at this due to other commitments
> but accidentally bumped into the fault again when testing the latest 6.6
> LTS for a new feature of our software. (I forgot to revert the commit
> for "ice: remove af_xdp_zc_qps bitmap" in our build system.)
> 
> This led me to wonder about the current version, and I can trigger the
> same crash on 6.11.5 [3].
> 
> Reverting "ice: remove af_xdp_zc_qps bitmap" [1] in the current mainline
> is a little more complicated as commit ebc33a3f8d0a ("ice: improve
> updating ice_{t,r}x_ring::xsk_pool") also changes things a little so the
> reversion doesn't work cleanly.
> 
> I have tweaked everything a little so that the patch below [2] applies
> cleanly to 6.11.5 and 6.12-rc5 and seems to fix the fault.
> 
> Thought I'd bubble this up as it's definitely still an issue in the
> mainline kernel as of now.
> 
> Thanks
> Alasdair
> 

Hello,
Could you please share the reproduction steps? I will look into this.

Larysa

> [1] Commit adbf5a42341f6ea038d3626cd4437d9f0ad0b2dd
> 
> [2]
> https://github.com/OpenSource-THG/kernel-patches/tree/main/2024-11-ice-xskzc-page-fault
> 
> [3] 6.11.5 oops
> 
> [  565.069120] BUG: unable to handle page fault for address:
> ffffa566707380c4
> [  565.069144] #PF: supervisor read access in kernel mode
> [  565.069155] #PF: error_code(0x0000) - not-present page
> [  565.069167] PGD 100000067 P4D 100000067 PUD 20ef17067 PMD 0
> [  565.069183] Oops: Oops: 0000 [#1] PREEMPT SMP PTI
> [  565.069195] CPU: 7 UID: 0 PID: 6967 Comm: tlndd.bin Kdump: loaded
> Tainted: G            E
> 6.11.5-1.thg.836e8867d7.241031.135507.el9.x86_64 #1
> [  565.069220] Tainted: [E]=UNSIGNED_MODULE
> [  565.069228] Hardware name: Supermicro SYS-1028R-TDW/X10DDW-i, BIOS
> 3.2 12/16/2019
> [  565.069241] RIP: 0010:ice_xsk_clean_rx_ring+0x37/0x110 [ice]
> [  565.069338] Code: 55 53 48 83 ec 08 44 0f b7 af a4 00 00 00 0f b7 af
> a2 00 00 00 66 41 39 ed 74 33 48 89 fb 48 8b 4b 38 41 0f b7 c5 4c 8b 34
> c1 <41> f6 46 34 01 75 30 4c 89 f7 41 83 c5 01 e8 f6 0c 7e ce 31 c0 66
> [  565.069365] RSP: 0018:ffffa5660f8f36d8 EFLAGS: 00010293
> [  565.069375] RAX: 0000000000000000 RBX: ffff8bb105d38600 RCX:
> ffff8bb184930000
> [  565.069387] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
> ffff8bb105d38600
> [  565.069400] RBP: 00000000000007ff R08: 000000000000050b R09:
> 0000000000000000
> [  565.069411] R10: ffff8bb10f910000 R11: 0000000000000020 R12:
> 0000000000000004
> [  565.069422] R13: 0000000000000000 R14: ffffa56670738090 R15:
> ffff8bb1116b5740
> [  565.069434] FS:  00007f677a5d1dc0(0000) GS:ffff8bb85fd80000(0000)
> knlGS:0000000000000000
> [  565.069447] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  565.069457] CR2: ffffa566707380c4 CR3: 0000000120164005 CR4:
> 00000000001706f0
> [  565.069470] Call Trace:
> [  565.069480]  <TASK>
> [  565.069489]  ? __die+0x20/0x70
> [  565.069504]  ? page_fault_oops+0x80/0x150
> [  565.069517]  ? exc_page_fault+0xcd/0x170
> [  565.069531]  ? asm_exc_page_fault+0x22/0x30
> [  565.069546]  ? ice_xsk_clean_rx_ring+0x37/0x110 [ice]
> [  565.069598]  ice_clean_rx_ring+0x16e/0x190 [ice]
> [  565.069650]  ice_down+0x2f8/0x3c0 [ice]
> [  565.069692]  ice_xdp_setup_prog+0x193/0x460 [ice]
> [  565.069734]  ice_xdp+0x7a/0xb0 [ice]
> [  565.069774]  ? __pfx_ice_xdp+0x10/0x10 [ice]
> [  565.069813]  dev_xdp_install+0xc7/0x100
> [  565.069829]  dev_xdp_attach+0x205/0x5d0
> [  565.069841]  do_setlink+0x7d3/0xc20
> [  565.069853]  ? dequeue_skb+0x80/0x4f0
> [  565.069866]  ? __nla_validate_parse+0x125/0x1d0
> [  565.069880]  __rtnl_newlink+0x4f7/0x630
> [  565.069892]  ? __kmalloc_cache_noprof+0x225/0x2b0
> [  565.069905]  rtnl_newlink+0x44/0x70
> [  565.069915]  rtnetlink_rcv_msg+0x15c/0x410
> [  565.069928]  ? avc_has_perm_noaudit+0x67/0xf0
> [  565.069943]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
> [  565.069956]  netlink_rcv_skb+0x57/0x100
> [  565.069969]  netlink_unicast+0x246/0x370
> [  565.069980]  netlink_sendmsg+0x1f6/0x430
> [  565.069991]  ____sys_sendmsg+0x3be/0x3f0
> [  565.070003]  ? import_iovec+0x16/0x20
> [  565.070015]  ? copy_msghdr_from_user+0x6d/0xa0
> [  565.070028]  ___sys_sendmsg+0x88/0xd0
> [  565.070038]  ? __memcg_slab_free_hook+0xd5/0x120
> [  565.070050]  ? __inode_wait_for_writeback+0x7d/0xf0
> [  565.070065]  ? mod_objcg_state+0xc9/0x2f0
> [  565.070076]  __sys_sendmsg+0x59/0xa0
> [  565.070086]  ? syscall_trace_enter+0xfb/0x190
> [  565.070098]  do_syscall_64+0x60/0x180
> [  565.070111]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [  565.070126] RIP: 0033:0x7f677ab0f94d
> [  565.070136] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 0a 67
> f7 ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f
> 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 44 24 08 e8 5e 67 f7 ff 48
> [  565.070164] RSP: 002b:00007ffd1e4f7a60 EFLAGS: 00000293 ORIG_RAX:
> 000000000000002e
> [  565.070178] RAX: ffffffffffffffda RBX: 0000000000000000 RCX:
> 00007f677ab0f94d
> [  565.070191] RDX: 0000000000000000 RSI: 000000001d698848 RDI:
> 000000000000000a
> [  565.070203] RBP: 000000001d5350e0 R08: 0000000000000000 R09:
> 0000000000465f98
> [  565.070215] R10: 0000000000200000 R11: 0000000000000293 R12:
> 000000001d535110
> [  565.070227] R13: 000000000051d798 R14: 000000001d698830 R15:
> 000000001d5384b0
> [  565.070240]  </TASK>
> [  565.070248] Modules linked in: bonding(E) tls(E) nft_fib_inet(E)
> nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E)
> nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E)
> nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E)
> nf_defrag_ipv4(E) rfkill(E) ip_set(E) nf_tables(E)
> libcrc32c(E) nfnetlink(E) vfat(E) fat(E) intel_rapl_msr(E)
> intel_rapl_common(E) sb_edac(E) x86_pkg_temp_thermal(E)
> intel_powerclamp(E) coretemp(E) kvm_intel(E) ipmi_ssif(E)
> kvm(E) iTCO_wdt(E) intel_pmc_bxt(E) iTCO_vendor_support(E) rapl(E)
> intel_cstate(E) intel_uncore(E) ast(E) i2c_i801(E) pcspkr(E) mei_me(E)
> drm_shmem_helper(E) mxm_wmi(E) drm_kms_helper(E) i2c_mux(E) mei(E)
> i2c_smbus(E) lpc_ich(E) ioatdma(E)
> acpi_power_meter(E) ipmi_si(E) acpi_ipmi(E) joydev(E)
> ipmi_devintf(E) ipmi_msghandler(E) acpi_pad(E) drm(E) fuse(E) ext4(E)
> mbcache(E) jbd2(E) sd_mod(E) sg(E) ice(E) ahci(E) crct10dif_pclmul(E)
> crc32_pclmul(E) crc32c_intel(E) libahci(E)
> polyval_clmulni(E) igb(E) polyval_generic(E) libata(E)
> ghash_clmulni_intel(E)
> [  565.070304]  i2c_algo_bit(E) dca(E) libie(E) wmi(E) dm_mirror(E)
> dm_region_hash(E) dm_log(E) dm_mod(E)
> [  565.071430] CR2: ffffa566707380c4
> 
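As a quick sanity check on the oops above, here is a minimal Python sketch that parses a few register lines from the log and confirms that the faulting address in CR2 is R14 plus a small displacement. The decoding is a reading of the "Code:" bytes, not something stated in the thread: `4c 8b 34 c1` looks like `mov (%rcx,%rax,8),%r14` and the marked bytes `41 f6 46 34 01` look like `testb $0x1,0x34(%r14)`. This would mean a pointer loaded from an array at RCX was dereferenced at offset 0x34. The helper name `parse_registers` and the trimmed `OOPS` string are illustrative only.

```python
import re

# A few register lines copied from the 6.11.5 oops quoted above
# (wrapped lines rejoined; everything else trimmed for brevity).
OOPS = """
RIP: 0010:ice_xsk_clean_rx_ring+0x37/0x110 [ice]
RAX: 0000000000000000 RBX: ffff8bb105d38600 RCX: ffff8bb184930000
R13: 0000000000000000 R14: ffffa56670738090 R15: ffff8bb1116b5740
CR2: ffffa566707380c4 CR3: 0000000120164005 CR4: 00000000001706f0
"""


def parse_registers(text):
    """Return {name: int} for every 'NAME: <16 hex digits>' pair in text."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]+): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


regs = parse_registers(OOPS)

# Assumption: displacement taken from the bytes at the faulting RIP,
# 41 f6 46 34 01, read as testb $0x1,0x34(%r14).
DISP = 0x34

fault_addr = regs["R14"] + DISP
print(hex(fault_addr), fault_addr == regs["CR2"])
```

If the decoding is right, the computed address matches CR2, which would suggest that the pointer loaded into R14 was already invalid by the time ice_xsk_clean_rx_ring() dereferenced it.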
