Message-ID: <kptrliv7cflmaven5mcfn3bywpwe7zrevw4qvuei6eqq3ubcaj@3n33v7w4bgfj>
Date: Fri, 16 Jan 2026 13:27:21 +0100
From: Jan Kara <jack@...e.cz>
To: Chenglong Tang <chenglongtang@...gle.com>
Cc: Amir Goldstein <amir73il@...il.com>, viro@...iv.linux.org.uk,
brauner@...nel.org, Jan Kara <jack@...e.cz>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, miklos@...redi.hu
Subject: Re: [Regression 6.12] NULL pointer dereference in submit_bio_noacct
via backing_file_read_iter
Hi!
On Thu 15-01-26 21:56:06, Chenglong Tang wrote:
> [Follow Up] We have an important update regarding the
> submit_bio_noacct panic we reported earlier.
>
> To rule out the Integrity Measurement Architecture (IMA) as the root
> cause, we disabled IMA verification in the workload configuration. The
> kernel panic persisted with the exact same signature (RIP:
> 0010:submit_bio_noacct+0x21d), but the trigger path has changed.
OK, can you please feed this through addr2line so that we know what exactly
is wrong with the bio? Thanks!
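For reference, the kernel's faddr2line wrapper takes the symbol+offset
notation from the oops directly (a sketch, assuming a vmlinux with debug
info matching this exact 6.12.55+ build):

  $ ./scripts/faddr2line vmlinux submit_bio_noacct+0x21d/0x470

Plain addr2line on the raw RIP works too, but then you have to account
for the KASLR offset yourself.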
Also do you have a chance to try with some recent upstream kernel? The
crash might also be specific to the set of backports in that particular
stable branch...
Honza
>
> New Stack Traces (Non-IMA): We are now observing the crash via two
> standard filesystem paths.
>
> Stack Trace:
> Most failures are still similar:
> [  158.519909] BUG: kernel NULL pointer dereference, address: 0000000000000156
> [  158.542610] #PF: supervisor read access in kernel mode
> [  158.565011] #PF: error_code(0x0000) - not-present page
> [  158.583855] PGD 800000007c7da067 P4D 800000007c7da067 PUD 7c7db067 PMD 0
> [  158.590940] Oops: Oops: 0000 [#1] SMP PTI
> [  158.598950] CPU: 1 UID: 0 PID: 6717 Comm: agent_launcher Tainted: G           O       6.12.55+ #1
> [  158.629624] Tainted: [O]=OOT_MODULE
> [  158.639965] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> [  158.684210] RIP: 0010:submit_bio_noacct+0x21d/0x470
> [  158.705662] Code: 8b 73 48 4d 85 f6 74 55 4c 63 25 46 af 89 01 49 83 fc 06 0f 83 44 02 00 00 4f 8b a4 e6 d0 00 00 00 83 3d 99 ca 7d 01 00 7e 3f <43> 80 bc 3c 56 01 00 00 00 0f 84 28 01 00 00 48 89 df e8 fc 9f 02
> [  158.765443] RSP: 0000:ffffa74c84d53a98 EFLAGS: 00010202
> [  158.771022] RAX: ffffa319b3d6b4f0 RBX: ffffa319bdc9a3c0 RCX: 00000000005e1070
> [  158.778730] RDX: 0000000010300001 RSI: ffffa319b3d6b4f0 RDI: ffffa319bdc9a3c0
> [  158.802189] RBP: ffffa74c84d53ac8 R08: 0000000000001000 R09: ffffa319bdc9a3c0
> [  158.846780] R10: 0000000000000000 R11: 0000000069a1b000 R12: 0000000000000000
> [  158.877737] R13: ffffa31941421f40 R14: ffffa31955419200 R15: 0000000000000000
> [  158.908715] FS:  00000000059efe28(0000) GS:ffffa319bdd00000(0000) knlGS:0000000000000000
> [  158.937522] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  158.958522] CR2: 0000000000000156 CR3: 000000006a20a003 CR4: 00000000003726f0
> [  158.968648] Call Trace:
> [  158.974419]  <TASK>
> [  158.978222]  ext4_mpage_readpages+0x75c/0x790
> [  158.983568]  read_pages+0x9d/0x250
> [  158.987263]  page_cache_ra_unbounded+0xa2/0x1c0
> [  158.992179]  filemap_fault+0x218/0x660
> [  158.996311]  __do_fault+0x4b/0x140
> [  159.000143]  do_pte_missing+0x14f/0x1050
> [  159.018505]  handle_mm_fault+0x886/0xb40
> [  159.063653]  do_user_addr_fault+0x1eb/0x730
> [  159.094465]  exc_page_fault+0x80/0x100
> [  159.116472]  asm_exc_page_fault+0x26/0x30
>
> Though there is a different one:
> [  163.902122] BUG: kernel NULL pointer dereference, address: 0000000000000157
> [  163.955031] #PF: supervisor read access in kernel mode
> [  163.986899] #PF: error_code(0x0000) - not-present page
> [  164.075132] PGD 0 P4D 0
> [  164.085940] Oops: Oops: 0000 [#1] SMP PTI
> [  164.090592] CPU: 0 UID: 0 PID: 399 Comm: jbd2/nvme0n1p1- Tainted: G           O       6.12.55+ #1
> [  164.146188] Tainted: [O]=OOT_MODULE
> [  164.172362] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> [  164.243113] RIP: 0010:submit_bio_noacct+0x21d/0x470
> [  164.276230] Code: 8b 73 48 4d 85 f6 74 55 4c 63 25 46 af 89 01 49 83 fc 06 0f 83 44 02 00 00 4f 8b a4 e6 d0 00 00 00 83 3d 99 ca 7d 01 00 7e 3f <43> 80 bc 3c 56 01 00 00 00 0f 84 28 01 00 00 48 89 df e8 fc 9f 02
> [  164.413258] RSP: 0000:ffffa674004ebc80 EFLAGS: 00010202
> [  164.420124] RAX: ffff9381c25d4790 RBX: ffff9381d0e5e540 RCX: 00000000000301c8
> [  164.464474] RDX: 0000000010300001 RSI: ffff9381c25d4790 RDI: ffff9381d0e5e540
> [  164.542751] RBP: ffffa674004ebcb0 R08: 0000000000000000 R09: 0000000000000000
> [  164.578174] R10: 0000000000000000 R11: ffffffff8433e7a0 R12: 0000000000000000
> [  164.595801] R13: ffff9381c1425780 R14: ffff9381c196d400 R15: 0000000000000001
> [  164.626548] FS:  0000000000000000(0000) GS:ffff93823dc00000(0000) knlGS:0000000000000000
> [  164.665104] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  164.757565] CR2: 0000000000000157 CR3: 000000007c678003 CR4: 00000000003726f0
> [  164.831021] Call Trace:
> [  164.851014]  <TASK>
> [  164.872000]  jbd2_journal_commit_transaction+0x612/0x17e0
> [  164.914012]  ? sched_clock+0xd/0x20
> [  164.963930]  ? _raw_spin_unlock_irqrestore+0x12/0x30
> [  164.989978]  ? __try_to_del_timer_sync+0x122/0x160
> [  165.029451]  kjournald2+0xb1/0x220
> [  165.033558]  ? __pfx_autoremove_wake_function+0x10/0x10
> [  165.044022]  kthread+0x122/0x140
> [  165.048012]  ? __pfx_kjournald2+0x10/0x10
> [  165.052944]  ? __pfx_kthread+0x10/0x10
> [  165.057597]  ret_from_fork+0x3f/0x50
> [  165.062127]  ? __pfx_kthread+0x10/0x10
> [  165.079674]  ret_from_fork_asm+0x1a/0x30
> [  165.113023]  </TASK>
> [  165.131001] Modules linked in: nft_chain_nat xt_MASQUERADE nf_nat xt_addrtype nft_compat nf_tables kvm_intel kvm irqbypass crc32c_intel aesni_intel crypto_simd cryptd loadpin_trigger(O) fuse
> [  165.269971] CR2: 0000000000000157
> [  165.306980] ---[ end trace 0000000000000000 ]---
> [  165.361889] RIP: 0010:submit_bio_noacct+0x21d/0x470
> [  165.406957] Code: 8b 73 48 4d 85 f6 74 55 4c 63 25 46 af 89 01 49 83 fc 06 0f 83 44 02 00 00 4f 8b a4 e6 d0 00 00 00 83 3d 99 ca 7d 01 00 7e 3f <43> 80 bc 3c 56 01 00 00 00 0f 84 28 01 00 00 48 89 df e8 fc 9f 02
> [  165.558880] RSP: 0000:ffffa674004ebc80 EFLAGS: 00010202
> [  165.575239] RAX: ffff9381c25d4790 RBX: ffff9381d0e5e540 RCX: 00000000000301c8
> [  165.590012] RDX: 0000000010300001 RSI: ffff9381c25d4790 RDI: ffff9381d0e5e540
> [  165.597793] RBP: ffffa674004ebcb0 R08: 0000000000000000 R09: 0000000000000000
> [  165.608408] R10: 0000000000000000 R11: ffffffff8433e7a0 R12: 0000000000000000
> [  165.616602] R13: ffff9381c1425780 R14: ffff9381c196d400 R15: 0000000000000001
> [  165.631823] FS:  0000000000000000(0000) GS:ffff93823dc00000(0000) knlGS:0000000000000000
> [  165.653088] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  165.668488] CR2: 0000000000000157 CR3: 000000007c678003 CR4: 00000000003726f0
> [  165.686744] Kernel panic - not syncing: Fatal exception
>
> This confirms the issue is not specific to IMA; it appears to be a
> race condition in the block I/O layer or the ext4 subsystem under high
> concurrency.
>
> Since the crash occurs at the exact same instruction offset in
> submit_bio_noacct regardless of the caller (IMA, page fault, or JBD2),
> we suspect that a bio or request_queue structure is being corrupted, or
> that a NULL pointer is coming from the underlying block device driver
> (NVMe) or device mapper.
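>
> One way to narrow this down (a sketch, assuming a vmlinux with debug
> info for this exact build): the fault addresses 0x156/0x157 are small
> offsets off a NULL base pointer, so mapping the offset onto a struct
> layout should identify the member being dereferenced, e.g.:
>
>   # 0x156 = 342 decimal; look for a member at/around that offset
>   $ pahole -C request_queue vmlinux | grep ' 34[0-9] '
>
> (Which struct to inspect depends on what addr2line resolves the RIP to.)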
>
> Best,
>
> Chenglong
>
> On Thu, Jan 15, 2026 at 6:56 PM Chenglong Tang <chenglongtang@...gle.com> wrote:
> >
> > Hi Amir,
> >
> > Thanks for the guidance. Using the specific order of the 8 commits
> > (applying the ovl_real_fdget refactors before the fix consumers)
> > resolved the boot-time NULL pointer panic. The system now boots
> > successfully.
> >
> > However, we are still hitting the original kernel panic during runtime
> > tests (specifically a CloudSQL workload).
> >
> > Current Commit Chain (Applied to 6.12):
> >
> > 76d83345a056 (HEAD -> main-R125-cos-6.12) ovl: convert
> > ovl_real_fdget() callers to ovl_real_file()
> > 740bdf920b15 ovl: convert ovl_real_fdget_path() callers to ovl_real_file_path()
> > 100b71ecb237 fs/backing_file: fix wrong argument in callback
> > b877bca6858d ovl: store upper real file in ovl_file struct
> > 595aac630596 ovl: allocate a container struct ovl_file for ovl private context
> > 218ec543008d ovl: do not open non-data lower file for fsync
> > 6def078942e2 ovl: use wrapper ovl_revert_creds()
> > fe73aad71936 backing-file: clean up the API
> >
> > So none of these 8 commits fixes the problem. Let me explain what's
> > going on here:
> >
> > We are reporting a rare but persistent kernel panic (~0.02% failure
> > rate) occurring during container initialization on Linux 6.12.55+
> > (x86_64); 6.6.x is not affected. The panic is a NULL pointer
> > dereference in submit_bio_noacct, triggered specifically when the
> > Integrity Measurement Architecture (IMA) calculates a file hash during
> > a runc create operation.
> >
> > We have isolated the crash to a specific container (ncsa) starting up
> > during a high-concurrency boot sequence.
> >
> > Environment
> > * Kernel: Linux 6.12.55+ (x86_64) / Container-Optimized OS
> > * Workload: Cloud SQL instance initialization (heavy concurrent runc
> > operations managed by systemd).
> > * Filesystem: Ext4 backed by NVMe.
> > * Security: AppArmor enabled, IMA (Integrity Measurement Architecture) active.
> >
> > The Failure Pattern (in every crash instance, the sequence is identical):
> > * systemd initiates the startup of the ncsainit container.
> > * runc executes the create command:
> >   $ runc --root /var/lib/cloudsql/runc/root create --bundle \
> >       /var/lib/cloudsql/runc/bundles/ncsa ...
> >
> > Immediately after this command is logged, the kernel panics.
> >
> > Stacktrace:
> > [ 186.938290] BUG: kernel NULL pointer dereference, address: 0000000000000156
> > [ 186.952203] #PF: supervisor read access in kernel mode
> > [ 186.995248] Oops: Oops: 0000 [#1] SMP PTI
> > [ 187.035946] CPU: 1 UID: 0 PID: 6764 Comm: runc:[2:INIT] Tainted: G
> > O 6.12.55+ #1
> > [ 187.081681] RIP: 0010:submit_bio_noacct+0x21d/0x470
> > [ 187.412981] Call Trace:
> > [ 187.415751] <TASK>
> > [ 187.418141] ext4_mpage_readpages+0x75c/0x790
> > [ 187.429011] read_pages+0x9d/0x250
> > [ 187.450963] page_cache_ra_unbounded+0xa2/0x1c0
> > [ 187.466083] filemap_get_pages+0x231/0x7a0
> > [ 187.474687] filemap_read+0xf6/0x440
> > [ 187.532345] integrity_kernel_read+0x34/0x60
> > [ 187.560740] ima_calc_file_hash+0x1c1/0x9b0
> > [ 187.608175] ima_collect_measurement+0x1b6/0x310
> > [ 187.613102] process_measurement+0x4ea/0x850
> > [ 187.617788] ima_bprm_check+0x5b/0xc0
> > [ 187.635403] bprm_execve+0x203/0x560
> > [ 187.645058] do_execveat_common+0x2fb/0x360
> > [ 187.649730] __x64_sys_execve+0x3e/0x50
> >
> > Panic Analysis: The stack trace indicates a race condition where
> > ima_bprm_check (triggered by executing the container binary) attempts
> > to verify the file. This calls ima_calc_file_hash ->
> > ext4_mpage_readpages, which submits a bio to the block layer.
> >
> > The crash occurs in submit_bio_noacct when it attempts to dereference
> > a member of the bio structure (likely bio->bi_bdev or the request
> > queue), suggesting the underlying device or queue structure is either
> > uninitialized or has been torn down while the IMA check was still in
> > flight.
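> >
> > To pinpoint the exact faulting dereference, the Code: bytes from the
> > oops can be disassembled with the kernel's decodecode script (a
> > sketch; oops.txt is just an example file holding the RIP and Code:
> > lines copied from the log):
> >
> >   $ ./scripts/decodecode < oops.txt
> >
> > The instruction flagged as trapping, combined with the register dump,
> > shows which base register held NULL.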
> >
> > Context on Concurrency: This workload involves systemd starting
> > multiple sidecar containers (logging, monitoring, coroner, etc.)
> > simultaneously. We suspect this high-concurrency startup creates the
> > IO/CPU contention required to hit this race window. However, the crash
> > consistently happens only on the ncsa container, implying something
> > specific about its launch configuration or timing makes it the
> > reliable victim.
> >
> > Best,
> >
> > Chenglong
> >
> > On Wed, Jan 14, 2026 at 3:11 AM Amir Goldstein <amir73il@...il.com> wrote:
> > >
> > > On Wed, Jan 14, 2026 at 1:53 AM Chenglong Tang <chenglongtang@...gle.com> wrote:
> > > >
> > > > Hi OverlayFS Maintainers,
> > > >
> > > > This is from Container Optimized OS in Google Cloud.
> > > >
> > > > We are reporting a reproducible kernel panic on Kernel 6.12 involving
> > > > a NULL pointer dereference in submit_bio_noacct.
> > > >
> > > > The Issue: The panic occurs intermittently (approx. 5 failures in 1000
> > > > runs) during a specific PostgreSQL client test
> > > > (postgres_client_test_postgres15_ctrdncsa) on Google
> > > > Container-Optimized OS. The stack trace shows the crash happens when
> > > > IMA (ima_calc_file_hash) attempts to read a file from OverlayFS via
> > > > the new-in-6.12 backing_file_read_iter helper.
> > > >
> > > > It appears to be a race condition where the underlying block device is
> > > > detached (becoming NULL) while the backing_file wrapper is still
> > > > attempting to submit a read bio during container teardown.
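> > > >
> > > > One way to check the teardown theory (a minimal ftrace sketch,
> > > > untested) is to log disk teardown so it can be correlated with the
> > > > crash timestamp:
> > > >
> > > >   $ cd /sys/kernel/tracing
> > > >   $ echo del_gendisk > set_ftrace_filter
> > > >   $ echo function > current_tracer
> > > >   $ cat trace_pipe > /var/log/ftrace.log &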
> > > >
> > > > Stack Trace:
> > > > [   75.793015] BUG: kernel NULL pointer dereference, address: 0000000000000156
> > > > [ 75.822539] #PF: supervisor read access in kernel mode
> > > > [ 75.849332] #PF: error_code(0x0000) - not-present page
> > > > [ 75.862775] PGD 7d012067 P4D 7d012067 PUD 7d013067 PMD 0
> > > > [ 75.884283] Oops: Oops: 0000 [#1] SMP NOPTI
> > > > [ 75.902274] CPU: 1 UID: 0 PID: 6476 Comm: helmd Tainted: G
> > > > O 6.12.55+ #1
> > > > [ 75.928903] Tainted: [O]=OOT_MODULE
> > > > [ 75.942484] Hardware name: Google Google Compute Engine/Google
> > > > Compute Engine, BIOS Google 01/01/2011
> > > > [ 75.965868] RIP: 0010:submit_bio_noacct+0x21d/0x470
> > > > [ 75.978340] Code: 8b 73 48 4d 85 f6 74 55 4c 63 25 b6 ad 89 01 49
> > > > 83 fc 06 0f 83 44 02 00 00 4f 8b a4 e6 d0 00 00 00 83 3d 09 c9 7d 01
> > > > 00 7e 3f <43> 80 bc 3c 56 01 00 00 00 0f 84 28 01 00 00 48 89 df e8 4c
> > > > a0 02
> > > > [ 76.035847] RSP: 0018:ffffa41183463880 EFLAGS: 00010202
> > > > [ 76.050141] RAX: ffff9d4ec1a81a78 RBX: ffff9d4f3811e6c0 RCX: 00000000009410a0
> > > > [ 76.065176] RDX: 0000000010300001 RSI: ffff9d4ec1a81a78 RDI: ffff9d4f3811e6c0
> > > > [ 76.089292] RBP: ffffa411834638b0 R08: 0000000000001000 R09: ffff9d4f3811e6c0
> > > > [ 76.110878] R10: 2000000000000000 R11: ffffffff8a33e700 R12: 0000000000000000
> > > > [ 76.139068] R13: ffff9d4ec1422bc0 R14: ffff9d4ec2507000 R15: 0000000000000000
> > > > [ 76.168391] FS: 0000000008df7f40(0000) GS:ffff9d4f3dd00000(0000)
> > > > knlGS:0000000000000000
> > > > [ 76.179024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 76.184951] CR2: 0000000000000156 CR3: 000000007d01c006 CR4: 0000000000370ef0
> > > > [ 76.192352] Call Trace:
> > > > [ 76.194981] <TASK>
> > > > [ 76.197257] ext4_mpage_readpages+0x75c/0x790
> > > > [ 76.201794] read_pages+0xa0/0x250
> > > > [ 76.205373] page_cache_ra_unbounded+0xa2/0x1c0
> > > > [ 76.232608] filemap_get_pages+0x16b/0x7a0
> > > > [ 76.254151] ? srso_alias_return_thunk+0x5/0xfbef5
> > > > [ 76.260523] filemap_read+0xf6/0x440
> > > > [ 76.264540] do_iter_readv_writev+0x17e/0x1c0
> > > > [ 76.275427] vfs_iter_read+0x8a/0x140
> > > > [ 76.279272] backing_file_read_iter+0x155/0x250
> > > > [ 76.284425] ovl_read_iter+0xd7/0x120
> > > > [ 76.288270] ? __pfx_ovl_file_accessed+0x10/0x10
> > > > [ 76.293069] vfs_read+0x2b1/0x300
> > > > [ 76.296835] ksys_read+0x75/0xe0
> > > > [ 76.300246] do_syscall_64+0x61/0x130
> > > > [ 76.304173] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > >
> > > > Our Findings:
> > > >
> > > > Not an Ext4 regression: We verified that reverting "ext4: reduce stack
> > > > usage in ext4_mpage_readpages()" does not resolve the panic.
> > > >
> > > > Suspected Fix: We suspect upstream commit 18e48d0e2c7b ("ovl: store
> > > > upper real file in ovl_file struct") is the correct fix. It seems to
> > > > address this exact lifetime race by persistently pinning the
> > > > underlying file.
> > >
> > > That sounds odd.
> > > Using a persistent upper real file may be more efficient than opening
> > > a temporary file for every read, but the temporary file is a legit opened file,
> > > so it looks like you would be averting the race rather than fixing it.
> > >
> > > Could you try to analyse the conditions that caused the race?
> > >
> > > >
> > > > The Problem: We cannot apply 18e48d0e2c7b to 6.12 stable because it
> > > > depends on the extensive ovl_real_file refactoring series (removing
> > > > ovl_real_fdget family functions) that landed in 6.13.
> > > >
> > > > Is there a recommended way to backport the "persistent real file"
> > > > logic to 6.12 without pulling in the entire refactor chain?
> > > >
> > >
> > > These are the commits in overlayfs/file.c v6.12..v6.13:
> > >
> > > $ git log --oneline v6.12..v6.13 -- fs/overlayfs/file.c
> > > d66907b51ba07 ovl: convert ovl_real_fdget() callers to ovl_real_file()
> > > 4333e42ed4444 ovl: convert ovl_real_fdget_path() callers to ovl_real_file_path()
> > > 18e48d0e2c7b1 ovl: store upper real file in ovl_file struct
> > > 87a8a76c34a2a ovl: allocate a container struct ovl_file for ovl private context
> > > c2c54b5f34f63 ovl: do not open non-data lower file for fsync
> > > fc5a1d2287bf2 ovl: use wrapper ovl_revert_creds()
> > > 48b50624aec45 backing-file: clean up the API
> > >
> > > Your claim that 18e48d0e2c7b depends on ovl_real_fdget() is incorrect.
> > > You may safely cherry-pick the 4 commits above leading to 18e48d0e2c7b1.
> > > They are all self-contained changes that would be good to have in 6.12.y,
> > > because they would make cherry-picking future fixes easier.
> > >
> > > This goes especially for "backing-file: clean up the API": it is better
> > > to have the same API in upstream and stable kernels.
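> > >
> > > If I read the list right, that would be (oldest first, i.e. the
> > > reverse of the git log order above; an untested sketch):
> > >
> > > $ git cherry-pick 48b50624aec45 fc5a1d2287bf2 c2c54b5f34f63 \
> > >       87a8a76c34a2a 18e48d0e2c7b1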
> > >
> > > Thanks,
> > > Amir.
--
Jan Kara <jack@...e.com>
SUSE Labs, CR