Message-ID: <CAOdxtTZv_B_pE1d1vgaE8+ar58y7pTiw0bL-djB1rhE-5wu2zQ@mail.gmail.com>
Date: Thu, 15 Jan 2026 21:56:06 -0800
From: Chenglong Tang <chenglongtang@...gle.com>
To: Amir Goldstein <amir73il@...il.com>
Cc: viro@...iv.linux.org.uk, brauner@...nel.org, Jan Kara <jack@...e.cz>, 
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, 
	miklos@...redi.hu
Subject: Re: [Regression 6.12] NULL pointer dereference in submit_bio_noacct
 via backing_file_read_iter

[Follow-up] We have an important update on the submit_bio_noacct
panic we reported earlier.

To rule out the Integrity Measurement Architecture (IMA) as the root
cause, we disabled IMA verification in the workload configuration. The
kernel panic persisted with the exact same signature (RIP:
0010:submit_bio_noacct+0x21d), but the trigger path has changed.

New stack traces (non-IMA): we are now seeing the crash via two
standard filesystem paths.

Stack trace 1 (page-fault read path; most failures look like this):

[  158.519909] BUG: kernel NULL pointer dereference, address: 0000000000000156
[  158.542610] #PF: supervisor read access in kernel mode
[  158.565011] #PF: error_code(0x0000) - not-present page
[  158.583855] PGD 800000007c7da067 P4D 800000007c7da067 PUD 7c7db067 PMD 0
[  158.590940] Oops: Oops: 0000 [#1] SMP PTI
[  158.598950] CPU: 1 UID: 0 PID: 6717 Comm: agent_launcher Tainted: G           O       6.12.55+ #1
[  158.629624] Tainted: [O]=OOT_MODULE
[  158.639965] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[  158.684210] RIP: 0010:submit_bio_noacct+0x21d/0x470
[  158.705662] Code: 8b 73 48 4d 85 f6 74 55 4c 63 25 46 af 89 01 49 83 fc 06 0f 83 44 02 00 00 4f 8b a4 e6 d0 00 00 00 83 3d 99 ca 7d 01 00 7e 3f <43> 80 bc 3c 56 01 00 00 00 0f 84 28 01 00 00 48 89 df e8 fc 9f 02
[  158.765443] RSP: 0000:ffffa74c84d53a98 EFLAGS: 00010202
[  158.771022] RAX: ffffa319b3d6b4f0 RBX: ffffa319bdc9a3c0 RCX: 00000000005e1070
[  158.778730] RDX: 0000000010300001 RSI: ffffa319b3d6b4f0 RDI: ffffa319bdc9a3c0
[  158.802189] RBP: ffffa74c84d53ac8 R08: 0000000000001000 R09: ffffa319bdc9a3c0
[  158.846780] R10: 0000000000000000 R11: 0000000069a1b000 R12: 0000000000000000
[  158.877737] R13: ffffa31941421f40 R14: ffffa31955419200 R15: 0000000000000000
[  158.908715] FS:  00000000059efe28(0000) GS:ffffa319bdd00000(0000) knlGS:0000000000000000
[  158.937522] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  158.958522] CR2: 0000000000000156 CR3: 000000006a20a003 CR4: 00000000003726f0
[  158.968648] Call Trace:
[  158.974419]  <TASK>
[  158.978222]  ext4_mpage_readpages+0x75c/0x790
[  158.983568]  read_pages+0x9d/0x250
[  158.987263]  page_cache_ra_unbounded+0xa2/0x1c0
[  158.992179]  filemap_fault+0x218/0x660
[  158.996311]  __do_fault+0x4b/0x140
[  159.000143]  do_pte_missing+0x14f/0x1050
[  159.018505]  handle_mm_fault+0x886/0xb40
[  159.063653]  do_user_addr_fault+0x1eb/0x730
[  159.094465]  exc_page_fault+0x80/0x100
[  159.116472]  asm_exc_page_fault+0x26/0x30

Stack trace 2 (jbd2 journal commit path):
[  163.902122] BUG: kernel NULL pointer dereference, address: 0000000000000157
[  163.955031] #PF: supervisor read access in kernel mode
[  163.986899] #PF: error_code(0x0000) - not-present page
[  164.075132] PGD 0 P4D 0
[  164.085940] Oops: Oops: 0000 [#1] SMP PTI
[  164.090592] CPU: 0 UID: 0 PID: 399 Comm: jbd2/nvme0n1p1- Tainted: G           O       6.12.55+ #1
[  164.146188] Tainted: [O]=OOT_MODULE
[  164.172362] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[  164.243113] RIP: 0010:submit_bio_noacct+0x21d/0x470
[  164.276230] Code: 8b 73 48 4d 85 f6 74 55 4c 63 25 46 af 89 01 49 83 fc 06 0f 83 44 02 00 00 4f 8b a4 e6 d0 00 00 00 83 3d 99 ca 7d 01 00 7e 3f <43> 80 bc 3c 56 01 00 00 00 0f 84 28 01 00 00 48 89 df e8 fc 9f 02
[  164.413258] RSP: 0000:ffffa674004ebc80 EFLAGS: 00010202
[  164.420124] RAX: ffff9381c25d4790 RBX: ffff9381d0e5e540 RCX: 00000000000301c8
[  164.464474] RDX: 0000000010300001 RSI: ffff9381c25d4790 RDI: ffff9381d0e5e540
[  164.542751] RBP: ffffa674004ebcb0 R08: 0000000000000000 R09: 0000000000000000
[  164.578174] R10: 0000000000000000 R11: ffffffff8433e7a0 R12: 0000000000000000
[  164.595801] R13: ffff9381c1425780 R14: ffff9381c196d400 R15: 0000000000000001
[  164.626548] FS:  0000000000000000(0000) GS:ffff93823dc00000(0000) knlGS:0000000000000000
[  164.665104] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  164.757565] CR2: 0000000000000157 CR3: 000000007c678003 CR4: 00000000003726f0
[  164.831021] Call Trace:
[  164.851014]  <TASK>
[  164.872000]  jbd2_journal_commit_transaction+0x612/0x17e0
[  164.914012]  ? sched_clock+0xd/0x20
[  164.963930]  ? _raw_spin_unlock_irqrestore+0x12/0x30
[  164.989978]  ? __try_to_del_timer_sync+0x122/0x160
[  165.029451]  kjournald2+0xb1/0x220
[  165.033558]  ? __pfx_autoremove_wake_function+0x10/0x10
[  165.044022]  kthread+0x122/0x140
[  165.048012]  ? __pfx_kjournald2+0x10/0x10
[  165.052944]  ? __pfx_kthread+0x10/0x10
[  165.057597]  ret_from_fork+0x3f/0x50
[  165.062127]  ? __pfx_kthread+0x10/0x10
[  165.079674]  ret_from_fork_asm+0x1a/0x30
[  165.113023]  </TASK>
[  165.131001] Modules linked in: nft_chain_nat xt_MASQUERADE nf_nat xt_addrtype nft_compat nf_tables kvm_intel kvm irqbypass crc32c_intel aesni_intel crypto_simd cryptd loadpin_trigger(O) fuse
[  165.269971] CR2: 0000000000000157
[  165.306980] ---[ end trace 0000000000000000 ]---
[  165.361889] RIP: 0010:submit_bio_noacct+0x21d/0x470
[  165.406957] Code: 8b 73 48 4d 85 f6 74 55 4c 63 25 46 af 89 01 49 83 fc 06 0f 83 44 02 00 00 4f 8b a4 e6 d0 00 00 00 83 3d 99 ca 7d 01 00 7e 3f <43> 80 bc 3c 56 01 00 00 00 0f 84 28 01 00 00 48 89 df e8 fc 9f 02
[  165.558880] RSP: 0000:ffffa674004ebc80 EFLAGS: 00010202
[  165.575239] RAX: ffff9381c25d4790 RBX: ffff9381d0e5e540 RCX: 00000000000301c8
[  165.590012] RDX: 0000000010300001 RSI: ffff9381c25d4790 RDI: ffff9381d0e5e540
[  165.597793] RBP: ffffa674004ebcb0 R08: 0000000000000000 R09: 0000000000000000
[  165.608408] R10: 0000000000000000 R11: ffffffff8433e7a0 R12: 0000000000000000
[  165.616602] R13: ffff9381c1425780 R14: ffff9381c196d400 R15: 0000000000000001
[  165.631823] FS:  0000000000000000(0000) GS:ffff93823dc00000(0000) knlGS:0000000000000000
[  165.653088] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  165.668488] CR2: 0000000000000157 CR3: 000000007c678003 CR4: 00000000003726f0
[  165.686744] Kernel panic - not syncing: Fatal exception

This confirms the issue is not specific to IMA; it points to a race
condition in the block I/O layer or the ext4 subsystem under high
concurrency.

Since the crash occurs at the exact same instruction offset in
submit_bio_noacct regardless of the caller (IMA, page fault, or jbd2),
we suspect a bio- or request_queue-related structure is being
corrupted or left NULL, possibly in the underlying block device driver
(NVMe) or device mapper.
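
For reference, here is our decode of the Code: bytes around the fault
(a mechanical disassembly of the dump above; worth double-checking
against the exact 6.12.55 vmlinux):

  8b 73 48                    mov    0x48(%rbx),%esi
  4d 85 f6                    test   %r14,%r14
  74 55                       je     +0x55
  4c 63 25 46 af 89 01        movslq 0x189af46(%rip),%r12    # index read from a global
  49 83 fc 06                 cmp    $0x6,%r12               # bounds check against 6
  0f 83 44 02 00 00           jae    +0x244
  4f 8b a4 e6 d0 00 00 00     mov    0xd0(%r14,%r12,8),%r12  # pointer from a table in *r14
  83 3d 99 ca 7d 01 00        cmpl   $0x0,0x17dca99(%rip)    # some global >0 check
  7e 3f                       jle    +0x3f
  43 80 bc 3c 56 01 00 00 00  cmpb   $0x0,0x156(%r12,%r15,1) # <-- faulting instruction

In both traces R12 (the pointer loaded from the table) is 0, while R15
is 0 on the read path and 1 on the jbd2 (write) path, matching CR2 =
0x156 and 0x157 exactly. The shape of this code (index from a global,
bound of 6, pointer table at +0xd0, flag byte indexed by what looks
like the bio data direction) would be consistent with an inlined blkcg
policy-data lookup (BLKCG_MAX_POLS is 6 on these kernels), i.e.
bio->bi_blkg being valid while its per-policy data pointer is NULL.
That last interpretation is our speculation and still needs to be
confirmed against the actual build.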

Best,

Chenglong

On Thu, Jan 15, 2026 at 6:56 PM Chenglong Tang <chenglongtang@...gle.com> wrote:
>
> Hi Amir,
>
> Thanks for the guidance. Using the specific order of the 8 commits
> (applying the ovl_real_fdget refactors before the fix consumers)
> resolved the boot-time NULL pointer panic. The system now boots
> successfully.
>
> However, we are still hitting the original kernel panic during runtime
> tests (specifically a CloudSQL workload).
>
> Current Commit Chain (Applied to 6.12):
>
> 76d83345a056 (HEAD -> main-R125-cos-6.12) ovl: convert
> ovl_real_fdget() callers to ovl_real_file()
> 740bdf920b15 ovl: convert ovl_real_fdget_path() callers to ovl_real_file_path()
> 100b71ecb237 fs/backing_file: fix wrong argument in callback
> b877bca6858d ovl: store upper real file in ovl_file struct
> 595aac630596 ovl: allocate a container struct ovl_file for ovl private context
> 218ec543008d ovl: do not open non-data lower file for fsync
> 6def078942e2 ovl: use wrapper ovl_revert_creds()
> fe73aad71936 backing-file: clean up the API
>
> So none of these 8 commits fixed the problem. Let me recap what is
> going on:
>
> We are reporting a rare but persistent kernel panic (~0.02% failure
> rate) during container initialization on Linux 6.12.55+ (x86_64);
> 6.6.x kernels are not affected. The panic is a NULL pointer
> dereference in submit_bio_noacct, triggered when the Integrity
> Measurement Architecture (IMA) calculates a file hash during a runc
> create operation.
>
> We have isolated the crash to a specific container (ncsa) starting up
> during a high-concurrency boot sequence.
>
> Environment
> * Kernel: Linux 6.12.55+ (x86_64) / Container-Optimized OS
> * Workload: Cloud SQL instance initialization (heavy concurrent runc
> operations managed by systemd).
> * Filesystem: Ext4 backed by NVMe.
> * Security: AppArmor enabled, IMA (Integrity Measurement Architecture) active.
>
> The failure pattern (identical in every crash instance):
> * systemd initiates the startup of the ncsainit container.
> * runc executes the create command:
>
>     runc --root /var/lib/cloudsql/runc/root create --bundle
>     /var/lib/cloudsql/runc/bundles/ncsa ...
>
> Immediately after this command is logged, the kernel panics.
>
> Stack trace:
> [  186.938290] BUG: kernel NULL pointer dereference, address: 0000000000000156
> [  186.952203] #PF: supervisor read access in kernel mode
> [  186.995248] Oops: Oops: 0000 [#1] SMP PTI
> [  187.035946] CPU: 1 UID: 0 PID: 6764 Comm: runc:[2:INIT] Tainted: G
>          O       6.12.55+ #1
> [  187.081681] RIP: 0010:submit_bio_noacct+0x21d/0x470
> [  187.412981] Call Trace:
> [  187.415751]  <TASK>
> [  187.418141]  ext4_mpage_readpages+0x75c/0x790
> [  187.429011]  read_pages+0x9d/0x250
> [  187.450963]  page_cache_ra_unbounded+0xa2/0x1c0
> [  187.466083]  filemap_get_pages+0x231/0x7a0
> [  187.474687]  filemap_read+0xf6/0x440
> [  187.532345]  integrity_kernel_read+0x34/0x60
> [  187.560740]  ima_calc_file_hash+0x1c1/0x9b0
> [  187.608175]  ima_collect_measurement+0x1b6/0x310
> [  187.613102]  process_measurement+0x4ea/0x850
> [  187.617788]  ima_bprm_check+0x5b/0xc0
> [  187.635403]  bprm_execve+0x203/0x560
> [  187.645058]  do_execveat_common+0x2fb/0x360
> [  187.649730]  __x64_sys_execve+0x3e/0x50
>
> Panic Analysis: The stack trace indicates a race condition where
> ima_bprm_check (triggered by executing the container binary) attempts
> to verify the file. This calls ima_calc_file_hash ->
> ext4_mpage_readpages, which submits a bio to the block layer.
>
> The crash occurs in submit_bio_noacct when it attempts to dereference
> a member of the bio structure (likely bio->bi_bdev or the request
> queue), suggesting the underlying device or queue structure is either
> uninitialized or has been torn down while the IMA check was still in
> flight.
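>
> A small tracing probe could confirm which pointer is NULL before the
> crash. A rough sketch (assumes bpftrace on a BTF-enabled kernel;
> checking bi_bdev here is only our guess at the NULL member):
>
>     bpftrace -e 'kprobe:submit_bio_noacct {
>         $bio = (struct bio *)arg0;
>         if ($bio->bi_bdev == 0) {
>             printf("bio with NULL bi_bdev: comm=%s pid=%d\n", comm, pid);
>         }
>     }'
>
> If this fires shortly before a panic, it would support the teardown
> theory; if it never fires, the NULL is further down (e.g. in the
> request queue or blkcg data).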
>
> Context on Concurrency: This workload involves systemd starting
> multiple sidecar containers (logging, monitoring, coroner, etc.)
> simultaneously. We suspect this high-concurrency startup creates the
> IO/CPU contention required to hit this race window. However, the crash
> consistently happens only on the ncsa container, implying something
> specific about its launch configuration or timing makes it the
> reliable victim.
>
> Best,
>
> Chenglong
>
> On Wed, Jan 14, 2026 at 3:11 AM Amir Goldstein <amir73il@...il.com> wrote:
> >
> > On Wed, Jan 14, 2026 at 1:53 AM Chenglong Tang <chenglongtang@...gle.com> wrote:
> > >
> > > Hi OverlayFS Maintainers,
> > >
> > > This is from Container Optimized OS in Google Cloud.
> > >
> > > We are reporting a reproducible kernel panic on Kernel 6.12 involving
> > > a NULL pointer dereference in submit_bio_noacct.
> > >
> > > The Issue: The panic occurs intermittently (approx. 5 failures in 1000
> > > runs) during a specific PostgreSQL client test
> > > (postgres_client_test_postgres15_ctrdncsa) on Google
> > > Container-Optimized OS. The stack trace shows the crash happens when
> > > IMA (ima_calc_file_hash) attempts to read a file from OverlayFS via
> > > the new-in-6.12 backing_file_read_iter helper.
> > >
> > > It appears to be a race condition where the underlying block device is
> > > detached (becoming NULL) while the backing_file wrapper is still
> > > attempting to submit a read bio during container teardown.
> > >
> > > Stack Trace:
> > > [   75.793015] BUG: kernel NULL pointer dereference, address: 0000000000000156
> > > [   75.822539] #PF: supervisor read access in kernel mode
> > > [   75.849332] #PF: error_code(0x0000) - not-present page
> > > [   75.862775] PGD 7d012067 P4D 7d012067 PUD 7d013067 PMD 0
> > > [   75.884283] Oops: Oops: 0000 [#1] SMP NOPTI
> > > [   75.902274] CPU: 1 UID: 0 PID: 6476 Comm: helmd Tainted: G           O       6.12.55+ #1
> > > [   75.928903] Tainted: [O]=OOT_MODULE
> > > [   75.942484] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > [   75.965868] RIP: 0010:submit_bio_noacct+0x21d/0x470
> > > [   75.978340] Code: 8b 73 48 4d 85 f6 74 55 4c 63 25 b6 ad 89 01 49 83 fc 06 0f 83 44 02 00 00 4f 8b a4 e6 d0 00 00 00 83 3d 09 c9 7d 01 00 7e 3f <43> 80 bc 3c 56 01 00 00 00 0f 84 28 01 00 00 48 89 df e8 4c a0 02
> > > [   76.035847] RSP: 0018:ffffa41183463880 EFLAGS: 00010202
> > > [   76.050141] RAX: ffff9d4ec1a81a78 RBX: ffff9d4f3811e6c0 RCX: 00000000009410a0
> > > [   76.065176] RDX: 0000000010300001 RSI: ffff9d4ec1a81a78 RDI: ffff9d4f3811e6c0
> > > [   76.089292] RBP: ffffa411834638b0 R08: 0000000000001000 R09: ffff9d4f3811e6c0
> > > [   76.110878] R10: 2000000000000000 R11: ffffffff8a33e700 R12: 0000000000000000
> > > [   76.139068] R13: ffff9d4ec1422bc0 R14: ffff9d4ec2507000 R15: 0000000000000000
> > > [   76.168391] FS:  0000000008df7f40(0000) GS:ffff9d4f3dd00000(0000) knlGS:0000000000000000
> > > [   76.179024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [   76.184951] CR2: 0000000000000156 CR3: 000000007d01c006 CR4: 0000000000370ef0
> > > [   76.192352] Call Trace:
> > > [   76.194981]  <TASK>
> > > [   76.197257]  ext4_mpage_readpages+0x75c/0x790
> > > [   76.201794]  read_pages+0xa0/0x250
> > > [   76.205373]  page_cache_ra_unbounded+0xa2/0x1c0
> > > [   76.232608]  filemap_get_pages+0x16b/0x7a0
> > > [   76.254151]  ? srso_alias_return_thunk+0x5/0xfbef5
> > > [   76.260523]  filemap_read+0xf6/0x440
> > > [   76.264540]  do_iter_readv_writev+0x17e/0x1c0
> > > [   76.275427]  vfs_iter_read+0x8a/0x140
> > > [   76.279272]  backing_file_read_iter+0x155/0x250
> > > [   76.284425]  ovl_read_iter+0xd7/0x120
> > > [   76.288270]  ? __pfx_ovl_file_accessed+0x10/0x10
> > > [   76.293069]  vfs_read+0x2b1/0x300
> > > [   76.296835]  ksys_read+0x75/0xe0
> > > [   76.300246]  do_syscall_64+0x61/0x130
> > > [   76.304173]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > >
> > > Our Findings:
> > >
> > > Not an Ext4 regression: We verified that reverting "ext4: reduce stack
> > > usage in ext4_mpage_readpages()" does not resolve the panic.
> > >
> > > Suspected Fix: We suspect upstream commit 18e48d0e2c7b ("ovl: store
> > > upper real file in ovl_file struct") is the correct fix. It seems to
> > > address this exact lifetime race by persistently pinning the
> > > underlying file.
> >
> > That sounds odd.
> > Using a persistent upper real file may be more efficient than opening
> > a temporary file for every read, but the temporary file is a legit opened file,
> > so it looks like you would be averting the race rather than fixing it.
> >
> > Could you try to analyse the conditions that caused the race?
> >
> > >
> > > The Problem: We cannot apply 18e48d0e2c7b to 6.12 stable because it
> > > depends on the extensive ovl_real_file refactoring series (removing
> > > ovl_real_fdget family functions) that landed in 6.13.
> > >
> > > Is there a recommended way to backport the "persistent real file"
> > > logic to 6.12 without pulling in the entire refactor chain?
> > >
> >
> > These are the commits in overlayfs/file.c v6.12..v6.13:
> >
> > $ git log --oneline  v6.12..v6.13 -- fs/overlayfs/file.c
> > d66907b51ba07 ovl: convert ovl_real_fdget() callers to ovl_real_file()
> > 4333e42ed4444 ovl: convert ovl_real_fdget_path() callers to ovl_real_file_path()
> > 18e48d0e2c7b1 ovl: store upper real file in ovl_file struct
> > 87a8a76c34a2a ovl: allocate a container struct ovl_file for ovl private context
> > c2c54b5f34f63 ovl: do not open non-data lower file for fsync
> > fc5a1d2287bf2 ovl: use wrapper ovl_revert_creds()
> > 48b50624aec45 backing-file: clean up the API
> >
> > Your claim that 18e48d0e2c7b depends on ovl_real_fdget() is incorrect.
> > You may safely cherry-pick the 4 commits above leading to 18e48d0e2c7b1.
> > They are all self contained changes that would be good to have in 6.12.y,
> > because they would make cherry-picking future fixes easier.
> >
> > Specifically regarding "backing-file: clean up the API": it is better to
> > have the same API in upstream and stable kernels.
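> >
> > For example, applying all five oldest-first (a sketch; hashes as
> > listed above, and conflicts may still need hand-resolution):
> >
> >   $ git cherry-pick 48b50624aec45 fc5a1d2287bf2 c2c54b5f34f63 \
> >         87a8a76c34a2a 18e48d0e2c7b1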
> >
> > Thanks,
> > Amir.
