Message-ID: <CAOQ4uxhhj4k3pVv_AzgNeO1x2uiZKLXdhvXMykM5H-JkgLqC1Q@mail.gmail.com>
Date: Thu, 15 Jan 2026 11:29:03 +0100
From: Amir Goldstein <amir73il@...il.com>
To: Chenglong Tang <chenglongtang@...gle.com>
Cc: viro@...iv.linux.org.uk, brauner@...nel.org, Jan Kara <jack@...e.cz>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
miklos@...redi.hu
Subject: Re: [Regression 6.12] NULL pointer dereference in submit_bio_noacct
via backing_file_read_iter
On Thu, Jan 15, 2026 at 2:04 AM Chenglong Tang <chenglongtang@...gle.com> wrote:
>
> Hi Amir,
>
> Thanks for the suggestion. I followed your advice and cherry-picked
> the 4 recommended commits (plus the backing-file cleanup and a fix for
> it)
Yes, good catch.
> onto 6.12.
>
> However, the system now panics immediately during boot with a NULL
> pointer dereference.
>
> The commit chain applied:
>
> ovl: allocate a container struct ovl_file for ovl private context (87a8a76c34a2)
> ovl: store upper real file in ovl_file struct (18e48d0e2c7b)
> ovl: do not open non-data lower file for fsync (c2c54b5f34f6)
> ovl: use wrapper ovl_revert_creds() (fc5a1d2287bf)
> backing-file: clean up the API (48b50624aec4)
> fs/backing_file: fix wrong argument in callback (2957fa4931a3)
Strange to list the commits out of cherry-pick order.
When you send them to the stable list, please send them in the correct order.
>
> The Crash: The panic occurs in backing_file_read_iter because it
> receives a NULL file pointer from ovl_read_iter.
>
> [ 7.443266] #PF: error_code(0x0000) - not-present page
> [ 7.444208] PGD 0 P4D 0
> [ 7.445270] Oops: Oops: 0000 [#1] SMP PTI
> [ 7.446175] CPU: 0 UID: 0 PID: 423 Comm: sudo Tainted: G
> O 6.12.55+ #1
> [ 7.447669] Tainted: [O]=OOT_MODULE
> [ 7.448330] Hardware name: Google Google Compute Engine/Google
> Compute Engine, BIOS Google 10/25/2025
> [ 7.449825] RIP: 0010:backing_file_read_iter+0x1a/0x250
> [ 7.450810] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> f3 0f 1e fa 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48
> 83 ec 10 <8b> 47 0c a9 00 00 00 02 0f 84 d9 01 00 00 49 89 f6 48 83 7e
> 18 00
> [ 7.453754] RSP: 0018:ffff9e95407b7db0 EFLAGS: 00010282
> [ 7.454694] RAX: 0000000000000000 RBX: ffff9e95407b7e78 RCX: 0000000000000000
> [ 7.455892] RDX: ffff9e95407b7e78 RSI: ffff9e95407b7e50 RDI: 0000000000000000
> [ 7.457158] RBP: ffff9e95407b7de8 R08: ffff9e95407b7df8 R09: 0000000000000001
> [ 7.458331] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [ 7.459593] R13: 0000000000001000 R14: ffff9e95407b7e50 R15: 0000000000000000
> [ 7.460968] FS: 00007a330957cb80(0000) GS:ffff9cb0ac000000(0000)
> knlGS:0000000000000000
> [ 7.463015] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 7.464268] CR2: 000000000000000c CR3: 000000010bfc0003 CR4: 00000000003706f0
> [ 7.465453] Call Trace:
> [ 7.465994] <TASK>
> [ 7.466487] ovl_read_iter+0x9a/0xe0
> [ 7.467424] ? __pfx_ovl_file_accessed+0x10/0x10
> [ 7.468353] vfs_read+0x2b1/0x300
> [ 7.469137] ksys_read+0x75/0xe0
> [ 7.469894] do_syscall_64+0x61/0x130
> [ 7.470603] ? clear_bhb_loop+0x40/0x90
> [ 7.471381] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 7.472486] RIP: 0033:0x7a330967221d
>
> It appears ovl: store upper real file in ovl_file struct introduces a
> bug when backported to 6.12. In ovl_real_fdget_path, the code
> initializes real->word = 0. If ovl_change_flags is called and
> succeeds, it returns 0 immediately. However, because of the early
> return, real->word is never assigned the realfile pointer (which
> happens at the bottom of the function). The caller sees success but
> gets a NULL file pointer.
>
Correct analysis.
There was a mid-series regression, but it never shipped
in any kernel release.
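
For readers following along, the early-return pattern described in the quoted analysis can be sketched in plain userspace C. This is only an illustrative model, not the overlayfs source: struct file_model, struct fd_model, change_flags_model() and both real_fdget_*() helpers are hypothetical stand-ins that mirror the control flow described above.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's types. */
struct file_model { int flags; };
struct fd_model { struct file_model *file; };  /* models the result slot */

/* Models a flag update that succeeds and returns 0. */
static int change_flags_model(struct file_model *f, int flags)
{
    f->flags = flags;
    return 0;
}

/* Buggy shape: returning the helper's result directly means a
 * successful flag change skips the assignment at the bottom. */
static int real_fdget_buggy(struct file_model *realfile, int want_flags,
                            struct fd_model *real)
{
    real->file = NULL;
    if (realfile->flags != want_flags)
        return change_flags_model(realfile, want_flags);
    real->file = realfile;
    return 0;
}

/* Corrected shape: return early only when the helper fails. */
static int real_fdget_fixed(struct file_model *realfile, int want_flags,
                            struct fd_model *real)
{
    int err;

    real->file = NULL;
    if (realfile->flags != want_flags) {
        err = change_flags_model(realfile, want_flags);
        if (err)
            return err;
    }
    real->file = realfile;
    return 0;
}
```

In the buggy variant the caller sees a return value of 0 together with a NULL file, which matches the RDI of 0 in the oops above.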
> I wonder is there an upstream commit that corrects this logic, or does
> this dependency chain require the larger ovl_real_file refactor from
> 6.13 to work correctly?
The upstream commit that fixes the mid-series regression is
4333e42ed4444 ovl: convert ovl_real_fdget_path() callers to ovl_real_file_path()
Applying the entire refactoring is not strictly required to fix the problem, but
the refactoring is the correct logical cleanup following 18e48d0e2c7b,
so I think it is better to include the two refactoring patches in the backport
series rather than diverge from upstream with a custom stable kernel fix.
Please include these two patches in the backports set:
d66907b51ba07 ovl: convert ovl_real_fdget() callers to ovl_real_file()
4333e42ed4444 ovl: convert ovl_real_fdget_path() callers to ovl_real_file_path()
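
As a rough illustration of why the conversion closes this class of bug, here is a userspace sketch of a helper that returns the file pointer itself, or an encoded error, instead of filling an out-parameter. It assumes the refactored helper follows the usual pointer-or-error convention. All names (file_model, err_ptr_model(), is_err_model(), real_file_model()) are hypothetical and only imitate that convention; this is not the upstream code.

```c
#include <stdint.h>
#include <errno.h>

/* Hypothetical stand-in for struct file; NOT the kernel's type. */
struct file_model { int flags; };

/* Userspace imitation of the ERR_PTR()/IS_ERR() convention. */
static void *err_ptr_model(long err)
{
    return (void *)(intptr_t)err;
}

static int is_err_model(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-4095;
}

/* Models a flag update; rejects negative flags with -EINVAL. */
static int change_flags_model(struct file_model *f, int flags)
{
    if (flags < 0)
        return -EINVAL;
    f->flags = flags;
    return 0;
}

/* The result and the status travel in one value, so "success with
 * no file" cannot be expressed by an early return. */
static struct file_model *real_file_model(struct file_model *realfile,
                                          int want_flags)
{
    if (realfile->flags != want_flags) {
        int err = change_flags_model(realfile, want_flags);

        if (err)
            return err_ptr_model(err);
    }
    return realfile;
}
```

A caller would check is_err_model() on the result before using it, so there is no separate status code that can disagree with the pointer.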
Please send the entire backport set as a patch series to the stable maintainers,
or let me know if you want me to do that after you have tested the backports.
Thanks,
Amir.