linux-kernel - Re: [PATCH] fs: fix schedule while atomic caused by gfp of erofs

Message-ID: <CAGWkznEpn0NNTiYL-VYohcmboQ-kTDssiGZyi84BXf5i8+KA-Q@mail.gmail.com>
Date: Tue, 16 Jul 2024 14:14:24 +0800
From: Zhaoyang Huang <huangzhaoyang@...il.com>
To: Gao Xiang <hsiangkao@...ux.alibaba.com>
Cc: "zhaoyang.huang" <zhaoyang.huang@...soc.com>, Gao Xiang <xiang@...nel.org>, 
	Chao Yu <chao@...nel.org>, Yue Hu <huyue2@...lpad.com>, 
	Jeffle Xu <jefflexu@...ux.alibaba.com>, Sandeep Dhavale <dhavale@...gle.com>, 
	linux-erofs@...ts.ozlabs.org, linux-kernel@...r.kernel.org, 
	steve.kang@...soc.com
Subject: Re: [PATCH] fs: fix schedule while atomic caused by gfp of erofs_allocpage

On Tue, Jul 16, 2024 at 1:50 PM Gao Xiang <hsiangkao@...ux.alibaba.com> wrote:
>
>
>
> On 2024/7/16 13:44, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@...soc.com>
> >
> > A "scheduling while atomic" BUG was reported, as shown below. The
> > schedule_timeout() call comes from too_many_isolated() during direct
> > reclaim. Fix this by masking __GFP_DIRECT_RECLAIM out of gfp.
> >
> > [  175.610416][  T618] BUG: scheduling while atomic: kworker/u16:6/618/0x00000000
> > [  175.643480][  T618] CPU: 2 PID: 618 Comm: kworker/u16:6 Tainted: G
> > [  175.645791][  T618] Workqueue: loop20 loop_workfn
> > [  175.646394][  T618] Call trace:
> > [  175.646785][  T618]  dump_backtrace+0xf4/0x140
> > [  175.647345][  T618]  show_stack+0x20/0x2c
> > [  175.647846][  T618]  dump_stack_lvl+0x60/0x84
> > [  175.648394][  T618]  dump_stack+0x18/0x24
> > [  175.648895][  T618]  __schedule_bug+0x64/0x90
> > [  175.649445][  T618]  __schedule+0x680/0x9b8
> > [  175.649970][  T618]  schedule+0x130/0x1b0
> > [  175.650470][  T618]  schedule_timeout+0xac/0x1d0
> > [  175.651050][  T618]  schedule_timeout_uninterruptible+0x24/0x34
> > [  175.651789][  T618]  __alloc_pages_slowpath+0x8dc/0x121c
> > [  175.652455][  T618]  __alloc_pages+0x294/0x2fc
> > [  175.653011][  T618]  erofs_allocpage+0x48/0x58
> > [  175.653572][  T618]  z_erofs_runqueue+0x314/0x8a4
> > [  175.654161][  T618]  z_erofs_readahead+0x258/0x318
> > [  175.654761][  T618]  read_pages+0x88/0x394
> > [  175.655275][  T618]  page_cache_ra_unbounded+0x1cc/0x23c
> > [  175.655939][  T618]  page_cache_ra_order+0x27c/0x33c
> > [  175.656559][  T618]  ondemand_readahead+0x224/0x334
> > [  175.657169][  T618]  page_cache_async_ra+0x60/0x9c
> > [  175.657767][  T618]  filemap_get_pages+0x19c/0x7cc
> > [  175.658367][  T618]  filemap_read+0xf0/0x484
> > [  175.658901][  T618]  generic_file_read_iter+0x4c/0x15c
> > [  175.659543][  T618]  do_iter_read+0x224/0x348
> > [  175.660100][  T618]  vfs_iter_read+0x24/0x38
> > [  175.660635][  T618]  loop_process_work+0x408/0xa68
> > [  175.661236][  T618]  loop_workfn+0x28/0x34
> > [  175.661751][  T618]  process_scheduled_works+0x254/0x4e8
> > [  175.662417][  T618]  worker_thread+0x24c/0x33c
> > [  175.662974][  T618]  kthread+0x110/0x1b8
> > [  175.663465][  T618]  ret_from_fork+0x10/0x20
> >
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@...soc.com>
>
> I don't see why it's an atomic context,
> so this patch is incorrect.
Sorry, I should have provided more details. page_cache_ra_unbounded()
calls filemap_invalidate_lock_shared(mapping) to ensure the integrity
of the page cache during readahead, which will disable preemption.
>
> Thanks,
> Gao Xiang
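
For readers following the thread, the change the patch proposes (clearing the direct-reclaim bit from the gfp mask before allocating) can be sketched in userspace C as below. This is only an illustration of the bit masking, not the patch itself, and it takes no side on whether the patch is correct. The `sk_` names and the bit values are hypothetical stand-ins; the real flags are defined in the kernel's gfp headers with different values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel gfp bits. The values are made up for
 * this sketch and do not match the real kernel definitions. */
typedef uint32_t sk_gfp_t;

#define SK_GFP_DIRECT_RECLAIM 0x1u /* caller may enter direct reclaim (may sleep) */
#define SK_GFP_KSWAPD_RECLAIM 0x2u /* caller may wake kswapd */
#define SK_GFP_IO             0x4u /* allocator may start I/O */
#define SK_GFP_FS             0x8u /* allocator may call into the filesystem */

/* Mirrors how GFP_KERNEL combines both reclaim bits with IO and FS. */
#define SK_GFP_KERNEL \
	(SK_GFP_DIRECT_RECLAIM | SK_GFP_KSWAPD_RECLAIM | SK_GFP_IO | SK_GFP_FS)

/* Clear only the direct-reclaim bit; every other bit is preserved. */
static inline sk_gfp_t sk_mask_direct_reclaim(sk_gfp_t gfp)
{
	return gfp & ~SK_GFP_DIRECT_RECLAIM;
}

/* An allocation may block only if direct reclaim is still permitted. */
static inline int sk_may_block(sk_gfp_t gfp)
{
	return (gfp & SK_GFP_DIRECT_RECLAIM) != 0;
}

/* Small self-check showing the effect of the mask. */
static inline void sk_self_check(void)
{
	sk_gfp_t gfp = sk_mask_direct_reclaim(SK_GFP_KERNEL);

	assert(sk_may_block(SK_GFP_KERNEL));
	assert(!sk_may_block(gfp));
	assert(gfp & SK_GFP_KSWAPD_RECLAIM);
}
```

With the bit cleared, an allocation in this model can still wake background reclaim but never waits in direct reclaim itself, which is the path that reaches schedule_timeout() in the trace above.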