linux-kernel - Re: [RFC PATCH 0/6] Do not call ->writepage[s] from direct reclaim and use a

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Wed, 16 Jun 2010 01:08:00 +1000
From:	Nick Piggin <npiggin@...e.de>
To:	Mel Gorman <mel@....ul.ie>
Cc:	Andrea Arcangeli <aarcange@...hat.com>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, Dave Chinner <david@...morbit.com>,
	Chris Mason <chris.mason@...cle.com>,
	Rik van Riel <riel@...hat.com>
Subject: Re: [RFC PATCH 0/6] Do not call ->writepage[s] from direct reclaim
 and use a_ops->writepages() where possible

On Tue, Jun 15, 2010 at 03:51:34PM +0100, Mel Gorman wrote:
> On Tue, Jun 15, 2010 at 04:00:11PM +0200, Andrea Arcangeli wrote:
> > When memory pressure is low, not going into ->writepage may be
> > beneficial from latency prospective too. (but again it depends how
> > much it matters to go in LRU and how beneficial is the cache, to know
> > if it's worth taking clean cache away even if hotter than dirty cache)
> > 
> > About the stack overflow did you ever got any stack-debug error?
> 
> Not an error. Got a report from Dave Chinner though and it's what kicked
> off this whole routine in the first place. I've been recording stack
> usage figures but not reporting them. In reclaim I'm getting to about 5K
> deep but this was on simple storage and XFS was ignoring attempts for
> reclaim to writeback.
> 
> http://lkml.org/lkml/2010/4/13/121
> 
> Here is one my my own stack traces though
> 
>         Depth    Size   Location    (49 entries)
>         -----    ----   --------
>   0)     5064     304   get_page_from_freelist+0x2e4/0x722
>   1)     4760     240   __alloc_pages_nodemask+0x15f/0x6a7
>   2)     4520      48   kmem_getpages+0x61/0x12c
>   3)     4472      96   cache_grow+0xca/0x272
>   4)     4376      80   cache_alloc_refill+0x1d4/0x226
>   5)     4296      64   kmem_cache_alloc+0x129/0x1bc
>   6)     4232      16   mempool_alloc_slab+0x16/0x18
>   7)     4216     144   mempool_alloc+0x56/0x104
>   8)     4072      16   scsi_sg_alloc+0x48/0x4a [scsi_mod]
>   9)     4056      96   __sg_alloc_table+0x58/0xf8
>  10)     3960      32   scsi_init_sgtable+0x37/0x8f [scsi_mod]
>  11)     3928      32   scsi_init_io+0x24/0xce [scsi_mod]
>  12)     3896      48   scsi_setup_fs_cmnd+0xbc/0xc4 [scsi_mod]
>  13)     3848     144   sd_prep_fn+0x1d3/0xc13 [sd_mod]
>  14)     3704      64   blk_peek_request+0xe2/0x1a6
>  15)     3640      96   scsi_request_fn+0x87/0x522 [scsi_mod]
>  16)     3544      32   __blk_run_queue+0x88/0x14b
>  17)     3512      48   elv_insert+0xb7/0x254
>  18)     3464      48   __elv_add_request+0x9f/0xa7
>  19)     3416     128   __make_request+0x3f4/0x476
>  20)     3288     192   generic_make_request+0x332/0x3a4
>  21)     3096      64   submit_bio+0xc4/0xcd
>  22)     3032      80   _xfs_buf_ioapply+0x222/0x252 [xfs]
>  23)     2952      48   xfs_buf_iorequest+0x84/0xa1 [xfs]
>  24)     2904      32   xlog_bdstrat+0x47/0x4d [xfs]
>  25)     2872      64   xlog_sync+0x21a/0x329 [xfs]
>  26)     2808      48   xlog_state_release_iclog+0x9b/0xa8 [xfs]
>  27)     2760     176   xlog_write+0x356/0x506 [xfs]
>  28)     2584      96   xfs_log_write+0x5a/0x86 [xfs]
>  29)     2488     368   xfs_trans_commit_iclog+0x165/0x2c3 [xfs]
>  30)     2120      80   _xfs_trans_commit+0xd8/0x20d [xfs]
>  31)     2040     240   xfs_iomap_write_allocate+0x247/0x336 [xfs]
>  32)     1800     144   xfs_iomap+0x31a/0x345 [xfs]
>  33)     1656      48   xfs_map_blocks+0x3c/0x40 [xfs]
>  34)     1608     256   xfs_page_state_convert+0x2c4/0x597 [xfs]
>  35)     1352      64   xfs_vm_writepage+0xf5/0x12f [xfs]
>  36)     1288      32   __writepage+0x17/0x34
>  37)     1256     288   write_cache_pages+0x1f3/0x2f8
>  38)      968      16   generic_writepages+0x24/0x2a
>  39)      952      64   xfs_vm_writepages+0x4f/0x5c [xfs]
>  40)      888      16   do_writepages+0x21/0x2a
>  41)      872      48   writeback_single_inode+0xd8/0x2f4
>  42)      824     112   writeback_inodes_wb+0x41a/0x51e
>  43)      712     176   wb_writeback+0x13d/0x1b7
>  44)      536     128   wb_do_writeback+0x150/0x167
>  45)      408      80   bdi_writeback_task+0x43/0x117
>  46)      328      48   bdi_start_fn+0x76/0xd5
>  47)      280      96   kthread+0x82/0x8a
>  48)      184     184   kernel_thread_helper+0x4/0x10
> 
> XFS as you can see is quite deep there. Now consider if
> get_page_from_freelist() there had entered direct reclaim and then tried
> to writeback a page. That's the problem that is being worried about.

It would be a problem because it should be !__GFP_IO at that point so
something would be seriously broken if it called ->writepage again.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/