Message-ID: <4BBCAB57.3000106@humyo.com>
Date:	Wed, 07 Apr 2010 16:57:11 +0100
From:	John Berthels <john@...yo.com>
To:	Dave Chinner <david@...morbit.com>
CC:	linux-kernel@...r.kernel.org, Nick Gregory <nick@...yo.com>,
	Rob Sanderson <rob@...yo.com>, xfs@....sgi.com
Subject: Re: PROBLEM + POSS FIX: kernel stack overflow, xfs, many disks, heavy
 write load, 8k stack, x86-64

Dave Chinner wrote:
> I'm not seeing stacks deeper than about 5.6k on XFS under heavy write
> loads. That's nowhere near blowing an 8k stack, so there must be
> something special about what you are doing. Can you post the stack
> traces that are being generated for the deepest stack generated -
> /sys/kernel/debug/tracing/stack_trace should contain it.
>   
Appended below. That doesn't seem to reach 8192, but the box it's from
has logged:

[74649.579386] apache2 used greatest stack depth: 7024 bytes left

full dmesg (gzipped) attached.
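
For reference, the trace appended at the bottom comes straight out of the
stack tracer. A rough sketch of how it can be pulled (a sketch only, not
our exact tooling; it assumes debugfs is mounted at /sys/kernel/debug and
the kernel has CONFIG_STACK_TRACER=y):

        # Rough sketch: enable the stack tracer, then dump the worst-case
        # stack usage and the matching backtrace. Assumes debugfs is
        # mounted at /sys/kernel/debug and CONFIG_STACK_TRACER=y.
        TRACING = "/sys/kernel/debug/tracing"

        # Turn the tracer on (also reachable as the
        # kernel.stack_tracer_enabled sysctl).
        with open("/proc/sys/kernel/stack_tracer_enabled", "w") as f:
            f.write("1\n")

        # Deepest stack usage recorded so far, in bytes.
        with open(TRACING + "/stack_max_size") as f:
            print("max stack usage:", f.read().strip(), "bytes")

        # The backtrace for that deepest stack (the table appended below).
        with open(TRACING + "/stack_trace") as f:
            print(f.read())
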
> What is generating the write load?
>   

WebDAV PUTs in a modified mogilefs cluster, running apache-mpm-worker
(threaded) as the DAV server. The write load is a mix of
internet-upload-speed writers trickling files up and some fast local
replicators copying from elsewhere in the cluster. The mpm_worker
config is:

        ServerLimit 20
        StartServers 5
        MaxClients 300
        MinSpareThreads 25
        MaxSpareThreads 75
        ThreadsPerChild 30
        MaxRequestsPerChild 0
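
For flavour, each writer is basically one long-running HTTP PUT against
the DAV server. Something like the sketch below approximates a single
slow uploader (the host name and path are hypothetical, and this is not
our actual client code):

        # Hypothetical sketch of one slow WebDAV uploader: PUT a local
        # file to the DAV server, trickling the body up in small chunks.
        # The host name and URL path are made up for illustration.
        import http.client, os, time

        path = "payload.bin"
        conn = http.client.HTTPConnection("dav-node.example.internal", 80)
        conn.putrequest("PUT", "/dev3/0/000/123/0000123456.fid")
        conn.putheader("Content-Length", str(os.path.getsize(path)))
        conn.endheaders()
        with open(path, "rb") as f:
            while True:
                chunk = f.read(64 * 1024)
                if not chunk:
                    break
                conn.send(chunk)   # trickle the body up slowly,
                time.sleep(0.05)   # like an internet-speed uploader
        resp = conn.getresponse()
        print(resp.status, resp.reason)

The fast local replicators would be the same shape, just without the
throttling.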

File sizes are a mix of small to large (4GB+). Each disk is exported as 
a mogile device, so it's possible for mogile to pound a single disk with 
lots of write activity (if the random number generator decides to put 
lots of files on that device at the same time).

We're also seeing occasional slowdowns and high load average (up to
~300, i.e. MaxClients) with a corresponding number of threads in D
state. (This slowdown and high load average seem to correlate with what
would previously have caused a panic under THREAD_ORDER 1, but we're
not 100% sure.)

As you can see from the dmesg, this trips the "task xxx blocked for
more than 120 seconds" warning on some of the threads.

We don't know whether that's related to the stack issue or is simply to
be expected under the load.
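
If it would help, next time the load average spikes we can dump the
stacks of the blocked (D state) tasks into dmesg. A minimal sketch,
assuming the kernel.sysrq sysctl allows it and we run it as root:

        # Minimal sketch: ask the kernel to log the stacks of all
        # uninterruptible (D state) tasks via magic sysrq 'w'.
        # Assumes kernel.sysrq permits it and we are running as root.
        with open("/proc/sysrq-trigger", "w") as f:
            f.write("w")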


jb

        Depth    Size   Location    (47 entries)
        -----    ----   --------
  0)     7568      16   mempool_alloc_slab+0x16/0x20
  1)     7552     144   mempool_alloc+0x65/0x140
  2)     7408      96   get_request+0x124/0x370
  3)     7312     144   get_request_wait+0x29/0x1b0
  4)     7168      96   __make_request+0x9b/0x490
  5)     7072     208   generic_make_request+0x3df/0x4d0
  6)     6864      80   submit_bio+0x7c/0x100
  7)     6784      96   _xfs_buf_ioapply+0x128/0x2c0 [xfs]
  8)     6688      48   xfs_buf_iorequest+0x75/0xd0 [xfs]
  9)     6640      32   _xfs_buf_read+0x36/0x70 [xfs]
 10)     6608      48   xfs_buf_read+0xda/0x110 [xfs]
 11)     6560      80   xfs_trans_read_buf+0x2a7/0x410 [xfs]
 12)     6480      80   xfs_btree_read_buf_block+0x5d/0xb0 [xfs]
 13)     6400      80   xfs_btree_lookup_get_block+0x84/0xf0 [xfs]
 14)     6320     176   xfs_btree_lookup+0xd7/0x490 [xfs]
 15)     6144      16   xfs_alloc_lookup_eq+0x19/0x20 [xfs]
 16)     6128      96   xfs_alloc_fixup_trees+0xee/0x350 [xfs]
 17)     6032     144   xfs_alloc_ag_vextent_near+0x916/0xb30 [xfs]
 18)     5888      32   xfs_alloc_ag_vextent+0xe5/0x140 [xfs]
 19)     5856      96   xfs_alloc_vextent+0x49f/0x630 [xfs]
 20)     5760     160   xfs_bmbt_alloc_block+0xbe/0x1d0 [xfs]
 21)     5600     208   xfs_btree_split+0xb3/0x6a0 [xfs]
 22)     5392      96   xfs_btree_make_block_unfull+0x151/0x190 [xfs]
 23)     5296     224   xfs_btree_insrec+0x39c/0x5b0 [xfs]
 24)     5072     128   xfs_btree_insert+0x86/0x180 [xfs]
 25)     4944     352   xfs_bmap_add_extent_delay_real+0x41e/0x1660 [xfs]
 26)     4592     208   xfs_bmap_add_extent+0x41c/0x450 [xfs]
 27)     4384     448   xfs_bmapi+0x982/0x1200 [xfs]
 28)     3936     256   xfs_iomap_write_allocate+0x248/0x3c0 [xfs]
 29)     3680     208   xfs_iomap+0x3d8/0x410 [xfs]
 30)     3472      32   xfs_map_blocks+0x2c/0x30 [xfs]
 31)     3440     256   xfs_page_state_convert+0x443/0x730 [xfs]
 32)     3184      64   xfs_vm_writepage+0xab/0x160 [xfs]
 33)     3120     384   shrink_page_list+0x65e/0x840
 34)     2736     528   shrink_zone+0x63f/0xe10
 35)     2208     112   do_try_to_free_pages+0xc2/0x3c0
 36)     2096     128   try_to_free_pages+0x77/0x80
 37)     1968     240   __alloc_pages_nodemask+0x3e4/0x710
 38)     1728      48   alloc_pages_current+0x8c/0xe0
 39)     1680      16   __get_free_pages+0xe/0x50
 40)     1664      48   __pollwait+0xca/0x110
 41)     1616      32   unix_poll+0x28/0xc0
 42)     1584      16   sock_poll+0x1d/0x20
 43)     1568     912   do_select+0x3d6/0x700
 44)      656     416   core_sys_select+0x18c/0x2c0
 45)      240     112   sys_select+0x4f/0x110
 46)      128     128   system_call_fastpath+0x16/0x1b


Download attachment "dmesg.txt.gz" of type "application/x-gzip" (18745 bytes)
