lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4A42513C.6020607@redhat.com>
Date:	Wed, 24 Jun 2009 11:15:56 -0500
From:	Eric Sandeen <sandeen@...hat.com>
To:	Theodore Tso <tytso@....edu>
CC:	linux-ext4@...r.kernel.org
Subject: Re: Need to potentially watch stack usage for ext4 and AIO...

Theodore Tso wrote:
> On Fri, Jun 19, 2009 at 08:46:12PM -0500, Eric Sandeen wrote:
>> if you got within 372 bytes on 32-bit (with 8k stacks) then that's
>> indeed pretty worrisome.
> 
> Fortunately this was with a 4k stack, but it's still not a good thing;
> the 8k stack also has to support interrupts / soft irq's, whereas the
> CONFIG_4KSTACK has a separate interrupt stack....

Hm I thought we had irq stacks for both but I guess not.

FWIW F11 has gone to 8k stacks ( \o/ )

> Anyone have statistics on what the worst, most evil proprietary
> SCSI/FC/10gigE driver might use in terms of stack space, combined with
> say, the most evil proprietary multipath product at interrupt/softirq
> time, by any chance?

well I'm inclined to ignore the proprietary stuff TBH, we can't control it.

> In any case, here are two stack dumps that I captured, the first using
> a 1k blocksize, and the second using a 4k blocksize (not that the
> blocksize should make a huge amount of difference).  This time, I got
> to within 200 bytes of disaster on the second stack dump.  Worse yet,
> the stack usage bloat isn't in any one place, it seems to be finally
> peanut-buttered across the call stack.
> 
> I can see some things we can do to optimize stack usage; for example,
> struct ext4_allocation_request is allocated on the stack, and the
> structure was laid out without any regard to space wastage caused by
> alignment requirements.  That won't help on x86 at all, but it will
> help substantially on x86_64 (since x86_64 requires that 8 byte
> variables must be 8-byte aligned, where as x86_64 only requires 4 byte
> alignment, even for unsigned long long's).  But it's going have to be
> a whole series of incremental improvements; I don't see any magic
> bullet solution to our stack usage.

XFS forces gcc to not inline any static function; it's extreme, but
maybe it'd help here too.

>        	   					- Ted
> 
> 
>         Depth    Size   Location    (38 entries)
>         -----    ----   --------
>   0)     3064      48   kvm_mmu_write+0x5f/0x67
>   1)     3016      16   kvm_set_pte+0x21/0x27
>   2)     3000     208   __change_page_attr_set_clr+0x272/0x73b

This looks like a victim of inlining

>   3)     2792      76   kernel_map_pages+0xd4/0x102
>   4)     2716      80   get_page_from_freelist+0x2dd/0x3b5
>   5)     2636     108   __alloc_pages_nodemask+0xf6/0x435
>   6)     2528      16   alloc_slab_page+0x20/0x26
>   7)     2512      60   __slab_alloc+0x171/0x470
>   8)     2452       4   kmem_cache_alloc+0x8f/0x127
>   9)     2448      68   radix_tree_preload+0x27/0x66
>  10)     2380      56   cfq_set_request+0xf1/0x2b4
>  11)     2324      16   elv_set_request+0x1c/0x2b
>  12)     2308      44   get_request+0x1b0/0x25f
>  13)     2264      60   get_request_wait+0x1d/0x135
>  14)     2204      52   __make_request+0x24d/0x34e
>  15)     2152      96   generic_make_request+0x28f/0x2d2
>  16)     2056      56   submit_bio+0xb2/0xba
>  17)     2000      20   submit_bh+0xe4/0x101
>  18)     1980     196   ext4_mb_init_cache+0x221/0x8ad

nothing obvious here, maybe inlining

>  19)     1784     232   ext4_mb_regular_allocator+0x443/0xbda

ditto

>  20)     1552      72   ext4_mb_new_blocks+0x1f6/0x46d
>  21)     1480     220   ext4_ext_get_blocks+0xad9/0xc68

ext4_allocation_request is largeish & holey as you said:

struct ext4_allocation_request {
	struct inode *             inode;              /*     0     8 */
	ext4_lblk_t                logical;            /*     8     4 */

	/* XXX 4 bytes hole, try to pack */

	ext4_fsblk_t               goal;               /*    16     8 */
	ext4_lblk_t                lleft;              /*    24     4 */

	/* XXX 4 bytes hole, try to pack */

	ext4_fsblk_t               pleft;              /*    32     8 */
	ext4_lblk_t                lright;             /*    40     4 */

	/* XXX 4 bytes hole, try to pack */

	ext4_fsblk_t               pright;             /*    48     8 */
	unsigned int               len;                /*    56     4 */
	unsigned int               flags;              /*    60     4 */
	/* --- cacheline 1 boundary (64 bytes) --- */

	/* size: 64, cachelines: 1, members: 9 */
	/* sum members: 52, holes: 3, sum holes: 12 */
};



>  22)     1260      68   ext4_get_blocks+0x10e/0x27e
>  23)     1192     244   mpage_da_map_blocks+0xa7/0x720

struct buffer_head new on the stack hurts here, it's 104 bytes.

I really dislike the whole abuse of buffer heads as handy containers for
mapping, we don't need all these fields, I think, but that's a battle
for another day.

>  24)      948     108   ext4_da_writepages+0x27b/0x3d3
>  25)      840      16   do_writepages+0x28/0x39
>  26)      824      72   __writeback_single_inode+0x162/0x333
>  27)      752      68   generic_sync_sb_inodes+0x2b6/0x426
>  28)      684      20   writeback_inodes+0x8a/0xd1
>  29)      664      96   balance_dirty_pages_ratelimited_nr+0x12d/0x237
>  30)      568      92   generic_file_buffered_write+0x173/0x23e
>  31)      476     124   __generic_file_aio_write_nolock+0x258/0x280
>  32)      352      52   generic_file_aio_write+0x6e/0xc2
>  33)      300      52   ext4_file_write+0xa8/0x12c
>  34)      248      36   aio_rw_vect_retry+0x72/0x135
>  35)      212      24   aio_run_iocb+0x69/0xfd
>  36)      188     108   sys_io_submit+0x418/0x4dc
>  37)       80      80   syscall_call+0x7/0xb

<snip>

-Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ