lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 20 Jun 2009 20:49:19 -0400
From:	Theodore Tso <tytso@....edu>
To:	Eric Sandeen <sandeen@...hat.com>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: Need to potentially watch stack usage for ext4 and AIO...

On Fri, Jun 19, 2009 at 08:46:12PM -0500, Eric Sandeen wrote:
> if you got within 372 bytes on 32-bit (with 8k stacks) then that's
> indeed pretty worrisome.

Fortunately this was with a 4k stack, but it's still not a good thing;
the 8k stack also has to support interrupts / soft irq's, whereas the
CONFIG_4KSTACK has a separate interrupt stack....

Anyone have statistics on what the worst, most evil proprietary
SCSI/FC/10gigE driver might use in terms of stack space, combined with
say, the most evil proprietary multipath product at interrupt/softirq
time, by any chance?

In any case, here are two stack dumps that I captured, the first using
a 1k blocksize, and the second using a 4k blocksize (not that the
blocksize should make a huge amount of difference).  This time, I got
to within 200 bytes of disaster on the second stack dump.  Worse yet,
the stack usage bloat isn't in any one place, it seems to be finally
peanut-buttered across the call stack.

I can see some things we can do to optimize stack usage; for example,
struct ext4_allocation_request is allocated on the stack, and the
structure was laid out without any regard to space wastage caused by
alignment requirements.  That won't help on x86 at all, but it will
help substantially on x86_64 (since x86_64 requires that 8 byte
variables must be 8-byte aligned, where as x86_64 only requires 4 byte
alignment, even for unsigned long long's).  But it's going have to be
a whole series of incremental improvements; I don't see any magic
bullet solution to our stack usage.

       	   					- Ted


        Depth    Size   Location    (38 entries)
        -----    ----   --------
  0)     3064      48   kvm_mmu_write+0x5f/0x67
  1)     3016      16   kvm_set_pte+0x21/0x27
  2)     3000     208   __change_page_attr_set_clr+0x272/0x73b
  3)     2792      76   kernel_map_pages+0xd4/0x102
  4)     2716      80   get_page_from_freelist+0x2dd/0x3b5
  5)     2636     108   __alloc_pages_nodemask+0xf6/0x435
  6)     2528      16   alloc_slab_page+0x20/0x26
  7)     2512      60   __slab_alloc+0x171/0x470
  8)     2452       4   kmem_cache_alloc+0x8f/0x127
  9)     2448      68   radix_tree_preload+0x27/0x66
 10)     2380      56   cfq_set_request+0xf1/0x2b4
 11)     2324      16   elv_set_request+0x1c/0x2b
 12)     2308      44   get_request+0x1b0/0x25f
 13)     2264      60   get_request_wait+0x1d/0x135
 14)     2204      52   __make_request+0x24d/0x34e
 15)     2152      96   generic_make_request+0x28f/0x2d2
 16)     2056      56   submit_bio+0xb2/0xba
 17)     2000      20   submit_bh+0xe4/0x101
 18)     1980     196   ext4_mb_init_cache+0x221/0x8ad
 19)     1784     232   ext4_mb_regular_allocator+0x443/0xbda
 20)     1552      72   ext4_mb_new_blocks+0x1f6/0x46d
 21)     1480     220   ext4_ext_get_blocks+0xad9/0xc68
 22)     1260      68   ext4_get_blocks+0x10e/0x27e
 23)     1192     244   mpage_da_map_blocks+0xa7/0x720
 24)      948     108   ext4_da_writepages+0x27b/0x3d3
 25)      840      16   do_writepages+0x28/0x39
 26)      824      72   __writeback_single_inode+0x162/0x333
 27)      752      68   generic_sync_sb_inodes+0x2b6/0x426
 28)      684      20   writeback_inodes+0x8a/0xd1
 29)      664      96   balance_dirty_pages_ratelimited_nr+0x12d/0x237
 30)      568      92   generic_file_buffered_write+0x173/0x23e
 31)      476     124   __generic_file_aio_write_nolock+0x258/0x280
 32)      352      52   generic_file_aio_write+0x6e/0xc2
 33)      300      52   ext4_file_write+0xa8/0x12c
 34)      248      36   aio_rw_vect_retry+0x72/0x135
 35)      212      24   aio_run_iocb+0x69/0xfd
 36)      188     108   sys_io_submit+0x418/0x4dc
 37)       80      80   syscall_call+0x7/0xb

-------------

        Depth    Size   Location    (47 entries)
        -----    ----   --------
  0)     3556       8   kvm_clock_read+0x1b/0x1d
  1)     3548       8   sched_clock+0x8/0xb
  2)     3540      96   __lock_acquire+0x1c0/0xb21
  3)     3444      44   lock_acquire+0x94/0xb7
  4)     3400      16   _spin_lock_irqsave+0x37/0x6a
  5)     3384      28   clocksource_get_next+0x12/0x48
  6)     3356      96   update_wall_time+0x661/0x740
  7)     3260       8   do_timer+0x1b/0x22
  8)     3252      44   tick_do_update_jiffies64+0xed/0x127
  9)     3208      24   tick_sched_timer+0x47/0xa0
 10)     3184      40   __run_hrtimer+0x67/0x97
 11)     3144      56   hrtimer_interrupt+0xfe/0x151
 12)     3088      16   smp_apic_timer_interrupt+0x6f/0x82
 13)     3072      92   apic_timer_interrupt+0x2f/0x34
 14)     2980      48   kvm_mmu_write+0x5f/0x67
 15)     2932      16   kvm_set_pte+0x21/0x27
 16)     2916     208   __change_page_attr_set_clr+0x272/0x73b
 17)     2708      76   kernel_map_pages+0xd4/0x102
 18)     2632      32   free_hot_cold_page+0x74/0x1bc
 19)     2600      20   __pagevec_free+0x22/0x2a
 20)     2580     168   shrink_page_list+0x542/0x61a
 21)     2412     168   shrink_list+0x26a/0x50b
 22)     2244      96   shrink_zone+0x211/0x2a7
 23)     2148     116   try_to_free_pages+0x1db/0x2f3
 24)     2032      92   __alloc_pages_nodemask+0x2ab/0x435
 25)     1940      40   find_or_create_page+0x43/0x79
 26)     1900      84   __getblk+0x13a/0x2de
 27)     1816     164   ext4_ext_insert_extent+0x853/0xb56
 28)     1652     224   ext4_ext_get_blocks+0xb27/0xc68
 29)     1428      68   ext4_get_blocks+0x10e/0x27e
 30)     1360     244   mpage_da_map_blocks+0xa7/0x720
 31)     1116      32   __mpage_da_writepage+0x35/0x158
 32)     1084     132   write_cache_pages+0x1b1/0x293
 33)      952     112   ext4_da_writepages+0x262/0x3d3
 34)      840      16   do_writepages+0x28/0x39
 35)      824      72   __writeback_single_inode+0x162/0x333
 36)      752      68   generic_sync_sb_inodes+0x2b6/0x426
 37)      684      20   writeback_inodes+0x8a/0xd1
 38)      664      96   balance_dirty_pages_ratelimited_nr+0x12d/0x237
 39)      568      92   generic_file_buffered_write+0x173/0x23e
 40)      476     124   __generic_file_aio_write_nolock+0x258/0x280
 41)      352      52   generic_file_aio_write+0x6e/0xc2
 42)      300      52   ext4_file_write+0xa8/0x12c
 43)      248      36   aio_rw_vect_retry+0x72/0x135
 44)      212      24   aio_run_iocb+0x69/0xfd
 45)      188     108   sys_io_submit+0x418/0x4dc
 46)       80      80   syscall_call+0x7/0xb
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ