Message-Id: <20080405011100.720014000@polaris-admin.engr.sgi.com>
Date:	Fri, 04 Apr 2008 18:11:00 -0700
From:	Mike Travis <travis@....com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org
Subject: [PATCH 00/12] cpumask: reduce stack pressure from local/passed cpumask variables v3


Modify usage of cpumask_t variables to use pointers as much as possible.

Changes are:

	* Use an allocated array of cpumask_t's for cpumask_of_cpu().
	  This removes > 20,000 bytes of stack usage (see the Stack
	  chart below for all changes) and reduces the code generated
	  for each use.

	* Use set_cpus_allowed_ptr() to pass a pointer to the "newly allowed"
	  cpumask.  This removes > 10,000 bytes of stack usage.

	* Use node_to_cpumask_ptr, which returns a pointer to the cpumask
	  for the specified node.  This removes > 10,000 bytes of stack usage.

	* Modify build_sched_domains and related sub-functions to pass
	  pointers to cpumask temp variables.  This consolidates stack
	  space that was spread over various functions.

	* Remove large array from numa_initmem_init() [> 8,000 bytes].

	* Optimize usages of {CPU,NODE}_MASK_{NONE,ALL} [> 9,000 bytes].

	* Various other changes to reduce stack size and silence checkpatch
	  warnings [> 7,000 bytes].
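
As a rough illustration of why the pointer-based changes above save stack,
here is a minimal userspace sketch.  It is not code from the patch series:
demo_cpumask_t and all of the demo_* helpers are hypothetical stand-ins, and
only NR_CPUS=4096 is taken from the measurements below.  The by-value helper
has to build and copy a whole mask through the stack, while the by-pointer
helper hands out the address of an entry in a table that is allocated once.

```c
#include <limits.h>
#include <stdlib.h>

#define NR_CPUS       4096  /* same value as the config measured below */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS    ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical userspace stand-in for the kernel's cpumask_t. */
typedef struct { unsigned long bits[MASK_LONGS]; } demo_cpumask_t;

/* By value: a full mask lives in this frame and is copied to the caller. */
static demo_cpumask_t demo_mask_of_cpu_byval(unsigned int cpu)
{
	demo_cpumask_t m = { { 0 } };

	m.bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
	return m;
}

/* By pointer: one table, allocated once, with a single bit set per entry. */
static demo_cpumask_t *demo_mask_table;

static int demo_mask_table_init(void)
{
	demo_mask_table = calloc(NR_CPUS, sizeof(*demo_mask_table));
	if (!demo_mask_table)
		return -1;
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		demo_mask_table[cpu].bits[cpu / BITS_PER_LONG] =
			1UL << (cpu % BITS_PER_LONG);
	return 0;
}

/* Callers keep only a pointer on their stack, never a full mask. */
static const demo_cpumask_t *demo_mask_of_cpu_ptr(unsigned int cpu)
{
	return &demo_mask_table[cpu];
}

/* Consumers take a const pointer, so nothing is copied at the call site. */
static int demo_test_cpu(unsigned int cpu, const demo_cpumask_t *m)
{
	return (int)((m->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL);
}
```

A caller would run demo_mask_table_init() once and then pass
demo_mask_of_cpu_ptr(cpu) wherever a mask argument is expected.  The trade-off
in this sketch is memory for stack: the table holds one full mask per possible
CPU.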

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   x86/latest          .../x86/linux-2.6-x86.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Cc: Cliff Wickman <cpw@....com>
Cc: Dave Jones <davej@...emonkey.org.uk>
Cc: David S. Miller <davem@...emloft.net>
Cc: Greg Banks <gnb@...bourne.sgi.com>
Cc: Greg Kroah-Hartman <gregkh@...e.de>
Cc: H. Peter Anvin <hpa@...or.com>
Cc: Ingo Molnar <mingo@...e.hu>
Cc: Len Brown <len.brown@...el.com>
Cc: Paul Jackson <pj@....com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: William L. Irwin <wli@...omorphy.com>

Signed-off-by: Mike Travis <travis@....com>
---
v3: rebased on x86/latest + sched-devel/latest
    collapsed many patches so the same files don't appear in different
    patches (kernel/sched.c unfortunately still appears in about 5
    or 6 patches, and include/asm-x86/topology.h is in 2.)

v2: resubmitted based on x86/latest.
--- ---------------------------------------------------------
* Memory Usage Changes

Patch list summary of various memory usage changes using the akpm2
config file with NR_CPUS=4096 and MAX_NUMNODES=512.
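
For reading the charts below, it may help to know how large a single mask is
with this config.  The following is a small sketch, assuming only that masks
are bitmaps rounded up to whole unsigned longs; the DEMO_* names and
demo_bitmap_bytes() are made up for illustration.  Many of the per-function
stack deltas in the Stack chart are close to multiples of the cpumask size
computed here.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_NR_CPUS      4096  /* from the config described above */
#define DEMO_MAX_NUMNODES 512   /* from the config described above */

/* Bytes occupied by a bitmap of nbits, rounded up to whole unsigned longs. */
static size_t demo_bitmap_bytes(size_t nbits)
{
	size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;

	return (nbits + bits_per_long - 1) / bits_per_long
		* sizeof(unsigned long);
}
```

With these values, demo_bitmap_bytes(DEMO_NR_CPUS) evaluates to 512 and
demo_bitmap_bytes(DEMO_MAX_NUMNODES) to 64, so each cpumask kept on the stack
costs 512 bytes.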

====== Data (-l 500)
... files 13 vars 1855 all 0 lim 500 unch 0

    1 - initial
    2 - cpumask_of_cpu
    3 - add-CPUMASK_ALL_PTR
    5 - nr_cpus-in-kernel_sched
    7 - generic-set_cpus_allowed
   12 - kernel_sched_c
   13 - use-CPUMASK_ALL_PTR

    .1.   .2.   .3.     .5.   .7.  .12.  .13.    ..final..
  32768     .     .  -32768     .     .     .    .  -100%  sched_group_nodes_bycpu(.bss)
  32768     .     .  -32768     .     .     .    .  -100%  init_sched_rt_entity_p(.bss)
  32768     .     .  -32768     .     .     .    .  -100%  init_rt_rq_p(.bss)
   3550     .     .      +9     .  -842     . 2717   -23%  build_sched_domains(.text)
    674   -95     .       .  -579     .     .    .  -100%  acpi_processor_set_throttling(.text)
    533  -533     .       .     .     .     .    .  -100%  hpet_enable(.init.text)
    512     .     .       .     .     .  -512    .  -100%  C(.rodata)
      0     .  +512       .     .     .     .  512      .  cpu_mask_all(.data.read_mostly)
 103573  -628  +512  -98295  -579  -842  -512 3229   -96%  Totals

====== Sections (-l 500)
... files 13 vars 37 all 0 lim 500 unch 0

    1 - initial
    2 - cpumask_of_cpu
    3 - add-CPUMASK_ALL_PTR
    5 - nr_cpus-in-kernel_sched
    6 - x86-set_cpus_allowed
    7 - generic-set_cpus_allowed
    8 - cpuset_cpus_allowed
    9 - cpumask_affinity
   10 - numa_initmem_init
   11 - node_to_cpumask_ptr
   12 - kernel_sched_c
   13 - use-CPUMASK_ALL_PTR

       .1.     .2.   .3.     .5.    .6.    .7.   .8.   .9.   .10.  .11.   .12.   .13.    ..final..
  75833950  -13010  +260  -98051  -2972  -2295  +644  +122  +8174  -465   -698  +1349 75727008    <1%  Total
  42810838   -7428   +39    +225   -757   -431  +215  +182     -1  -366   +945   +591 42804052    <1%  .debug_info
   6808830    -404  -211    -116   -316   -243   +42  +113    -22   +95  +1950    +14  6809732    <1%  .debug_loc
   4805176     +16  +512       .      .      .     .     .      .     .      .      .  4805704    <1%  .data.read_mostly
   3475017    -528   -16     +48   -128   -496  -256  -112      .  -496    -32    -80  3472921    <1%  .text
   2720422    -895   -16     +26   -192   -146   -23    -4     -1  -164    +26    +81  2719114    <1%  .debug_line
   1775040       .     .  -98176      .      .     .     .      .     .      .      .  1676864    -5%  .bss
   1395188    -245    -9       .   -131    -95   -19   -17      .     .    +14   +103  1394789    <1%  .debug_abbrev
   1141392   -3104   -64       .  -1408   -848  +752   -16      .  +480   +208   +640  1138032    <1%  .debug_ranges
   1021159     +32  -512       .      .  -1152  -640     .      .     .  -1377   -512  1016998    <1%  .rodata
    982688       .     .       .      .      .     .     .  +8208     .      .      .   990896    <1%  .init.data
      8080     -80  +464       .      .  +1152  +640     .      .     .  -2720   +512     8048    <1%  __param
 142777780  -25646  +447 -196044  -5904  -4554 +1355  +268 +16358  -916  -1684  +2698 142564158    +0%  Totals

====== Text/Data ()
... files 13 vars 6 all 0 lim 0 unch 0

    1 - initial
    2 - cpumask_of_cpu
    5 - nr_cpus-in-kernel_sched
    7 - generic-set_cpus_allowed
   10 - numa_initmem_init
   12 - kernel_sched_c

       .1.    .2.     .5.    .7.   .10.   .12.    ..final..
   3475456  -2048       .      .      .      . 3473408    <1%  TextSize
   1738752  +2048       .  -4096      .  -4096 1732608    <1%  DataSize
   1775616      .  -98304      .      .      . 1677312    -5%  BssSize
   1220608      .       .      .  +8192      . 1228800    <1%  InitSize
  10399744      .       .  -4096      .  +4096 10399744      .  OtherSize
  18610176      .  -98304  -8192  +8192      . 18511872    +0%  Totals

====== PerCPU ()
... files 13 vars 10 all 0 lim 0 unch 0

    1 - initial

  .1.    ..final..
   0 .   +0%  Totals

====== Stack (-l 500)
... files 13 vars 166 all 0 lim 500 unch 0

    1 - initial
    2 - cpumask_of_cpu
    3 - add-CPUMASK_ALL_PTR
    5 - nr_cpus-in-kernel_sched
    6 - x86-set_cpus_allowed
    7 - generic-set_cpus_allowed
    8 - cpuset_cpus_allowed
    9 - cpumask_affinity
   10 - numa_initmem_init
   11 - node_to_cpumask_ptr
   12 - kernel_sched_c
   13 - use-CPUMASK_ALL_PTR

    .1.    .2.    .3.    .5.   .6.    .7.   .8.    .9.   .10.   .11.   .12.  .13.    ..final..
  11080      .      .      .     .      .     .      .      .   -512  -8336     . 2232   -79%  build_sched_domains
   8248      .      .      .     .      .     .      .  -8248      .      .     .    .  -100%  numa_initmem_init
   4648      .      .  -3840     .      .     .      .      .      .   -808     .    .  -100%  cpu_attach_domain
   3176      .    -16    +16     .      .     .      .      .  -1552  -1024     .  600   -81%  sched_domain_node_span
   3176   -512      .      .  -512      .     .      .      .      .      .     . 2152   -32%  centrino_target
   2584  -1024      .      .     .   -512     .      .      .      .      .     . 1048   -59%  acpi_processor_set_throttling
   2104      .      .      .     .  -1024     .      .      .      .      .     . 1080   -48%  _cpu_down
   2088  -1024      .      .  -512      .     .      .      .      .      .     .  552   -73%  powernowk8_cpu_init
   2072   -512      .      .     .      .     .      .      .      .      .     . 1560   -24%  tick_notify
   2056      .      .      .     .      .     .  -2056      .      .      .     .    .  -100%  affinity_set
   1784  -1024      .      .     .      .     .      .      .      .      .     .  760   -57%  cpufreq_add_dev
   1704      .      .      .     .      .     .      .      .  -1704      .     .    .  -100%  kswapd
   1608   -512      .      .  -512      .     .      .      .      .      .     .  584   -63%  powernowk8_target
   1608  -1608      .      .     .      .     .      .      .      .      .     .    .  -100%  disable_smp
   1608   -512      .      .  -512      .     .      .      .      .      .     .  584   -63%  cache_add_dev
   1592      .      .      .     .      .     .      .      .      .  -1592     .    .  -100%  do_tune_cpucache
   1576      .      .      .     .      .     .      .      .      .  -1576     .    .  -100%  init_sched_build_groups
   1560      .      .      .     .  -1040     .      .      .      .      .     .  520   -66%  pci_device_probe
   1560   -512      .      .  -512      .     .      .      .      .      .     .  536   -65%  check_supported_cpu
   1544      .      .      .     .      .  -512   +512      .      .   -512     . 1032   -33%  sched_setaffinity
   1544   -512      .      .  -520      .     .      .      .      .      .     .  512   -66%  powernowk8_get
   1544  -1008      .      .     .      .     .      .      .      .      .     .  536   -65%  alloc_ldt
   1536   -504      .      .     .      .     .      .      .      .      .     . 1032   -32%  smp_call_function_single
   1536  -1024      .      .     .      .     .      .      .      .      .     .  512   -66%  native_smp_send_reschedule
   1536   -512      .      .  -504      .     .      .      .      .      .     .  520   -66%  get_cur_freq
   1536   -512      .      .     .   -512     .      .      .      .      .     .  512   -66%  acpi_processor_get_throttling
   1536   -512      .      .  -504      .     .      .      .      .      .     .  520   -66%  acpi_processor_ffh_cstate_probe
   1176      .      .      .     .      .     .      .      .      .   -512     .  664   -43%  thread_return
   1176      .      .      .     .      .     .      .      .      .   -512     .  664   -43%  schedule
   1160      .      .      .     .      .     .      .      .      .   -512     .  648   -44%  run_rebalance_domains
   1160      .      .      .     .      .     .      .      .  -1160      .     .    .  -100%  __build_all_zonelists
   1152      .      .      .     .      .  -504      .      .      .      .     .  648   -43%  cpuset_attach
   1072      .      .      .     .      .  -512      .      .      .      .     .  560   -47%  pdflush
   1064      .      .      .     .      .     .      .      .  -1064      .     .    .  -100%  cpuup_canceled
   1064      .      .      .     .      .     .      .      .      .  -1064     .    .  -100%  cpuup_callback
   1032  -1032      .      .     .      .     .      .      .      .      .     .    .  -100%  uv_target_cpus
   1032      .      .      .     .   -520     .      .      .      .      .     .  512   -50%  system_kthread_notifier
   1032  -1032      .      .     .      .     .      .      .      .      .     .    .  -100%  setup_pit_timer
   1032      .      .      .     .      .     .      .      .      .   -512     .  520   -49%  sched_init_smp
   1032      .      .      .     .      .     .      .      .      .      .  -520  512   -50%  physflat_vector_allocation_domain
   1032      .  -1032      .     .      .     .      .      .      .      .     .    .  -100%  kernel_init
   1032  -1032      .      .     .      .     .      .      .      .      .     .    .  -100%  init_workqueues
   1032  -1032      .      .     .      .     .      .      .      .      .     .    .  -100%  init_idle
   1032      .      .      .     .      .     .      .      .      .      .  -512  520   -49%  destroy_irq
   1032      .      .      .     .  -1032     .      .      .      .      .     .    .  -100%  ____call_usermodehelper
   1024      .      .      .     .      .     .   -512      .      .      .     .  512   -50%  sys_sched_setaffinity
   1024   -504      .      .     .   -520     .      .      .      .      .     .    .  -100%  stopmachine
   1024  -1024      .      .     .      .     .      .      .      .      .     .    .  -100%  setup_APIC_timer
   1024  -1024      .      .     .      .     .      .      .      .      .     .    .  -100%  native_smp_prepare_cpus
   1024   -504      .      .  -520      .     .      .      .      .      .     .    .  -100%  native_machine_shutdown
   1024      .      .      .     .  -1024     .      .      .      .      .     .    .  -100%  kthreadd
   1024  -1024      .      .     .      .     .      .      .      .      .     .    .  -100%  kthread_bind
   1024  -1024      .      .     .      .     .      .      .      .      .     .    .  -100%  hpet_enable
   1024      .      .      .     .      .     .   -512      .      .      .     .  512   -50%  compat_sys_sched_setaffinity
   1024      .      .      .     .      .     .      .      .      .      .  -512  512   -50%  __percpu_populate_mask
    576      .      .      .     .      .  -576      .      .      .      .     .    .  -100%  cpuset_init
    576      .      .      .     .      .  -576      .      .      .      .     .    .  -100%  cpuset_create
    552      .      .      .     .      .     .      .      .      .   -552     .    .  -100%  migration_call
    520      .      .      .     .      .     .      .      .   -520      .     .    .  -100%  node_read_cpumap
    520      .      .      .     .      .     .      .      .      .      .  -520    .  -100%  dynamic_irq_init
    520      .      .      .     .      .  -520      .      .      .      .     .    .  -100%  cpuset_cpus_allowed
    520      .      .      .     .      .  -520      .      .      .      .     .    .  -100%  cpuset_change_cpumask
    520      .      .      .     .      .     .      .      .      .   -520     .    .  -100%  cpu_to_phys_group
    520      .      .      .     .      .     .      .      .      .   -520     .    .  -100%  cpu_to_core_group
    520      .      .      .     .      .     .   -520      .      .      .     .    .  -100%  affinity_restore
    512      .      .      .     .      .  -512      .      .      .      .     .    .  -100%  cpuset_cpus_allowed_locked
      0      .      .      .     .      .     .      .      .      .   +760     .  760      .  sd_init_SIBLING
      0      .      .      .     .      .     .      .      .      .   +760     .  760      .  sd_init_NODE
      0      .      .      .     .      .     .      .      .      .   +752     .  752      .  sd_init_MC
      0      .      .      .     .      .     .      .      .      .   +752     .  752      .  sd_init_CPU
      0      .      .      .     .      .     .      .      .      .   +752     .  752      .  sd_init_ALLNODES
      0      .      .      .     .      .     .      .      .      .   +512     .  512      .  detach_destroy_domains
 103584 -21056  -1048  -3824 -4608  -6184 -4232  -3088  -8248  -6512 -14264 -2064 28456   -72%  Totals
