Message-ID: <20180120020237.GM13338@ZenIV.linux.org.uk>
Date:   Sat, 20 Jan 2018 02:02:37 +0000
From:   Al Viro <viro@...IV.linux.org.uk>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Matthew Wilcox <willy@...radead.org>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Tony Luck <tony.luck@...el.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Michal Hocko <mhocko@...nel.org>,
        "hillf.zj" <hillf.zj@...baba-inc.com>,
        Hugh Dickins <hughd@...gle.com>,
        Oleg Nesterov <oleg@...hat.com>,
        Rik van Riel <riel@...hat.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Ingo Molnar <mingo@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        the arch/x86 maintainers <x86@...nel.org>
Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.

On Fri, Jan 19, 2018 at 02:53:25PM -0800, Linus Torvalds wrote:

> It would probably be good to add the size too, just to explain why
> it's potentially expensive.
> 
> That said, apparently we do have hundreds of them, with just
> cpufreq_frequency_table having a ton. Maybe some are hidden in macros
> and removing one removes a lot.

cpufreq_table_find_index_...(), mostly.

> The real problem is that sometimes the subtraction is simply the right
> thing to do, and there's no sane way to say "yeah, this is one of
> those cases you shouldn't warn about".

FWIW, the sizes of the most common ones are
     91 sizeof struct cpufreq_frequency_table = 12
Almost all of those come from
	cpufreq_for_each_valid_entry(pos, table)
		if (....)
			return pos - table;
and I wonder if we would be better off with something like
#define cpufreq_for_each_valid_entry(pos, table, idx)                   		\
        for (pos = table, idx = 0; pos->frequency != CPUFREQ_TABLE_END; pos++, idx++)   \
                if (pos->frequency == CPUFREQ_ENTRY_INVALID)            \
                        continue;                                       \
                else
so that those loops would become
	cpufreq_for_each_valid_entry(pos, table, idx)
		if (....)
			return idx;
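FWIW, the cost of the original variant is in the subtraction itself: for
a 12-byte element the byte difference gets exact-divided by 12, and since
that's not a power of two the compiler multiplies by a reciprocal instead
of shifting.  A standalone sketch (made-up struct of the same size, not
the real declaration):

	/* same size as struct cpufreq_frequency_table: 12 bytes */
	struct entry {
		unsigned int flags;
		unsigned int frequency;
		unsigned int driver_data;
	};

	long index_of(const struct entry *pos, const struct entry *table)
	{
		/* 12 == 4 * 3: gcc on x86-64 typically shifts the byte
		 * difference right by 2, then multiplies by
		 * 0xaaaaaaaaaaaaaaab, the inverse of 3 modulo 2^64 */
		return pos - table;
	}

whereas idx comes back in a register that was kept up to date for free.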
     36 sizeof struct Indirect = 24
     21 sizeof struct ips_scb = 216
     18 sizeof struct runlist_element = 24
     13 sizeof struct zone = 1728
Some are from
#define zone_idx(zone)          ((zone) - (zone)->zone_pgdat->node_zones)
but there's
static inline int zone_id(const struct zone *zone)
{ 
        struct pglist_data *pgdat = zone->zone_pgdat;

        return zone - pgdat->node_zones;
}
and a couple of places where we have
        for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
Those bloody well ought to be
	for (zone = node_zones, end = node_zones + MAX_NR_ZONES; zone < end; zone++) {
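Spelled out (just a sketch; do_something() stands in for whatever the
loop body does):

	static void walk_zones(struct pglist_data *pgdat)
	{
		struct zone *node_zones = pgdat->node_zones;
		struct zone *zone, *end;

		/* one pointer comparison per iteration, instead of a
		 * subtraction hiding an exact division by
		 * sizeof(struct zone) == 1728 */
		for (zone = node_zones, end = node_zones + MAX_NR_ZONES;
		     zone < end; zone++)
			do_something(zone);
	}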
     13 sizeof struct vring = 40
     11 sizeof struct usbhsh_device = 24
     10 sizeof struct xpc_partition = 888
      9 sizeof struct skge_element = 40
      9 sizeof struct lock_class = 400
      9 sizeof struct hstate = 29872
That little horror comes from get_hstate_idx() and hstate_index().  The code
generated for the division is
        movabsq $-5542915600080909725, %rax
        sarq    $4, %rdi
        imulq   %rdi, %rax
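i.e. the usual exact-division trick: 29872 == 16 * 1867, so shift the
byte difference right by 4, then multiply by the inverse of 1867 modulo
2^64.  The source is the standard idiom (sketch, from memory of
include/linux/hugetlb.h):

	static inline int hstate_index(struct hstate *h)
	{
		return h - hstates;	/* exact divide by 29872 */
	}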
      7 sizeof struct nvme_rdma_queue = 312
      7 sizeof struct iso_context = 208
      6 sizeof struct i915_power_well = 48
      6 sizeof struct hpet_dev = 168
      6 sizeof struct ext4_extent = 12
      6 sizeof struct esas2r_target = 120
      5 sizeof struct iio_chan_spec = 152
      5 sizeof struct hwspinlock = 96
      4 sizeof struct myri10ge_slice_state = 704
      4 sizeof struct ext4_extent_idx = 12
Another interesting-looking one is struct vhost_net_virtqueue (18080 bytes).

Note that those sizes are rather sensitive to lockdep, spinlock debugging, etc.
