linux-kernel - Re: [PATCH v2 3/3] percpu: improve allocation success rate for non-GFP

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAAeU0aORY=N0e0gMKu-CBAEF=HLuHUNV6KWy27th1rwuPMcTMg@mail.gmail.com>
Date:   Mon, 27 Feb 2017 12:27:08 -0800
From:   Tahsin Erdogan <tahsin@...gle.com>
To:     Tejun Heo <tj@...nel.org>
Cc:     Michal Hocko <mhocko@...nel.org>, Christoph Lameter <cl@...ux.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Chris Wilson <chris@...is-wilson.co.uk>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Roman Pen <r.peniaev@...il.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        zijun_hu <zijun_hu@....com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        David Rientjes <rientjes@...gle.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 3/3] percpu: improve allocation success rate for
 non-GFP_KERNEL callers

Hi Tejun,

On Mon, Feb 27, 2017 at 11:51 AM, Tejun Heo <tj@...nel.org> wrote:
>> __vmalloc+0x45/0x50
>> pcpu_mem_zalloc+0x50/0x80
>> pcpu_populate_chunk+0x3b/0x380
>> pcpu_alloc+0x588/0x6e0
>> __alloc_percpu_gfp+0xd/0x10
>> __percpu_counter_init+0x55/0xc0
>> blkg_alloc+0x76/0x230
>> blkg_create+0x489/0x670
>> blkg_lookup_create+0x9a/0x230
>> generic_make_request_checks+0x7dd/0x890
>> generic_make_request+0x1f/0x180
>> submit_bio+0x61/0x120
>
> As indicated by GFP_NOWAIT | __GFP_NOWARN, it's okay to fail there.
> It's not okay to fail consistently for a long time but it's not a big
> issue to fail occassionally even if somewhat bunched up.  The only bad
> side effect of that is temporary misaccounting of some IOs, which
> shouldn't be noticeable outside of pathological cases.  If you're
> actually seeing adverse effects of this, I'd love to learn about it.

A better example is the call path below:

pcpu_alloc+0x68f/0x710
__alloc_percpu_gfp+0xd/0x10
__percpu_counter_init+0x55/0xc0
cfq_pd_alloc+0x3b2/0x4e0
blkg_alloc+0x187/0x230
blkg_create+0x489/0x670
blkg_lookup_create+0x9a/0x230
blkg_conf_prep+0x1fb/0x240
__cfqg_set_weight_device.isra.105+0x5c/0x180
cfq_set_weight_on_dfl+0x69/0xc0
cgroup_file_write+0x39/0x1c0
kernfs_fop_write+0x13f/0x1d0
__vfs_write+0x23/0x120
vfs_write+0xc2/0x1f0
SyS_write+0x44/0xb0
entry_SYSCALL_64_fastpath+0x18/0xad

A failure in this call path gives grief to tools which are trying to
configure io
weights. We see occasional failures happen here shortly after reboots even
when system is not under any memory pressure. Machines with a lot of cpus
are obviously more vulnerable.