Message-ID: <20170525085827.GH12721@dhcp22.suse.cz>
Date:   Thu, 25 May 2017 10:58:28 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Mikulas Patocka <mpatocka@...hat.com>
Cc:     David Rientjes <rientjes@...gle.com>,
        Mike Snitzer <snitzer@...hat.com>,
        Junaid Shahid <junaids@...gle.com>,
        Alasdair Kergon <agk@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        andreslc@...gle.com, gthelen@...gle.com, vbabka@...e.cz,
        linux-kernel@...r.kernel.org
Subject: Re: dm ioctl: Restore __GFP_HIGH in copy_params()

On Tue 23-05-17 12:44:18, Mikulas Patocka wrote:
> 
> 
> On Tue, 23 May 2017, Michal Hocko wrote:
> 
> > On Mon 22-05-17 13:35:41, David Rientjes wrote:
> > > On Mon, 22 May 2017, Mike Snitzer wrote:
> > [...]
> > > > While adding the __GFP_NOFAIL flag would serve to document expectations
> > > > I'm left unconvinced that the memory allocator will _not fail_ for an
> > > > order-0 page -- as Mikulas said most ioctls don't need more than 4K.
> > > 
> > > __GFP_NOFAIL would make no sense in kvmalloc() calls, ever, it would never 
> > > fallback to vmalloc :)
> > 
> > Sorry, I could have been more specific. You would have to opencode
> > kvmalloc obviously. It is documented to not support this flag for the
> > reasons you have mentioned above.
> > 
> > > I'm hoping this can get merged during the 4.12 window to fix the broken 
> > > commit d224e9381897.
> > 
> > I obviously disagree. Relying on memory reserves for _correctness_ is
> > clearly broken by design, full stop. But it is dm code and it is the
> > responsibility of the respective maintainers to support this code.
> 
> Block loop device is broken in the same way - it converts block requests 
> to filesystem reads and writes and those FS reads and writes allocate 
> memory.

I do not see how those would depend on __GFP_HIGH. Also, writes are
throttled, so memory shouldn't fill up with dirty pages.

> Network block device needs an userspace daemon to perform I/O.

Which makes it pretty much unreliable when it comes to guaranteeing
forward progress. AFAIR swap over NBD gets access to the full memory
reserves to overcome this, but that is merely an exception.

> iSCSI also needs to allocate memory to perform I/O.

Shouldn't it use mempools? I am sorry but I am not familiar with this
area at all.
 
> NFS and other networking filesystems are also broken in the same way (they 
> need to receive a packet to acknowledge a write and packet reception needs 
> to allocate memory).
> 
> So - what should these *broken* drivers do to reduce the possibility of 
> the deadlock?

The IO path has traditionally used mempools to guarantee forward
progress. If this is not an option, then the choices are not all that
great. We throttle memory writers (or drop packets when memory is too
low), and finally we have the OOM killer to free up some memory.
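
To illustrate the mempool idea for readers not familiar with it, here
is a minimal userspace sketch. It is not the kernel mempool API: every
toy_* name below is hypothetical and made up for this example. The
point it tries to show is that a fixed number of elements is set aside
up front, the regular allocator is tried first, and the reserve is only
consumed when that fails. Freed elements refill the reserve before
anything goes back to the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model of a mempool; NOT the kernel API. */
typedef void *(*toy_alloc_fn)(size_t size);

struct toy_mempool {
	size_t elem_size;	/* size of each element */
	size_t min_nr;		/* elements guaranteed by the reserve */
	size_t curr_nr;		/* elements currently held in the reserve */
	void **reserve;		/* stack of preallocated elements */
};

/* Stand-in for an allocator that fails under memory pressure. */
static void *toy_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

static void toy_mempool_destroy(struct toy_mempool *pool)
{
	while (pool->curr_nr > 0)
		free(pool->reserve[--pool->curr_nr]);
	free(pool->reserve);
	free(pool);
}

/* Fill the reserve up front, while memory is still available. */
static struct toy_mempool *toy_mempool_create(size_t min_nr, size_t elem_size)
{
	struct toy_mempool *pool;

	if (min_nr == 0 || elem_size == 0)
		return NULL;
	pool = malloc(sizeof(*pool));
	if (!pool)
		return NULL;
	pool->elem_size = elem_size;
	pool->min_nr = min_nr;
	pool->curr_nr = 0;
	pool->reserve = calloc(min_nr, sizeof(*pool->reserve));
	if (!pool->reserve) {
		free(pool);
		return NULL;
	}
	while (pool->curr_nr < min_nr) {
		void *elem = malloc(elem_size);

		if (!elem) {
			toy_mempool_destroy(pool);
			return NULL;
		}
		pool->reserve[pool->curr_nr++] = elem;
	}
	return pool;
}

/* Try the regular allocator first; dip into the reserve only on failure. */
static void *toy_mempool_alloc(struct toy_mempool *pool, toy_alloc_fn alloc)
{
	void *elem = alloc(pool->elem_size);

	if (elem)
		return elem;
	if (pool->curr_nr > 0)
		return pool->reserve[--pool->curr_nr];
	/* Reserve is empty: the caller has to wait for a free and retry. */
	return NULL;
}

/* Refill the reserve before handing memory back to the allocator. */
static void toy_mempool_free(struct toy_mempool *pool, void *elem)
{
	assert(elem != NULL);
	if (pool->curr_nr < pool->min_nr)
		pool->reserve[pool->curr_nr++] = elem;
	else
		free(elem);
}
```

The forward progress guarantee only holds if every element taken from
the pool is eventually returned without needing another allocation
from the same pool, which is why this fits the IO completion path and
does not generalize to arbitrary allocations.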
-- 
Michal Hocko
SUSE Labs