Message-ID: <20211202170032.GN5112@ziepe.ca>
Date:   Thu, 2 Dec 2021 13:00:32 -0400
From:   Jason Gunthorpe <jgg@...pe.ca>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Leon Romanovsky <leon@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Bixuan Cui <cuibixuan@...ux.alibaba.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, torvalds@...ux-foundation.org,
        w@....eu, keescook@...omium.org
Subject: Re: [PATCH -next] mm: delete oversized WARN_ON() in kvmalloc() calls

On Thu, Dec 02, 2021 at 03:29:47PM +0000, Matthew Wilcox wrote:
> On Thu, Dec 02, 2021 at 05:23:42PM +0200, Leon Romanovsky wrote:
> > The problem is that this WARN_ON() is triggered by the users.
> 
> ... or the problem is that you don't do a sanity check between the user
> and the MM system.  I mean, that's what this conversation is about --
> is it a bug to be asking for this much memory in the first place?
>
> > At least in the RDMA world, users can provide huge sizes and they expect
> > to get plain -ENOMEM and not dump stack, because it happens indirectly
> > to them.
> > 
> > In our case, these two kvcalloc() generates WARN_ON().
> > 
> > 		umem_odp->pfn_list = kvcalloc(
> > 			npfns, sizeof(*umem_odp->pfn_list), GFP_KERNEL);
> 
> Does it really make sense for the user to specify 2^31 PFNs in a single
> call?  I mean, that's 8TB of memory.  Should RDMA put its own limit
> in here, or should it rely on kvmalloc returning -ENOMEM?

I wrote this - I don't think RDMA should put a limit here. What
limit would it use anyhow?

I'm pretty sure database people are already using low TBs here. It is
not absurd when you have DAX, and the biggest user of ODP is with DAX.

If anything, we might get to a point in a few years where 2^31 is
too small and this has to be a better data structure :\
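
For the arithmetic behind those numbers, here is a rough userspace
sketch. It assumes 4 KiB base pages, 8-byte pfn_list entries, and that
the kvmalloc() check under discussion is a size > INT_MAX test. None of
those are stated in this thread, and the helper names are made up for
illustration:

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages (PAGE_SHIFT == 12), as on x86-64. */
#define SKETCH_PAGE_SHIFT 12

/* Bytes of user memory covered by npfns base pages. */
static uint64_t covered_bytes(uint64_t npfns)
{
	return npfns << SKETCH_PAGE_SHIFT;
}

/* Bytes needed for the PFN array itself, assuming 8-byte entries. */
static uint64_t pfn_list_bytes(uint64_t npfns)
{
	return npfns * sizeof(uint64_t);
}

/* Hypothetical stand-in for the oversized-allocation check. */
static bool exceeds_int_max(uint64_t bytes)
{
	return bytes > (uint64_t)INT_MAX;
}
```

Under those assumptions 2^31 PFNs cover 2^43 bytes (8 TiB), and the
array alone would be 2^34 bytes (16 GiB). The INT_MAX threshold is
already crossed at 2^28 entries, which is 1 TiB of mapped memory.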

Maybe an xarray and I should figure out how to use the multi-order
stuff to optimize huge pages?

I'd actually really like to get rid of the pfn_list altogether, I just
haven't figured out how. Its only purpose is to call set_page_dirty(),
and in many cases the pfn should still be in the mm's page table. We
also store another copy of the PFN in the NIC's mapping. Surely one of
these two could do instead?

Jason
