Message-ID: <20190910091624.3knf6mzorkki67nb@box.shutemov.name>
Date: Tue, 10 Sep 2019 12:16:24 +0300
From: "Kirill A. Shutemov" <kirill@...temov.name>
To: Yu Zhao <yuzhao@...gle.com>
Cc: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: avoid slub allocation while holding list_lock

On Mon, Sep 09, 2019 at 03:39:38PM -0600, Yu Zhao wrote:
> On Tue, Sep 10, 2019 at 05:57:22AM +0900, Tetsuo Handa wrote:
> > On 2019/09/10 1:00, Kirill A. Shutemov wrote:
> > > On Mon, Sep 09, 2019 at 12:10:16AM -0600, Yu Zhao wrote:
> > >> If we are already under list_lock, don't call kmalloc(). Otherwise we
> > >> will run into deadlock because kmalloc() also tries to grab the same
> > >> lock.
> > >>
> > >> Instead, allocate pages directly. Given currently page->objects has
> > >> 15 bits, we only need 1 page. We may waste some memory but we only do
> > >> so when slub debug is on.
> > >>
> > >> WARNING: possible recursive locking detected
> > >> --------------------------------------------
> > >> mount-encrypted/4921 is trying to acquire lock:
> > >> (&(&n->list_lock)->rlock){-.-.}, at: ___slab_alloc+0x104/0x437
> > >>
> > >> but task is already holding lock:
> > >> (&(&n->list_lock)->rlock){-.-.}, at: __kmem_cache_shutdown+0x81/0x3cb
> > >>
> > >> other info that might help us debug this:
> > >> Possible unsafe locking scenario:
> > >>
> > >> CPU0
> > >> ----
> > >> lock(&(&n->list_lock)->rlock);
> > >> lock(&(&n->list_lock)->rlock);
> > >>
> > >> *** DEADLOCK ***
> > >>
> > >> Signed-off-by: Yu Zhao <yuzhao@...gle.com>
> > >
> > > Looks sane to me:
> > >
> > > Acked-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> > >
> >
> > Really?
> >
> > Since page->objects is handled as a bitmap, the alignment should be
> > BITS_PER_LONG rather than BITS_PER_BYTE (though in this particular case,
> > get_order() would implicitly align to BITS_PER_BYTE * PAGE_SIZE). But
> > get_order(0) is undefined behavior.
>
> I think we can safely assume PAGE_SIZE is unsigned long aligned and
> page->objects is non-zero.

I think it's better to handle page->objects == 0 gracefully. It should not
happen, but this code handles situations that should not happen.
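
To make the two points above concrete (rounding the bitmap to whole longs, and
not feeding a zero size into the order computation), here is a minimal
userspace sketch. It is not the code from the patch. Every name prefixed with
sketch_ or SKETCH_ is hypothetical, the 4096-byte page size is an assumption,
and sketch_order() only models the intent of get_order(), with size == 0
defined to return 0.

```c
#include <limits.h>
#include <stddef.h>

/* Assumed page size for this sketch; real kernels use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL
/* Bits in an unsigned long, standing in for the kernel's BITS_PER_LONG. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Bytes needed for a bitmap of nbits bits, rounded up to whole
 * unsigned longs, since bitmap helpers operate on longs, not bytes.
 */
static size_t sketch_bitmap_bytes(unsigned long nbits)
{
	size_t longs = (nbits + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;

	return longs * sizeof(unsigned long);
}

/*
 * Smallest order such that (SKETCH_PAGE_SIZE << order) >= size.
 * Unlike the case discussed above, size == 0 is well defined here
 * and yields order 0.
 */
static unsigned int sketch_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/*
 * Allocation order for a bitmap tracking nr_objects objects.
 * A zero count "should not happen", so fall back to a single page
 * instead of passing 0 further down.
 */
static unsigned int sketch_bitmap_order(unsigned long nr_objects)
{
	if (nr_objects == 0)
		return 0;

	return sketch_order(sketch_bitmap_bytes(nr_objects));
}
```

With a 15-bit object count the largest value is 32767, which needs 4096 bytes
of bitmap, so sketch_bitmap_order(32767) stays at order 0 under the assumed
page size.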
--
Kirill A. Shutemov