Message-ID: <CA+eFSM2eDRousxuYBOgwPxjcBg1gotsHB2K1JT-eG0uHKtOaWQ@mail.gmail.com>
Date:	Fri, 28 Aug 2015 20:54:04 +0800
From:	Gavin Guo <gavin.guo@...onical.com>
To:	Dave Chinner <david@...morbit.com>
Cc:	xfs@....sgi.com, linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: Possible memory allocation deadlock in kmem_alloc and hung task
 in xfs_log_commit_cil and xlog_cil_push

On Wed, Jul 8, 2015 at 7:37 AM, Dave Chinner <david@...morbit.com> wrote:
> On Tue, Jul 07, 2015 at 05:29:43PM +0800, Gavin Guo wrote:
>> Hi all,
>>
>> Recently, we observed that there is the error message in
>> Ubuntu-3.13.0-48.80:
>>
>> "XFS: possible memory allocation deadlock in kmem_alloc (mode:0x8250)"
>>
>> repeatedly shows in the dmesg. Temporarily, our workaround is to tune the
>> parameters, such as, vfs_cache_pressure, min_free_kbytes, and dirty_ratio.
>>
>> And we also found that there are different error messages regarding the
>> hung tasks which happened in xfs_log_commit_cil and xlog_cil_push.
>>
>> The log is available at: http://paste.ubuntu.com/11835007/
>>
>> The following link seems to describe the same problem we are seeing:
>>
>> XFS hangs with XFS: possible memory allocation deadlock in kmem_alloc
>> http://oss.sgi.com/archives/xfs/2015-03/msg00172.html
>>
>> I read the mail and found that there might be a change that moves the
>> memory allocation outside the ctx lock. I also read the latest patches
>> from February 2015 to see whether there is any new change in that area.
>> Unfortunately, I didn't find anything regarding the change (maybe I'm
>> not familiar enough with XFS to find the commit). Could someone who is
>> familiar with the code point out the commits related to the bug, if they
>> already exist, or share the status of any plan?
>
> No commits - the approach I thought we might be able to take to
> avoid the problem didn't work out. I have another idea of how we
> might solve the problem, but I haven't had a chance to prototype it
> yet.

I have been reading the code for a while and still can't figure out how to
fix this. My current understanding is that the buddy allocator is running
out of memory, so the XFS kmem_alloc(),

  called by xfs_log_commit_cil->
                xlog_cil_insert_items->
                xlog_cil_insert_format_items->
                kmem_zalloc,

fails, gets stuck in the while loop, and keeps retrying. There are also two
other threads running at the same time:

1). xfs_log_commit_cil->down_read(&cil->xc_ctx_lock);

2). xlog_cil_push->down_write(&cil->xc_ctx_lock);

So both threads are blocked, waiting for the first kmem_zalloc() to
succeed.
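
To make the chain above concrete, here is a minimal userspace sketch of my
understanding. It is not kernel code: every toy_* name is hypothetical, the
lock is a plain state model rather than a real rw_semaphore, and the
"warn every 100 retries" interval is an assumption chosen for illustration.
It only shows the ordering: one committer holds the read side while it
retries the allocation, the push cannot take the write side, and a new
committer queues behind the waiting writer.

```c
/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy reader/writer lock state: a queued writer blocks new readers. */
struct toy_rwsem {
    int  readers;         /* current read-side holders */
    bool writer_waiting;  /* a writer is queued behind the readers */
};

bool toy_read_trylock(struct toy_rwsem *s)
{
    if (s->writer_waiting)
        return false;              /* new readers queue behind the writer */
    s->readers++;
    return true;
}

bool toy_write_trylock(struct toy_rwsem *s)
{
    if (s->readers > 0) {
        s->writer_waiting = true;  /* must wait for the readers to drain */
        return false;
    }
    s->writer_waiting = false;
    return true;
}

void toy_read_unlock(struct toy_rwsem *s)
{
    assert(s->readers > 0);
    s->readers--;
}

/* Allocator stand-in: fails failures_left times, then succeeds. */
int failures_left;

void *toy_alloc(size_t size)
{
    if (failures_left > 0) {
        failures_left--;
        return NULL;
    }
    return malloc(size);
}

/* Retry until the allocation succeeds; *retries counts failed attempts. */
void *toy_alloc_retry(size_t size, int *retries)
{
    void *p;

    *retries = 0;
    while ((p = toy_alloc(size)) == NULL) {
        if (++(*retries) % 100 == 0)   /* assumed warning interval */
            fprintf(stderr,
                    "possible memory allocation deadlock (retry %d)\n",
                    *retries);
    }
    return p;
}

/* Walk through the scenario; returns 0 if the model behaves as described. */
int toy_scenario(void)
{
    struct toy_rwsem ctx_lock = { 0, false };
    int retries;
    void *p;

    if (!toy_read_trylock(&ctx_lock))   /* committer A takes the read side */
        return 1;
    if (toy_write_trylock(&ctx_lock))   /* push must not get the write side */
        return 2;
    if (toy_read_trylock(&ctx_lock))    /* committer B queues behind writer */
        return 3;

    failures_left = 250;                /* memory stays short for a while */
    p = toy_alloc_retry(64, &retries);  /* A retries with the lock held */
    free(p);
    toy_read_unlock(&ctx_lock);         /* only now can the push proceed */

    return toy_write_trylock(&ctx_lock) ? 0 : 4;
}
```

In this model nothing else can make progress until the retry loop ends, which
is why reducing the size of the request, or moving the allocation outside the
lock, looks like the direction to investigate.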

Is there a way to reduce the size of the memory request? Or could you
elaborate on the idea you mentioned? I know this problem cannot be solved
quickly, but I'd like to help if there is any possibility.

Thanks,
Gavin