linux-kernel - Re: Hang in XFS reclaim on 3.7.0-rc3


Message-ID: <CAPVoSvStEdD2uhGQmtb6+qOrme_Cs_AWAuE+dP_XYD5BZyp-kA@mail.gmail.com>
Date:	Tue, 20 Nov 2012 20:45:03 +0100
From:	Torsten Kaiser <just.for.lkml@...glemail.com>
To:	Dave Chinner <david@...morbit.com>
Cc:	xfs@....sgi.com, Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: Hang in XFS reclaim on 3.7.0-rc3

On Tue, Nov 20, 2012 at 12:53 AM, Dave Chinner <david@...morbit.com> wrote:
>        [<ffffffff8108137e>] mark_held_locks+0x7e/0x130
>        [<ffffffff81081a63>] lockdep_trace_alloc+0x63/0xc0
>        [<ffffffff810e9dd5>] kmem_cache_alloc+0x35/0xe0
>        [<ffffffff810dba31>] vm_map_ram+0x271/0x770
>        [<ffffffff811e1316>] _xfs_buf_map_pages+0x46/0xe0
>        [<ffffffff811e222a>] xfs_buf_get_map+0x8a/0x130
>        [<ffffffff81233ab9>] xfs_trans_get_buf_map+0xa9/0xd0
>        [<ffffffff8121bced>] xfs_ialloc_inode_init+0xcd/0x1d0
>
> We shouldn't be mapping buffers there, there's a patch below to fix
> this. It's probably the source of this report, even though I cannot
> lockdep seems to be off with the fairies...

That patch seems to break my system.
After the system started to swap, because I was compiling seamonkey (firefox
turned into the full navigator suite) on a tmpfs, several processes
got stuck and triggered the hung-task check.
Since a kswapd, xfsaild/md4 and flush-9:4 also got stuck, not even a
shutdown worked.

The attached log first contains the hung-task-notices, then the output
from SysRq+W.

After the shutdown got stuck trying to turn off swap, I first tried
SysRq+S, but did not get a 'Done', and on SysRq+U lockdep complained
about a lock imbalance wrt. sb_writer. SysRq+O also no longer
worked, only SysRq+B.

I don't know which one got stuck first, but I'm somewhat suspicious of
the plasma-desktop and the sshd that SysRq+W reported stuck in xfs
reclaim, even though these processes never triggered the hung-task
check.

Torsten

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@...morbit.com
>
> xfs: inode allocation should use unmapped buffers.
>
> From: Dave Chinner <dchinner@...hat.com>
>
> Inode buffers do not need to be mapped as inodes are read or written
> directly from/to the pages underlying the buffer. This fixes a
> regression introduced by commit 611c994 ("xfs: make XBF_MAPPED the
> default behaviour").
>
> Signed-off-by: Dave Chinner <dchinner@...hat.com>
> ---
>  fs/xfs/xfs_ialloc.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
> index 2d6495e..a815412 100644
> --- a/fs/xfs/xfs_ialloc.c
> +++ b/fs/xfs/xfs_ialloc.c
> @@ -200,7 +200,8 @@ xfs_ialloc_inode_init(
>                  */
>                 d = XFS_AGB_TO_DADDR(mp, agno, agbno + (j * blks_per_cluster));
>                 fbuf = xfs_trans_get_buf(tp, mp->m_ddev_targp, d,
> -                                        mp->m_bsize * blks_per_cluster, 0);
> +                                        mp->m_bsize * blks_per_cluster,
> +                                        XBF_UNMAPPED);
>                 if (!fbuf)
>                         return ENOMEM;
>                 /*

View attachment "xfs-reclaim-hang-messages.txt" of type "text/plain" (79805 bytes)