linux-kernel - Re: mm: lru_add_drain

Message-ID: <5516A15E.4080500@suse.cz>
Date:	Sat, 28 Mar 2015 13:41:02 +0100
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Sasha Levin <sasha.levin@...cle.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>
CC:	Johannes Weiner <hannes@...xchg.org>, Mel Gorman <mgorman@...e.de>,
	Rik van Riel <riel@...hat.com>,
	Hugh Dickins <hughd@...gle.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: mm: lru_add_drain_all hangs

On 27.3.2015 22:36, Sasha Levin wrote:
> On 03/27/2015 06:07 AM, Vlastimil Babka wrote:
>>> [ 3614.918852] trinity-c7      D ffff8802f4487b58 26976 16252   9410 0x10000000
>>>> [ 3614.919580]  ffff8802f4487b58 ffff8802f6b98ca8 0000000000000000 0000000000000000
>>>> [ 3614.920435]  ffff88017d3e0558 ffff88017d3e0530 ffff8802f6b98008 ffff88016bad0000
>>>> [ 3614.921219]  ffff8802f6b98000 ffff8802f4487b38 ffff8802f4480000 ffffed005e890002
>>>> [ 3614.922069] Call Trace:
>>>> [ 3614.922346] schedule (./arch/x86/include/asm/bitops.h:311 (discriminator 1) kernel/sched/core.c:2827 (discriminator 1))
>>>> [ 3614.923023] schedule_preempt_disabled (kernel/sched/core.c:2859)
>>>> [ 3614.923707] mutex_lock_nested (kernel/locking/mutex.c:585 kernel/locking/mutex.c:623)
>>>> [ 3614.924486] ? lru_add_drain_all (mm/swap.c:867)
>>>> [ 3614.925211] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2580 kernel/locking/lockdep.c:2622)
>>>> [ 3614.925970] ? lru_add_drain_all (mm/swap.c:867)
>>>> [ 3614.926692] ? mutex_trylock (kernel/locking/mutex.c:621)
>>>> [ 3614.927464] ? mpol_new (mm/mempolicy.c:285)
>>>> [ 3614.928044] lru_add_drain_all (mm/swap.c:867)
>>>> [ 3614.928608] migrate_prep (mm/migrate.c:64)
>>>> [ 3614.929092] SYSC_mbind (mm/mempolicy.c:1188 mm/mempolicy.c:1319)
>>>> [ 3614.929619] ? rcu_eqs_exit_common (kernel/rcu/tree.c:735 (discriminator 8))
>>>> [ 3614.930318] ? __mpol_equal (mm/mempolicy.c:1304)
>>>> [ 3614.930877] ? trace_hardirqs_on (kernel/locking/lockdep.c:2630)
>>>> [ 3614.931485] ? syscall_trace_enter_phase2 (arch/x86/kernel/ptrace.c:1592)
>>>> [ 3614.932184] SyS_mbind (mm/mempolicy.c:1301)
>> That looks like trinity-c7 is waiting on it too, but later on (after some more
>> listings like this for trinity-c7, probably threads?) we have:
>>
> 
> It keeps changing constantly, even in this trace the process is blocking on the mutex

I think these are multiple threads of a process with the same name, trinity-c7,
and thread 16935 of trinity-c7 does hold the mutex and is waiting on
something else.

> rather than doing something useful, and in the next trace it's a different process.

And the next trace is from the same run, just later, i.e. it doesn't hang
completely, but progress is so slow that the 20-minute hang timer catches
it? I'm not sure here.

If it's too slow, I can imagine it could be simply optimized - if one thread
manages to lock the mutex, it can tell all threads waiting *at that moment* that
they can just return when the first thread is done - it has done the necessary
work for all of them already. But I wonder if this contention happens in
practice. And that certainly doesn't explain any regression that apparently occurred.
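
To make the idea concrete, here is a rough userspace sketch, assuming pthreads
and C11 atomics instead of kernel primitives. All names (drain_gen,
drain_snapshot, drain_all_since, do_expensive_drain) are made up for
illustration and are not taken from mm/swap.c. A caller snapshots a generation
counter before taking the mutex; if the counter has moved by the time it gets
the lock, a full drain started after the snapshot and has already finished, so
the caller can return without redoing the work.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names throughout; a sketch, not the kernel code. */
static pthread_mutex_t drain_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_ulong drain_gen;    /* bumped each time a drain starts */
static unsigned long drains_done; /* protected by drain_lock */

/* Stand-in for the expensive per-CPU work. */
static void do_expensive_drain(void)
{
	drains_done++;
}

/* Taken on entry, before waiting on the mutex. */
unsigned long drain_snapshot(void)
{
	return atomic_load(&drain_gen);
}

/*
 * Returns true if this caller had to do the drain itself, false if a
 * drain that started after the snapshot 'seen' already covered it.
 */
bool drain_all_since(unsigned long seen)
{
	bool did_work = false;
	int rc = pthread_mutex_lock(&drain_lock);

	assert(rc == 0);
	if (atomic_load(&drain_gen) == seen) {
		/* Nobody started a drain since we arrived: do it now. */
		atomic_fetch_add(&drain_gen, 1);
		do_expensive_drain();
		did_work = true;
	}
	pthread_mutex_unlock(&drain_lock);
	return did_work;
}

bool drain_all_batched(void)
{
	return drain_all_since(drain_snapshot());
}
```

Note that a caller arriving while a drain is already running sees the bumped
counter in its snapshot and is not covered by that drain, which matches the
"waiting *at that moment*" restriction above. Only threads that were already
waiting when the lock holder started its drain get to skip the work.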

> 
> Thanks,
> Sasha
> 
