linux-kernel - Re: [linux-next: Tree for Jun 1] __khugepaged_exit rwsem_down_write

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <0c47a3a0-5530-b257-1c1f-28ed44ba97e6@suse.cz>
Date:	Thu, 2 Jun 2016 15:24:05 +0200
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Ebru Akagunduz <ebru.akagunduz@...il.com>
Cc:	Michal Hocko <mhocko@...nel.org>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Stephen Rothwell <sfr@...b.auug.org.au>, linux-mm@...ck.org,
	linux-next@...r.kernel.org, linux-kernel@...r.kernel.org,
	Rik van Riel <riel@...hat.com>,
	Andrea Arcangeli <aarcange@...hat.com>
Subject: Re: [linux-next: Tree for Jun 1] __khugepaged_exit
 rwsem_down_write_failed lockup

[+CC's]

On 06/02/2016 03:48 AM, Sergey Senozhatsky wrote:
> On (06/01/16 13:11), Stephen Rothwell wrote:
>> Hi all,
>>
>> Changes since 20160531:
>>
>> My fixes tree contains:
>>
>>   of: silence warnings due to max() usage
>>
>> The arm tree gained a conflict against Linus' tree.
>>
>> Non-merge commits (relative to Linus' tree): 1100
>>  936 files changed, 38159 insertions(+), 17475 deletions(-)
>
> Hello,
>
> the cc1 process ended up in DN state during kernel -j4 compilation.
>
> ...
> [ 2856.323052] INFO: task cc1:4582 blocked for more than 21 seconds.
> [ 2856.323055]       Not tainted 4.7.0-rc1-next-20160601-dbg-00012-g52c180e-dirty #453
> [ 2856.323056] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 2856.323059] cc1             D ffff880057e9fd78     0  4582   4575 0x00000000
> [ 2856.323062]  ffff880057e9fd78 ffff880057e08000 ffff880057e9fd90 ffff880057ea0000
> [ 2856.323065]  ffff88005dc3dc68 ffffffff00000001 ffff880057e09500 ffff88005dc3dc80
> [ 2856.323067]  ffff880057e9fd90 ffffffff81441e33 ffff88005dc3dc68 ffff880057e9fe00
> [ 2856.323068] Call Trace:
> [ 2856.323074]  [<ffffffff81441e33>] schedule+0x83/0x98
> [ 2856.323077]  [<ffffffff81443d9b>] rwsem_down_write_failed+0x18e/0x1d3
> [ 2856.323080]  [<ffffffff810a87cf>] ? unlock_page+0x2b/0x2d
> [ 2856.323083]  [<ffffffff811bdb77>] call_rwsem_down_write_failed+0x17/0x30
> [ 2856.323084]  [<ffffffff811bdb77>] ? call_rwsem_down_write_failed+0x17/0x30
> [ 2856.323086]  [<ffffffff81443630>] down_write+0x1f/0x2e
> [ 2856.323089]  [<ffffffff810ea4f3>] __khugepaged_exit+0x104/0x11a
> [ 2856.323091]  [<ffffffff8103702a>] mmput+0x29/0xc5
> [ 2856.323093]  [<ffffffff8103bbd8>] do_exit+0x34c/0x894
> [ 2856.323095]  [<ffffffff8102f9e0>] ? __do_page_fault+0x2f7/0x399
> [ 2856.323097]  [<ffffffff8103c188>] do_group_exit+0x3c/0x98
> [ 2856.323099]  [<ffffffff8103c1f3>] SyS_exit_group+0xf/0xf
> [ 2856.323101]  [<ffffffff81444cdb>] entry_SYSCALL_64_fastpath+0x13/0x8f
>
> [ 2877.322853] INFO: task cc1:4582 blocked for more than 21 seconds.
> [ 2877.322858]       Not tainted 4.7.0-rc1-next-20160601-dbg-00012-g52c180e-dirty #453
> [ 2877.322858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 2877.322861] cc1             D ffff880057e9fd78     0  4582   4575 0x00000000
> [ 2877.322865]  ffff880057e9fd78 ffff880057e08000 ffff880057e9fd90 ffff880057ea0000
> [ 2877.322867]  ffff88005dc3dc68 ffffffff00000001 ffff880057e09500 ffff88005dc3dc80
> [ 2877.322867]  ffff880057e9fd90 ffffffff81441e33 ffff88005dc3dc68 ffff880057e9fe00
> [ 2877.322870] Call Trace:
> [ 2877.322875]  [<ffffffff81441e33>] schedule+0x83/0x98
> [ 2877.322878]  [<ffffffff81443d9b>] rwsem_down_write_failed+0x18e/0x1d3
> [ 2877.322881]  [<ffffffff810a87cf>] ? unlock_page+0x2b/0x2d
> [ 2877.322884]  [<ffffffff811bdb77>] call_rwsem_down_write_failed+0x17/0x30
> [ 2877.322885]  [<ffffffff811bdb77>] ? call_rwsem_down_write_failed+0x17/0x30
> [ 2877.322887]  [<ffffffff81443630>] down_write+0x1f/0x2e
> [ 2877.322890]  [<ffffffff810ea4f3>] __khugepaged_exit+0x104/0x11a
> [ 2877.322892]  [<ffffffff8103702a>] mmput+0x29/0xc5
> [ 2877.322894]  [<ffffffff8103bbd8>] do_exit+0x34c/0x894
> [ 2877.322896]  [<ffffffff8102f9e0>] ? __do_page_fault+0x2f7/0x399
> [ 2877.322898]  [<ffffffff8103c188>] do_group_exit+0x3c/0x98
> [ 2877.322900]  [<ffffffff8103c1f3>] SyS_exit_group+0xf/0xf
> [ 2877.322902]  [<ffffffff81444cdb>] entry_SYSCALL_64_fastpath+0x13/0x8f

I think it's this patch:

http://ozlabs.org/~akpm/mmots/broken-out/mm-thp-make-swapin-readahead-under-down_read-of-mmap_sem.patch

Some parts of the code in collapse_huge_page() that were under 
down_write(mmap_sem) are under down_read() after the patch. But there's 
"goto out" which continues via "goto out_up_write" which does 
up_write(mmap_sem) so there's an imbalance. One path seems to go via 
both up_read() and up_write(). I can imagine this can cause a stuck 
down_write() among other things?