Message-ID: <c2004d9c-1c57-756b-31ad-d88afd459a2b@sandisk.com>
Date: Mon, 15 Aug 2016 16:39:57 -0700
From: Bart Van Assche <bart.vanassche@...disk.com>
To: Oleg Nesterov <oleg@...hat.com>
CC: Peter Zijlstra <peterz@...radead.org>,
"mingo@...nel.org" <mingo@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
"Johannes Weiner" <hannes@...xchg.org>, Neil Brown <neilb@...e.de>,
Michael Shaver <jmshaver@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
On 08/13/2016 09:32 AM, Oleg Nesterov wrote:
> On 08/12, Bart Van Assche wrote:
>> before I started testing. It took some time
>> before I could reproduce the hang in truncate_inode_pages_range().
>
> all I can say this contradicts with the previous testing results with
> my previous patch or with your change in abort_exclusive_wait().
Hello Oleg,
I think all this shows is that we do not yet fully understand what is
going on.
BTW, I have improved my page lock owner instrumentation patch such that
it prints a call stack of the lock owner if lock_page() takes too long.
The following call stack was reported:
__lock_page / pid 8549 / m 0x2: timeout - continuing to wait for 8549
[<ffffffff8102b316>] save_stack_trace+0x26/0x50
[<ffffffff81152bee>] add_to_page_cache_lru+0x7e/0x170
[<ffffffff8121bfc5>] mpage_readpages+0xc5/0x170
[<ffffffff81215548>] blkdev_readpages+0x18/0x20
[<ffffffff81163a68>] __do_page_cache_readahead+0x268/0x310
[<ffffffff811640a8>] force_page_cache_readahead+0xa8/0x100
[<ffffffff81164139>] page_cache_sync_readahead+0x39/0x40
[<ffffffff81153967>] generic_file_read_iter+0x707/0x920
[<ffffffff81215920>] blkdev_read_iter+0x30/0x40
[<ffffffff811d4b4b>] __vfs_read+0xbb/0x130
[<ffffffff811d4f31>] vfs_read+0x91/0x130
[<ffffffff811d62b4>] SyS_read+0x44/0xa0
[<ffffffff816281e5>] entry_SYSCALL_64_fastpath+0x18/0xa8
My understanding of mpage_readpages() is that the page is unlocked only
after the readahead I/O has completed (see also page_endio()). So this
probably means that an I/O request submitted by the readahead code did
not complete. I will check whether I can find anything wrong in the
block layer.
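
To make that reasoning concrete, here is a minimal single-threaded
userspace sketch. It is not kernel code: struct model_page and the
model_*() helpers are hypothetical names made up for illustration, and
the polling loop only stands in for the timeout in the instrumentation
patch. It assumes the page is added to the page cache locked by the
reader, and that I/O completion is the only place where that lock is
dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a page cache page; NOT the kernel's struct page. */
struct model_page {
    bool locked;      /* models PG_locked */
    int  owner_pid;   /* models the lock-owner instrumentation */
    bool io_pending;  /* a read request has been submitted for this page */
};

/* Readahead: the page enters the cache already locked, then I/O is submitted. */
static void model_readahead_submit(struct model_page *p, int pid)
{
    p->locked = true;
    p->owner_pid = pid;
    p->io_pending = true;
}

/* I/O completion: in this model the only place the readahead lock is dropped. */
static void model_page_endio(struct model_page *p)
{
    assert(p->io_pending);  /* completion without a submitted request is a bug */
    p->io_pending = false;
    p->locked = false;
    p->owner_pid = 0;
}

/*
 * Poll for the page lock at most max_polls times.
 * Returns 0 if the lock was taken, -1 if the page stayed locked (timeout).
 */
static int model_lock_page_timeout(struct model_page *p, int pid, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (!p->locked) {
            p->locked = true;
            p->owner_pid = pid;
            return 0;
        }
    }
    fprintf(stderr, "model lock_page / pid %d: timeout, owner is pid %d\n",
            pid, p->owner_pid);
    return -1;
}
```

In this model, calling model_readahead_submit() and then
model_lock_page_timeout() with the same pid times out and reports the
caller as the owner, as in the report above. Only after
model_page_endio() has run does the lock attempt succeed, which is why a
request that never completes would leave the reader waiting forever.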
Bart.