linux-kernel - Re: [PATCH] lock_page() doesn't lock if __wait_on_bit


Message-ID: <20151214200151.GA12014@clm-mbp.thefacebook.com>
Date:	Mon, 14 Dec 2015 15:01:51 -0500
From:	Chris Mason <clm@...com>
To:	Dave Jones <dsj@...com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Jon Christopherson <jon@...s.org>, NeilBrown <neilb@...e.de>,
	Ingo Molnar <mingo@...nel.org>,
	David Howells <dhowells@...hat.com>,
	Steven Whitehouse <swhiteho@...hat.com>
Subject: Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns
 -EINTR

On Mon, Dec 14, 2015 at 01:33:56PM -0500, Dave Jones wrote:
> On Sat, Dec 12, 2015 at 07:07:46PM -0500, Chris Mason wrote:
>  > On Sat, Dec 12, 2015 at 11:41:26AM -0800, Linus Torvalds wrote:
>  > > On Sat, Dec 12, 2015 at 10:33 AM, Linus Torvalds
>  > > <torvalds@...ux-foundation.org> wrote:
>  > > >
>  > > > Peter, did that patch also handle just plain "lock_page()" case?
>  > > 
>  > > Looking more at it, I think this all goes back to commit 743162013d40
>  > > ("sched: Remove proliferation of wait_on_bit() action functions").
>  > > 
>  > > It looks like PeterZ's pending patch should fix this, by passing in
>  > > the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going
>  > > back to signal_pending_state(). PeterZ, did I follow the history of
>  > > this correctly?
>  > 
>  > Looks right to me, I found Peter's patch and have it running now. After
>  > about 6 hours my patch did eventually crash again under trinity.  Btrfs has a
>  > very old (from 2011) bug in the error handling path that trinity is
>  > banging on.
> 
> Is the other bug this one ? I've hit this quite a lot over the last 12 months,
> and now that the lock_page bug is fixed this is showing up again.
> 
> page:ffffea00110d2700 count:4 mapcount:0 mapping:ffff88045b5160a0 index:0x0
> flags: 0x8000000000000806(error|referenced|private)
> page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))

[ snip ]

>  [<ffffffffc00f17f9>] prepare_uptodate_page+0x39/0x80 [btrfs]
>  [<ffffffffc00f19de>] prepare_pages+0x19e/0x210 [btrfs]

This should be the second call to prepare_uptodate_page() in
prepare_pages(). If the first call returns an error and the write
only spans a single page, we call prepare_uptodate_page() a second time
on an unlocked page.
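
To make the single-page case concrete, here is a minimal userspace model
of that control flow. It is only a sketch, not the btrfs code and not the
patch mentioned below: struct page_model, prepare_uptodate_model(),
prepare_pages_model(), the read_fails and check_err parameters, and the
unlocked_touches counter are all hypothetical names invented for
illustration. The model assumes the failing first call leaves the page
unlocked, and that page 0 is both the first and the last page when the
write covers one page.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the two bits we care about. */
struct page_model {
	bool locked;
	bool uptodate;
};

/* Counts calls that were made on a page that is not locked. */
static int unlocked_touches;

/*
 * Model of the per-page helper: if the page is not up to date, "read" it.
 * On a read error the page is unlocked and -EIO is returned (assumption).
 */
static int prepare_uptodate_model(struct page_model *p, bool read_fails)
{
	if (!p->locked) {
		/* The real kernel would trip an assertion around here. */
		unlocked_touches++;
		return -EIO;
	}
	if (!p->uptodate) {
		if (read_fails) {
			p->locked = false;
			return -EIO;
		}
		p->uptodate = true;
	}
	return 0;
}

/*
 * Model of the caller: the first and the last page of the write each get
 * a call. With n == 1 both conditions match the same page. When check_err
 * is false, the second call runs even if the first one failed.
 */
static int prepare_pages_model(struct page_model *pages, int n,
			       bool read_fails, bool check_err)
{
	int err = 0;

	assert(n > 0);
	for (int i = 0; i < n; i++) {
		pages[i].locked = true;
		if (i == 0)
			err = prepare_uptodate_model(&pages[i], read_fails);
		if ((!check_err || !err) && i == n - 1)
			err = prepare_uptodate_model(&pages[i], read_fails);
		if (err)
			return err;
	}
	return 0;
}
```

Calling prepare_pages_model() with n == 1, read_fails == true and
check_err == false makes the second call land on the unlocked page, which
the counter records. With check_err == true the error from the first call
is seen before the second call is attempted. That guard is one possible
way to avoid the double call in this model, not necessarily what the real
fix does.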

I'll send out the patch a little later this afternoon.

>  [<ffffffffc00f2d21>] __btrfs_buffered_write+0x351/0x8a0 [btrfs]
>  [<ffffffffc00f29d0>] ? btrfs_dirty_pages+0xf0/0xf0 [btrfs]
>  [<ffffffffad2619aa>] ? generic_file_direct_write+0x1aa/0x2c0
>  [<ffffffffad261800>] ? generic_file_read_iter+0xa00/0xa00
>  [<ffffffffc00f866d>] btrfs_file_write_iter+0x6dd/0x800 [btrfs]
>  [<ffffffffad2f694d>] __vfs_write+0x21d/0x260
>  [<ffffffffad2f6730>] ? __vfs_read+0x260/0x260
>  [<ffffffffad12ed32>] ? __lock_is_held+0x92/0xd0
>  [<ffffffffad0ee3b1>] ? preempt_count_sub+0xc1/0x120
>  [<ffffffffad12cd17>] ? percpu_down_read+0x57/0xa0
>  [<ffffffffad2fbd24>] ? __sb_start_write+0xb4/0xf0
>  [<ffffffffad2f7736>] vfs_write+0xf6/0x260
>  [<ffffffffad2f8d4f>] SyS_write+0xbf/0x160
>  [<ffffffffad2f8c90>] ? SyS_read+0x160/0x160
>  [<ffffffffad002017>] ? trace_hardirqs_on_thunk+0x17/0x19
>  [<ffffffffadceab17>] entry_SYSCALL_64_fastpath+0x12/0x6b

-chris