linux-ext4 - [Bug 201631] WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page

Message-ID: <bug-201631-13602-Cvx1eJA3Np@https.bugzilla.kernel.org/>
Date:   Thu, 24 Jan 2019 08:15:52 +0000
From:   bugzilla-daemon@...zilla.kernel.org
To:     linux-ext4@...r.kernel.org
Subject: [Bug 201631] WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927
 .ext4_set_page_dirty+0x70/0xb0

https://bugzilla.kernel.org/show_bug.cgi?id=201631

--- Comment #51 from Jan Kara (jack@...e.cz) ---
(In reply to Aneesh Kumar KV from comment #50)
> (In reply to Jan Kara from comment #47)
> > OK, so it seems to be more and more clear that PPC indeed has some race in
> > page table updates. What I can see in the latest report is:
> > 
> > Clean page (index 92, ino 681741, i_size 828368, flags 7fff0000002016,
> > mapcount 1) with dirty PTE (pte_val c0000005f7fae186) on unmap! Vma flags
> > fb, pgoff 0, file ino 681741
> > ...
> > page 92: b_state 21, b_blocknr 2801084, b_mapped 1452389112002, b_mapped2 0,
> > b_cleaned 1452396217779, now 1452400395514
> > 
> > So "Vma flags fb" shows it's a normal shared, writeable file mapping. Page is
> > somewhere in the middle of the file (file size is 828368, page is at offset
> > 376832). The page has been writeably mapped 11ms ago (you are using ext2
> > filesystem which was confusing my previous debug attempts so only this one
> > has shown proper times) and written back 4ms ago (which should have
> > writeprotected the pte) but we still have writeable pte now on which the
> > assertion hits. So either page_mkclean() failed to clear the PTE or someone
> > created new writeable PTE without telling ext4.
> > 
> > I'll attach a new version of debug patch to distinguish these two cases.
> 
> The fact that we did try to write out the page at (bh_cleaned
> 1452396217779) implies we should have cleared the _PAGE_WRITE bit, right
> (clear_page_dirty_for_io())?

Yes, clear_page_dirty_for_io() calls page_mkclean(), which clears the
_PAGE_WRITE bit. So at b_cleaned time there should be no writeable PTE.
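
The figures quoted from comment #47 can be rechecked with a little arithmetic.
The sketch below is only a sanity check, under stated assumptions: the
4096-byte page size is inferred from offset 376832 at index 92, the timestamps
are assumed to be in nanoseconds (which matches the "11ms" and "4ms" in the
quoted text), and the VM_* bit values are the low eight flag bits as defined in
include/linux/mm.h.

```python
# Sanity check of the numbers in the quoted debug output.
PAGE_SIZE = 4096  # assumption: inferred from 376832 / 92

index = 92
i_size = 828368
offset = index * PAGE_SIZE              # byte offset of the page in the file
last_index = (i_size - 1) // PAGE_SIZE  # index of the last page of the file

# Assumption: the debug timestamps are nanoseconds.
b_mapped = 1452389112002
b_cleaned = 1452396217779
now = 1452400395514
since_mapped_ms = (now - b_mapped) / 1e6
since_cleaned_ms = (now - b_cleaned) / 1e6

# Low eight vm_flags bits (values as in include/linux/mm.h).
VM_FLAGS = {
    0x01: "VM_READ", 0x02: "VM_WRITE", 0x04: "VM_EXEC", 0x08: "VM_SHARED",
    0x10: "VM_MAYREAD", 0x20: "VM_MAYWRITE", 0x40: "VM_MAYEXEC",
    0x80: "VM_MAYSHARE",
}
vma_flags = 0xfb
set_flags = [name for bit, name in VM_FLAGS.items() if vma_flags & bit]

print("offset", offset, "of last page index", last_index)
print("mapped %.1f ms ago, cleaned %.1f ms ago"
      % (since_mapped_ms, since_cleaned_ms))
print("vma flags:", " ".join(set_flags))
```

With these assumptions, 0xfb is every one of the eight flags except VM_EXEC,
i.e. a shared, readable and writeable mapping, consistent with the reading
given in comment #47.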

> So we should either find that bit cleared in
> pte (if we missed a related tlb flush and tlb still has that pte with
> _PAGE_WRITE) or we find that set. In this case, we find _PAGE_WRITE set in
> the pte during zap. Does that imply we did call finish_fault(), which should
> ideally have resulted in us calling page_mkwrite()?

The race is not clear to me either, but the rule is that if you are creating a
writeable PTE for a page, you must call ->page_mkwrite(). From the debug
output, page_mkclean() was called and there was no ->page_mkwrite() after
that, so there should be no writeable PTE. But somehow there is one, as zapping
reports, so we need to find out who creates it, and when, without calling
->page_mkwrite(). The new version of my debug patch should tell us a bit more.
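
The rule above can be pictured with a toy model. This is not kernel code and
not what the debug patch does; it is a minimal sketch in which the event labels
("page_mkwrite", "set_pte_write", "page_mkclean") and the helper
find_violation() are hypothetical names chosen for illustration.

```python
# Toy model of the rule: a writeable PTE may only be created if
# ->page_mkwrite() has been called since the last page_mkclean().

def find_violation(events):
    """Return the index of the first event breaking the rule, or None.

    Hypothetical event labels:
      "page_mkwrite"  - the filesystem was told a write is coming
      "set_pte_write" - some code path installed a writeable PTE
      "page_mkclean"  - writeback write-protected the page's PTEs
    """
    notified = False  # filesystem notified since the last page_mkclean()?
    for i, ev in enumerate(events):
        if ev == "page_mkwrite":
            notified = True
        elif ev == "page_mkclean":
            notified = False
        elif ev == "set_pte_write" and not notified:
            return i
    return None

# Expected sequence: every writeable PTE is preceded by page_mkwrite.
good = ["page_mkwrite", "set_pte_write", "page_mkclean",
        "page_mkwrite", "set_pte_write"]
# What the report suggests: a writeable PTE appears after page_mkclean
# with no page_mkwrite in between.
bad = ["page_mkwrite", "set_pte_write", "page_mkclean", "set_pte_write"]

print(find_violation(good))
print(find_violation(bad))
```

In this model the second sequence is flagged at its last event, which
corresponds to the situation the warning in ext4_set_page_dirty() detects.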

Note that there are places other than the fault path that manipulate PTEs,
such as page migration, mremap, mprotect, etc. All of these seem to properly
use PTE locks to serialize with page_mkclean() but well... reality is what it
is and there must be a bug somewhere :) After all, there are close to 200 calls
of set_pte_at() in the kernel...
