linux-kernel - Re: [PATCH] mm: Fix force_flush behavior in zap_pte

Message-ID: <CA+55aFw9SLeE1fv1-nKMeB7o0YAFZ85mskYy_izCb7Nh3AiicQ@mail.gmail.com>
Date:	Sun, 4 May 2014 11:31:35 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Richard Weinberger <richard@....at>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>, Dave Jones <davej@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Sasha Levin <sasha.levin@...cle.com>,
	Hugh Dickins <hughd@...gle.com>,
	Toralf Förster <toralf.foerster@....de>
Subject: Re: [PATCH] mm: Fix force_flush behavior in zap_pte_range()

On Sun, May 4, 2014 at 1:34 AM, Richard Weinberger <richard@....at> wrote:
>
> Hmm, I got confused by:
>                         if (PageAnon(page))
>                                 rss[MM_ANONPAGES]--;
>                         else {
>                                 if (pte_dirty(ptent)) {
>                                         force_flush = 1;
>
> Here you set force_flush.

Yes. And it needs to stay set, but we don't want to break out early.

The logic is:

 - if the tlb removal page batching tables fill up, we need to stop
any further batching, and flush the TLB immediately, since we don't
have room for any more entries.

   Thus that case does "force_flush=1" _and_ a "break" out of the loop.

 - if we see dirty shared pages, we need to flush the TLB before we
release the page table lock, but we don't have to stop further
batching.

   So this case just does "force_flush=1", but will continue to loop
over the page tables, since it can happily batch more pages.
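
The two cases above can be sketched as a small standalone model. This is
only an illustration, not the code in mm/memory.c: the toy_* names, the
struct fields and the batch capacity are all made up for the example.
The only things taken from the quoted code are the shape of the loop and
the fact that a zero return from the page-removal helper means "no room
left".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pte; not a kernel type. */
struct toy_pte {
    bool anon;   /* plays the role of PageAnon(page) */
    bool dirty;  /* plays the role of pte_dirty(ptent) */
};

/* Hypothetical stand-in for the page batching table. */
struct toy_batch {
    size_t room;    /* free slots left in the batch */
    size_t queued;  /* pages queued so far */
};

struct toy_result {
    size_t processed;  /* ptes handled before the loop ended */
    bool force_flush;  /* TLB must be flushed before dropping the lock */
};

/* Queue one page and return the remaining room; 0 means the batch is full. */
static size_t toy_remove_page(struct toy_batch *b)
{
    assert(b->room > 0);
    b->queued++;
    return --b->room;
}

static struct toy_result toy_zap_range(const struct toy_pte *ptes, size_t n,
                                       struct toy_batch *b)
{
    struct toy_result r = { 0, false };

    for (size_t i = 0; i < n; i++) {
        r.processed++;

        /* Dirty shared page: remember to flush, but keep batching. */
        if (!ptes[i].anon && ptes[i].dirty)
            r.force_flush = true;

        /* Batch full: remember to flush AND stop batching. */
        if (toy_remove_page(b) == 0) {
            r.force_flush = true;
            break;
        }
    }
    /* force_flush is never cleared inside the loop. */
    return r;
}
```

With a batch that has plenty of room, a dirty file pte in the middle of
the range sets force_flush but every pte is still processed; with a batch
that fills up, the loop stops early with force_flush set. A page-removal
helper that always returns 1 would never take the second branch.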

>                         if (unlikely(!__tlb_remove_page(tlb, page))) {
>                                 force_flush = 1;
>                                 break;
>                         }
>
> And here it cannot get back to 0.

Correct. It *must* not go back to zero, because that would break the
"we had dirty pages, and more room to batch things" case.

> With your patch applied I see lots of BUG: Bad rss-counter state messages on UML (x86_32)
> when fuzzing with trinity the mremap syscall.
> And sometimes I face BUG at mm/filemap.c:202.

I'm suspecting that it's some UML bug that is triggered by the
changes. UML has its own tlb gather logic (I'm not quite sure why), I
wonder what's up.

Also, are the messages coming from UML or from the host kernel? I'm
assuming they are UML.

> After killing a trinity child I start observing the said issues.
>
> e.g.
> fix_range_common: failed, killing current process: 841
> fix_range_common: failed, killing current process: 842
> fix_range_common: failed, killing current process: 843
> BUG: Bad rss-counter state mm:28e69600 idx:0 val:2

That "idx:0" means that it's MM_FILEPAGES. Apparently the killing
ended up resulting in not freeing all the file mapping ptes.
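
For reading the "idx:" field in such messages, here is a minimal decoding
sketch. It assumes the counter order used by kernels of this era (file
pages, then anonymous pages, then swap entries); the toy_* names are
hypothetical and the enum only mirrors that assumed order, so check it
against the tree you are actually running.

```c
#include <assert.h>

/* Assumed counter order; mirrors the kernel's rss counter indices. */
enum toy_mm_counter {
    TOY_MM_FILEPAGES,
    TOY_MM_ANONPAGES,
    TOY_MM_SWAPENTS,
    TOY_NR_MM_COUNTERS
};

/* The reading used above depends on index 0 being the file-page counter. */
static_assert(TOY_MM_FILEPAGES == 0, "idx 0 is the file-page counter");

/* Map the idx value from a "Bad rss-counter state" line to a name. */
static const char *toy_counter_name(int idx)
{
    static const char *const names[TOY_NR_MM_COUNTERS] = {
        "MM_FILEPAGES",
        "MM_ANONPAGES",
        "MM_SWAPENTS",
    };

    if (idx < 0 || idx >= TOY_NR_MM_COUNTERS)
        return "?";
    return names[idx];
}
```

Under that assumption, "idx:0 val:2" reads as two file-backed pages still
accounted to the mm when it was torn down.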

So I'm assuming the real issue is that fix_range_common failure that
triggers this.

Exactly why the new tlb flushing triggers this is not entirely clear,
but I'd take a look at how UML reacts to the whole fact that a forced
flush (which never happened before, because your __tlb_remove_page()
doesn't batch anything up and always returns 1) updates the tlb
start/end fields as it does the tlb_flush_mmu_tlbonly().

             Linus