Message-ID: <5366A63F.9030401@nod.at>
Date: Sun, 04 May 2014 22:42:39 +0200
From: Richard Weinberger <richard@....at>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>, Dave Jones <davej@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Johannes Weiner <hannes@...xchg.org>,
Sasha Levin <sasha.levin@...cle.com>,
Hugh Dickins <hughd@...gle.com>,
Toralf Förster <toralf.foerster@....de>
Subject: Re: [PATCH] mm: Fix force_flush behavior in zap_pte_range()
On 04.05.2014 20:31, Linus Torvalds wrote:
>> With your patch applied I see lots of "BUG: Bad rss-counter state" messages on UML (x86_32)
>> when fuzzing the mremap syscall with trinity.
>> And sometimes I hit a BUG at mm/filemap.c:202.
>
> I'm suspecting that it's some UML bug that is triggered by the
> changes. UML has its own tlb gather logic (I'm not quite sure why), I
> wonder what's up.
I cannot tell why UML has its own tlb gather logic; I suspect nobody
has cared to clean up the code so far.
That said, I converted it to the generic gather logic today, and it works.
Sadly, I'm still facing the same issues (sigh!).
> Also, are the messages coming from UML or from the host kernel? I'm
> assuming they are UML.
From UML directly.
>> After killing a trinity child I start observing the said issues.
>>
>> e.g.
>> fix_range_common: failed, killing current process: 841
>> fix_range_common: failed, killing current process: 842
>> fix_range_common: failed, killing current process: 843
>> BUG: Bad rss-counter state mm:28e69600 idx:0 val:2
>
> That "idx=0" means that it's MM_FILEPAGES. Apparently the killing
> ended up resulting in not freeing all the file mapping pte's.
>
> So I'm assuming the real issue is that fix_range_common failure that
> triggers this.
>
> Exactly why the new tlb flushing triggers this is not entirely clear,
> but I'd take a look at how UML reacts to the whole fact that a forced
> flush (which never happened before, because your __tlb_remove_page()
> doesn't batch anything up and always returns 1) updates the tlb
> start/end fields as it does the tlb_flush_mmu_tlbonly().
Thanks for the pointer, I'll dig deeper into the issue.
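
For anyone following along, here is a minimal userspace sketch of how the
"Bad rss-counter state" line quoted above comes about. It only models the
check the kernel performs when an mm is torn down; `toy_mm`,
`toy_counter_name()` and `toy_check_mm()` are hypothetical names made up for
illustration, not kernel API. The counter order (MM_FILEPAGES first) matches
the kernel enum of that era, which is why idx:0 points at file pages.

```c
#include <stdio.h>

/* Counter indices, in the order used by the kernel at the time. */
enum { MM_FILEPAGES, MM_ANONPAGES, MM_SWAPENTS, NR_MM_COUNTERS };

/* Hypothetical stand-in for the per-mm rss counters. */
struct toy_mm {
	long count[NR_MM_COUNTERS];
};

/* Map a counter index to a readable name (illustration only). */
static const char *toy_counter_name(int idx)
{
	switch (idx) {
	case MM_FILEPAGES: return "MM_FILEPAGES";
	case MM_ANONPAGES: return "MM_ANONPAGES";
	case MM_SWAPENTS:  return "MM_SWAPENTS";
	default:           return "unknown";
	}
}

/*
 * Model of the teardown check: every counter should be back at zero
 * once all mappings are gone. Returns the number of nonzero counters.
 */
static int toy_check_mm(const struct toy_mm *mm)
{
	int i, bad = 0;

	for (i = 0; i < NR_MM_COUNTERS; i++) {
		if (mm->count[i]) {
			printf("BUG: Bad rss-counter state mm:%p idx:%d val:%ld (%s)\n",
			       (const void *)mm, i, mm->count[i],
			       toy_counter_name(i));
			bad++;
		}
	}
	return bad;
}
```

In this model, "idx:0 val:2" corresponds to two file-backed pages that were
still accounted when the mm went away, which would fit the theory that the
killed process left some file mapping ptes unfreed.
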
Thanks,
//richard