Message-ID: <5a046f53.195cb.14db30cc3a3.Coremail.wei_qi_k@163.com>
Date: Tue, 2 Jun 2015 14:54:27 +0800 (CST)
From: weiqi <wei_qi_k@....com>
To: linux-kernel@...r.kernel.org
Subject: PROBLEM: infinite loop do_sparc64_fault with fault_code 2
Hello everyone,
Recently I have been working on a sparc64 machine (32 cores, SMP) running linux-2.6.32, with a 64-bit kernel and 32-bit userspace.
When I run the LTP test case with the command "./kill10 -c100 -g 1 -n 1",
it occasionally gets trapped in an infinite page-fault loop, and
one of the kill10 processes uses 100% CPU. (It is easy to reproduce: just
run the command again and again.)
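For readers without LTP at hand, the pattern that kill10 exercises can be sketched as below. This is only a minimal sketch, not the LTP test itself: run_once() and on_usr1() are hypothetical names, and I have not verified that this reduced form triggers the loop. It forks a child and delivers one signal to it, so the kernel has to write a signal frame onto the child's user stack shortly after fork().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

/* Handler only records that the signal arrived. */
static void on_usr1(int signo)
{
    (void)signo;
    got_signal = 1;
}

/*
 * Hypothetical helper (not part of LTP): fork a child, send it SIGUSR1,
 * and wait for it. Returns 0 if the child handled the signal and exited
 * cleanly, -1 on any error.
 */
int run_once(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    pid_t pid;
    int status;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) < 0)
        return -1;

    /* Block SIGUSR1 before fork() so the child cannot miss it. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: atomically unblock SIGUSR1 and sleep until it arrives. */
        wait_mask = old;
        sigdelset(&wait_mask, SIGUSR1);
        while (!got_signal)
            sigsuspend(&wait_mask);
        _exit(0);
    }

    /* Parent: the signal stays pending until the child calls sigsuspend(). */
    if (kill(pid, SIGUSR1) < 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    sigprocmask(SIG_SETMASK, &old, NULL);

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling run_once() repeatedly from main() mirrors "run the command again and again". If the loop described in this report were hit, the child would spin at 100% CPU and waitpid() would never return.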
After some debugging, I found:
1) The fault address is always the same and always lies on kill10's user stack, for example "0xffb0b470".
2) The fault happens in put_user() while kill10 is handling a signal.
The code path is arch/sparc/kernel/signal32.c: setup_frame32() -->
put_user().
3) The first fault is handled by do_wp_page()
because of COW; do_wp_page() finds PageAnon(old_page) and
reuses old_page.
4) After that, the process goes into an infinite fault loop with fault_code 2 (D-TLB
miss). Each fault is handled by handle_pte_fault(), which exits through the
flush_tlb_page() call that carries this comment:
	/*
	 * This is needed only for protection faults but the arch code
	 * is not yet telling us if this is a protection fault or not.
	 * This still avoids useless tlb flushes for .text page faults
	 * with threads.
	 */
	if (flags & FAULT_FLAG_WRITE)
		flush_tlb_page(vma, address);
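To make step 4 concrete, here is a small user-space toy model of the tail of handle_pte_fault() as I read it in mm/memory.c of that kernel generation. It is an assumption-laden sketch, not kernel code: struct fault_stats, MODEL_FAULT_WRITE and model_pte_fault_tail() are made-up names, and the counters merely stand in for update_mmu_cache() and flush_tlb_page().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters standing in for the real arch hooks. */
struct fault_stats {
    int mmu_cache_updates; /* models calls to update_mmu_cache() */
    int tlb_flushes;       /* models calls to flush_tlb_page() */
};

/* Stands in for FAULT_FLAG_WRITE; the value is arbitrary in this model. */
enum { MODEL_FAULT_WRITE = 0x01 };

/*
 * pte_changed models the return value of ptep_set_access_flags():
 * true when the young/dirty bits of the PTE actually had to be updated.
 */
void model_pte_fault_tail(struct fault_stats *st, bool pte_changed,
                          unsigned int flags)
{
    if (pte_changed) {
        /* The PTE was modified, so the arch is told about the new entry. */
        st->mmu_cache_updates++;
    } else if (flags & MODEL_FAULT_WRITE) {
        /* Nothing to change in the PTE: only a TLB flush is issued. */
        st->tlb_flushes++;
    }
}
```

In this model, once do_wp_page() has made the PTE writable and dirty, every later write fault on the same address sees pte_changed == false and only bumps tlb_flushes, which matches the exit point observed above.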
I've also tested with linux-3.10 and got almost the same result.
I know sparc handles TLB misses in software. do_wp_page()
calls flush_tlb_page() and update_mmu_cache(), but this seems to have
no effect: the D-TLB miss just repeats infinitely at the same address.