Message-ID: <5a046f53.195cb.14db30cc3a3.Coremail.wei_qi_k@163.com>
Date:	Tue, 2 Jun 2015 14:54:27 +0800 (CST)
From:	weiqi <wei_qi_k@....com>
To:	linux-kernel@...r.kernel.org
Subject: PROBLEM: infinite loop do_sparc64_fault with fault_code 2


Hello everyone,

Recently I have been working on a sparc64 machine (32 cores, SMP) running linux-2.6.32, with a 64-bit kernel and 32-bit userspace.
 
When I run the LTP test case with the command "./kill10 -c100 -g 1 -n 1", it occasionally gets trapped in an infinite page-fault loop, and one of the kill10 processes uses 100% CPU. (This is easy to reproduce: just run the command again and again.)

After some debugging, I found:

1) The fault address is always the same, and it is always on kill10's user stack, for example "0xffb0b470".

  
 
2) The fault happens when kill10 handles a signal, in put_user(). Code path: arch/sparc/kernel/signal32.c: setup_frame32() --> put_user().

3) The first fault is handled by do_wp_page() because of COW; do_wp_page() finds PageAnon(old_page) and reuses old_page.

   
4) It then goes into an infinite fault loop with fault_code 2 (D-TLB miss). Each of these faults is handled by handle_pte_fault(), which exits through the flush_tlb_page() call that has this comment:
                /*
                 * This is needed only for protection faults but the arch code
                 * is not yet telling us if this is a protection fault or not.
                 * This still avoids useless tlb flushes for .text page faults
                 * with threads.
                 */
                if (flags & FAULT_FLAG_WRITE)
                        flush_tlb_page(vma, address);

I have also tested with linux-3.10, with almost the same result.
  
I know sparc handles TLB misses in software. do_wp_page() calls flush_tlb_page() and update_mmu_cache(), but this seems to have no effect: the D-TLB miss just repeats infinitely at the same address.
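
For anyone who wants to look at the sequence from userspace, here is a minimal sketch of the scenario in points 2) and 3): a forked child, whose stack is shared copy-on-write with its parent, takes a signal, so the kernel has to write the signal frame to the child's user stack. This is not the kill10 source and is not a confirmed reproducer of the loop. The function name signal_on_cow_stack() and the use of SIGUSR1 are my own assumptions for illustration, and nothing guarantees that the frame lands on a page that is still shared at that moment.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal;

static void on_usr1(int signo)
{
	(void)signo;
	got_signal = 1;
}

/*
 * Hypothetical helper (not from kill10): fork a child and deliver SIGUSR1
 * to it. Returns 1 if the child ran the handler and exited normally,
 * 0 if it did not, -1 on a setup error.
 */
int signal_on_cow_stack(void)
{
	struct sigaction sa;
	sigset_t block, old, wait_mask;
	pid_t pid;
	int status;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_usr1;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) < 0)
		return -1;

	/* Block SIGUSR1 so the child cannot miss it before it starts waiting. */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
		return -1;
	wait_mask = old;
	sigdelset(&wait_mask, SIGUSR1);

	pid = fork();
	if (pid < 0) {
		sigprocmask(SIG_SETMASK, &old, NULL);
		return -1;
	}

	if (pid == 0) {
		/*
		 * Child: anonymous stack pages are copy-on-write after fork().
		 * When the signal is delivered, the kernel builds the signal
		 * frame on this stack (setup_frame32() --> put_user() for a
		 * 32-bit task on sparc64), which is a write to user memory.
		 */
		while (!got_signal)
			sigsuspend(&wait_mask);
		_exit(0);
	}

	/* Parent: send the signal, then collect the child's exit status. */
	if (kill(pid, SIGUSR1) < 0) {
		sigprocmask(SIG_SETMASK, &old, NULL);
		return -1;
	}
	if (waitpid(pid, &status, 0) < 0) {
		sigprocmask(SIG_SETMASK, &old, NULL);
		return -1;
	}
	sigprocmask(SIG_SETMASK, &old, NULL);

	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling signal_on_cow_stack() many times in a loop from main(), built as a 32-bit binary, would be the closest analogue to running the kill10 command again and again. On an affected machine I would expect a hang to show up as a child that never exits, but I have not verified that this sketch triggers it.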
