Date:   Tue, 22 Nov 2022 16:27:47 +0000
From:   Shakeel Butt <shakeelb@...gle.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Matthew Wilcox <willy@...radead.org>,
        David Hildenbrand <david@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>, Peter Xu <peterx@...hat.com>,
        Yang Shi <shy828301@...il.com>,
        John Hubbard <jhubbard@...dia.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Sidhartha Kumar <sidhartha.kumar@...cle.com>,
        Muchun Song <songmuchun@...edance.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
        Mina Almasry <almasrymina@...gle.com>,
        James Houghton <jthoughton@...gle.com>,
        "Zach O'Keefe" <zokeefe@...gle.com>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH 0/3] mm,thp,rmap: rework the use of subpages_mapcount

On Mon, Nov 21, 2022 at 09:16:58AM -0800, Linus Torvalds wrote:
> On Mon, Nov 21, 2022 at 8:59 AM Shakeel Butt <shakeelb@...gle.com> wrote:
> >
> > Is there a plan to remove lock_page_memcg() altogether which I missed? I
> > am planning to make lock_page_memcg() a nop for cgroup-v2 (as it shows
> > up in the perf profile on exit path)
> 
> Yay. It seems I'm not the only one hating it.
> 
> > but if we are removing it then I should just wait.
> 
> Well, I think Johannes was saying that at least the case I disliked
> (the rmap removal from the page table tear-down - I strongly suspect
> it's the one you're seeing on your perf profile too)

Yes indeed that is the one.

-   99.89%     0.00%  fork-large-mmap  [kernel.kallsyms]  [k] entry_SYSCALL_64_after_hw
     entry_SYSCALL_64_after_hwframe                               
   - do_syscall_64                                                
      - 48.94% __x64_sys_exit_group                               
           do_group_exit                                          
         - do_exit                                                
            - 48.94% exit_mm                                      
                 mmput                                            
               - __mmput                                          
                  - exit_mmap                                     
                     - 48.61% unmap_vmas                          
                        - 48.61% unmap_single_vma                 
                           - unmap_page_range                     
                              - 48.60% zap_p4d_range              
                                 - 44.66% zap_pte_range           
                                    + 12.61% tlb_flush_mmu        
                                    - 9.38% page_remove_rmap      
                                         2.50% lock_page_memcg    
                                         2.37% unlock_page_memcg  
                                         0.61% PageHuge           
                                      4.80% vm_normal_page        
                                      2.56% __tlb_remove_page_size
                                      0.85% lock_page_memcg       
                                      0.53% PageHuge              
                                   2.22% __tlb_remove_page_size   
                                   0.93% vm_normal_page           
                                   0.72% page_remove_rmap
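
To put a number on it, the lock_page_memcg/unlock_page_memcg entries in the profile above can be added up. This is only a rough sketch: the function names below are made up for illustration, and it assumes (from the indentation) that the 0.85% lock_page_memcg entry is a separate call site directly under zap_pte_range rather than a duplicate of the 2.50% one.

```c
/* Percentages are copied from the perf output above; nothing is measured here. */

/* Total share of samples attributed to lock_page_memcg/unlock_page_memcg. */
static double memcg_lock_share(void)
{
    const double lock_under_rmap   = 2.50; /* lock_page_memcg under page_remove_rmap */
    const double unlock_under_rmap = 2.37; /* unlock_page_memcg under page_remove_rmap */
    const double lock_under_zap    = 0.85; /* assumed: call site directly under zap_pte_range */

    return lock_under_rmap + unlock_under_rmap + lock_under_zap;
}

/* Express one percentage as a fraction (in percent) of another. */
static double share_of(double part, double whole)
{
    return 100.0 * part / whole;
}
```

With these numbers, memcg_lock_share() comes to about 5.7% of all samples, and share_of(memcg_lock_share(), 44.66) puts it at roughly 13% of the time spent in zap_pte_range.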

> can be removed
> entirely as long as it's done under the page table lock (which my
> final version of the rmap delaying still was).
> 
> See
> 
>     https://lore.kernel.org/all/Y2llcRiDLHc2kg%2FN@cmpxchg.org/
> 
> for his preliminary patch.
> 
> That said, if you have some patch to make it a no-op for _other_
> reasons, and could be done away with _entirely_ (not just for rmap),
> then that would be even better.

I am actually looking at deprecating the whole "move charge"
functionality of cgroup-v1, i.e. the underlying reason lock_page_memcg
exists. That already does not work for a couple of cases, like partially
mapped THP and madv_free'd pages. That deprecation process would
take some time, though. In the meantime I was looking at whether we can
make these functions a nop for cgroup-v2.
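
To illustrate the idea only, here is a toy userspace model of the nop on cgroup-v2. It is not kernel code and not an actual patch: every name in it (toy_memcg, toy_lock_page_memcg, and so on) is hypothetical, and it simply assumes that without "move charge" there is nothing for the lock to protect against.

```c
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
struct toy_memcg {
    bool on_cgroup_v2;    /* assumption: no "move charge" on v2 */
    int  lock_depth;      /* stands in for the real lock state */
    long slow_path_hits;  /* how often the locking path actually ran */
};

static void toy_lock_page_memcg(struct toy_memcg *m)
{
    if (m->on_cgroup_v2)
        return;           /* nop: the charge cannot move under us */
    m->lock_depth++;
    m->slow_path_hits++;
}

static void toy_unlock_page_memcg(struct toy_memcg *m)
{
    if (m->on_cgroup_v2)
        return;           /* nop, mirrors the lock side */
    m->lock_depth--;
}

/* Simulate tearing down nr_pages mappings, as the exit path does. */
static long toy_zap_pages(struct toy_memcg *m, long nr_pages)
{
    for (long i = 0; i < nr_pages; i++) {
        toy_lock_page_memcg(m);
        /* ... the rmap removal would happen here ... */
        toy_unlock_page_memcg(m);
    }
    return m->slow_path_hits;
}
```

In this model a v1 instance pays for the lock/unlock pair once per page, while a v2 instance never enters the locking path at all, which is the saving being discussed for the exit path.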

thanks,
Shakeel
