Date:   Thu, 10 Mar 2022 17:09:22 -0700
From:   Yu Zhao <yuzhao@...gle.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Minchan Kim <minchan@...nel.org>,
        Ivan Teterevkov <ivan.teterevkov@...anix.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux-MM <linux-mm@...ck.org>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        linux-api@...r.kernel.org, Johannes Weiner <hannes@...xchg.org>,
        Tim Murray <timmurray@...gle.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Suren Baghdasaryan <surenb@...gle.com>, dancol@...gle.com,
        Shakeel Butt <shakeelb@...gle.com>, sonnyrao@...gle.com,
        oleksandr@...hat.com, Hillf Danton <hdanton@...a.com>,
        Benoit Lize <lizeb@...gle.com>,
        Dave Hansen <dave.hansen@...el.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>
Subject: Re: Regression of madvise(MADV_COLD) on shmem?

On Thu, Mar 10, 2022 at 2:01 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 07-03-22 13:10:08, Michal Hocko wrote:
> > On Sat 05-03-22 02:17:37, Yu Zhao wrote:
> > [...]
> > > diff --git a/mm/swap.c b/mm/swap.c
> > > index bcf3ac288b56..7fd99f037ca7 100644
> > > --- a/mm/swap.c
> > > +++ b/mm/swap.c
> > > @@ -563,7 +559,7 @@ static void lru_deactivate_file_fn(struct page
> > > *page, struct lruvec *lruvec)
> > >
> > >  static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec)
> > >  {
> > > -       if (PageActive(page) && !PageUnevictable(page)) {
> > > +       if (!PageUnevictable(page)) {
> > >                 int nr_pages = thp_nr_pages(page);
> > >
> > >                 del_page_from_lru_list(page, lruvec);
> > > @@ -677,7 +673,7 @@ void deactivate_file_page(struct page *page)
> > >   */
> > >  void deactivate_page(struct page *page)
> > >  {
> > > -       if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
> > > +       if (PageLRU(page) && !PageUnevictable(page)) {
> > >                 struct pagevec *pvec;
> > >
> > >                 local_lock(&lru_pvecs.lock);
> > >
> > > I'll leave it to Minchan to decide whether this is worth fixing,
> > > together with this one:
> >
> > > There doesn't seem to be any dependency on PageActive anymore. I do
> > > remember that we relied on PageActive to move pages from the active list
> > > to the inactive one. That is no longer the case, but I am wondering whether
> > > the above is really sufficient. If you are deactivating an inactive page,
> > > then I would expect you want to move that page within the LRU as well. In
> > > other words, don't you want
> >       if (page_active)
> >               add_page_to_lru_list
> >       else
> >               add_page_to_lru_list_tail

Yes, this is better.
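
The placement rule quoted above can be sketched in userspace roughly as
follows. This is a toy model, not kernel code: toy_page, toy_lru and the
toy_* helpers are hypothetical stand-ins for struct page, the per-lruvec
lists, add_page_to_lru_list() and add_page_to_lru_list_tail(), and it
assumes reclaim scans each list from its tail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for struct page and one LRU list; not kernel types. */
struct toy_page {
	int id;
	bool active;
	struct toy_page *prev, *next;
};

struct toy_lru {
	struct toy_page *head;	/* most recently added, reclaimed last */
	struct toy_page *tail;	/* assumed to be reclaimed first */
};

/* Unlink a page from the list it currently sits on. */
static void toy_lru_del(struct toy_lru *lru, struct toy_page *p)
{
	if (p->prev)
		p->prev->next = p->next;
	else
		lru->head = p->next;
	if (p->next)
		p->next->prev = p->prev;
	else
		lru->tail = p->prev;
	p->prev = p->next = NULL;
}

/* Models add_page_to_lru_list(): insert at the head. */
static void toy_lru_add(struct toy_lru *lru, struct toy_page *p)
{
	p->prev = NULL;
	p->next = lru->head;
	if (lru->head)
		lru->head->prev = p;
	else
		lru->tail = p;
	lru->head = p;
}

/* Models add_page_to_lru_list_tail(): insert at the tail. */
static void toy_lru_add_tail(struct toy_lru *lru, struct toy_page *p)
{
	p->next = NULL;
	p->prev = lru->tail;
	if (lru->tail)
		lru->tail->next = p;
	else
		lru->head = p;
	lru->tail = p;
}

/*
 * Deactivate a page: an active page goes to the head of the inactive
 * list, while an already inactive page is rotated to the tail so that
 * it becomes the next reclaim candidate.
 */
static void toy_deactivate(struct toy_lru *active, struct toy_lru *inactive,
			   struct toy_page *p)
{
	bool was_active = p->active;

	toy_lru_del(was_active ? active : inactive, p);
	p->active = false;
	if (was_active)
		toy_lru_add(inactive, p);
	else
		toy_lru_add_tail(inactive, p);
}
```

With this rule, deactivating an already inactive page is no longer a
NOP: the page changes position even though it stays on the same list.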

> Do you plan to send an official patch?

One thing I still haven't thought through is why the A-bit (the accessed
bit) couldn't protect the blob in the test. In theory it should be enough
even though deactivate_page() is a NOP for pages that are already inactive.

1. all pages are initially inactive and have the A-bit set
2. madvise(COLD) clears the A-bit for zero-filled pages (but fails to
change their LRU positions)
3. the memcg hits the limit
4. pages in the blob are moved to the active LRU because those pages
still have the A-bit (zero-filled pages remain inactive)
5. inactive_is_low() tests true and the blob gets deactivated???

The last step doesn't make sense, since the inactive list is still very large.
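
For reference, the check in step 5 can be modelled in userspace along
these lines. This is only a sketch: it assumes inactive_is_low() in
mm/vmscan.c of this era reduces to comparing inactive * ratio against
active, with ratio = int_sqrt(10 * gb) for lists of 1GB or more and 1
otherwise, and it assumes 4KB pages. The toy_* names are hypothetical.

```c
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4KB pages */

/* Integer square root, standing in for the kernel's int_sqrt(). */
static unsigned long toy_int_sqrt(unsigned long x)
{
	unsigned long r = 0;

	while ((r + 1) * (r + 1) <= x)
		r++;
	return r;
}

/*
 * Assumed model of inactive_is_low(): both arguments are page counts
 * for one LRU type. Returns true when the inactive list is considered
 * too small relative to the active list.
 */
static bool toy_inactive_is_low(unsigned long inactive, unsigned long active)
{
	/* Combined list size in whole gigabytes. */
	unsigned long gb = (inactive + active) >> (30 - TOY_PAGE_SHIFT);
	unsigned long ratio = gb ? toy_int_sqrt(10 * gb) : 1;

	return inactive * ratio < active;
}
```

Under this model the check can only test true when the active list is
larger than the inactive list (scaled by the ratio), which is why a
still very large inactive list makes step 5 look wrong.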

Thanks.
