Date:   Thu, 25 Nov 2021 11:24:02 +0800
From:   Hao Lee <haolee.swjtu@...il.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     Linux MM <linux-mm@...ck.org>,
        Johannes Weiner <hannes@...xchg.org>, vdavydov.dev@...il.com,
        Shakeel Butt <shakeelb@...gle.com>, cgroups@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: reduce spinlock contention in release_pages()

On Thu, Nov 25, 2021 at 12:31 AM Michal Hocko <mhocko@...e.com> wrote:
>
> On Wed 24-11-21 15:19:15, Hao Lee wrote:
> > When several tasks are terminated simultaneously, lots of pages will be
> > released, which can cause severe spinlock contention. Other tasks which
> > are running on the same core will be seriously affected. We can yield
> > cpu to fix this problem.
>
> How does this actually address the problem? You are effectively losing
> fairness completely.

Got it. Thanks!

> We do batch currently, so no single task should be
> able to monopolize the cpu for too long. Why is this not sufficient?

Uncharge and unref do take advantage of batching, but del_from_lru
needs more time to complete. If nr is very large, several tasks will
contend for the spinlock inside the loop. We can observe a transient
peak in sys% reflecting this, and perf also reports that the spinlock
slowpath takes too much time. This scenario is not rare, especially when
containers are destroyed simultaneously, and other latency-critical tasks
may be affected by it. So I want to figure out a way to deal with it.
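
To illustrate the trylock-and-yield idea from the patch outside the
kernel, here is a rough userspace model using pthreads. It is only a
sketch under stated assumptions: lock_or_yield(), release_worker(),
run_release() and MAX_WORKERS are hypothetical names, a pthread mutex
stands in for the lruvec spinlock, and sched_yield() stands in for
cond_resched(), which behaves differently in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define MAX_WORKERS 16  /* hypothetical cap, for this sketch only */

/* Userspace stand-ins for the LRU lock and the number of pages on the LRU. */
static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;
static long lru_pages;

/*
 * Try to take the lock; on failure give up the CPU and retry.
 * Returns how many times the caller yielded before acquiring it.
 */
static long lock_or_yield(pthread_mutex_t *lock)
{
    long yields = 0;

    while (pthread_mutex_trylock(lock) != 0) {
        sched_yield();  /* stand-in for cond_resched() */
        yields++;
    }
    return yields;
}

/* Each worker removes nr pages, taking the lock once per page. */
static void *release_worker(void *arg)
{
    long nr = *(long *)arg;

    for (long i = 0; i < nr; i++) {
        lock_or_yield(&lru_lock);
        lru_pages--;
        pthread_mutex_unlock(&lru_lock);
    }
    return NULL;
}

/* Start nthreads workers releasing nr pages each; returns pages left over. */
static long run_release(int nthreads, long nr)
{
    pthread_t tids[MAX_WORKERS];

    assert(nthreads > 0 && nthreads <= MAX_WORKERS);
    lru_pages = nthreads * nr;

    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, release_worker, &nr);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    return lru_pages;
}
```

Calling run_release(4, 10000) models four exiting tasks releasing pages
at the same time. Note that a waiter which yields has no guaranteed place
in line for the lock, which is the fairness concern raised above.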

Thanks.

>
> > diff --git a/mm/swap.c b/mm/swap.c
> > index e8c9dc6d0377..91850d51a5a5 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -960,8 +960,14 @@ void release_pages(struct page **pages, int nr)
> >               if (PageLRU(page)) {
> >                       struct lruvec *prev_lruvec = lruvec;
> >
> > -                     lruvec = folio_lruvec_relock_irqsave(folio, lruvec,
> > +retry:
> > +                     lruvec = folio_lruvec_tryrelock_irqsave(folio, lruvec,
> >                                                                       &flags);
> > +                     if (!lruvec) {
> > +                             cond_resched();
> > +                             goto retry;
> > +                     }
> > +
> >                       if (prev_lruvec != lruvec)
> >                               lock_batch = 0;
> >
> > --
> > 2.31.1
>
> --
> Michal Hocko
> SUSE Labs
