Message-ID: <CAOUHufbt6i2-Z9=+Ngjnhnk8nh8-yYkhpPBi0i_ca8xTsk9mVw@mail.gmail.com>
Date: Thu, 22 Apr 2021 14:15:27 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Xing Zhengjun <zhengjun.xing@...ux.intel.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
Huang Ying <ying.huang@...el.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Michal Hocko <mhocko@...e.com>, wfg@...l.ustc.edu.cn
Subject: Re: [RFC] mm/vmscan.c: avoid possible long latency caused by too_many_isolated()
On Thu, Apr 22, 2021 at 12:52 PM Shakeel Butt <shakeelb@...gle.com> wrote:
>
> On Thu, Apr 22, 2021 at 10:13 AM Yu Zhao <yuzhao@...gle.com> wrote:
> >
> [...]
> > spin_lock_irq(&lruvec->lru_lock);
> > @@ -3302,6 +3252,7 @@ static bool throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
> > unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> > gfp_t gfp_mask, nodemask_t *nodemask)
> > {
> > + int nr_cpus;
> > unsigned long nr_reclaimed;
> > struct scan_control sc = {
> > .nr_to_reclaim = SWAP_CLUSTER_MAX,
> > @@ -3334,8 +3285,17 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> > set_task_reclaim_state(current, &sc.reclaim_state);
> > trace_mm_vmscan_direct_reclaim_begin(order, sc.gfp_mask);
> >
> > + nr_cpus = current_is_kswapd() ? 0 : num_online_cpus();
>
> kswapd does not call this function (directly or indirectly).
>
> > + while (nr_cpus && !atomic_add_unless(&pgdat->nr_reclaimers, 1, nr_cpus)) {
>
> At most nr_nodes * nr_cpus direct reclaimers are allowed?
>
> > + if (schedule_timeout_killable(HZ / 10))
>
> trace_mm_vmscan_direct_reclaim_end() and set_task_reclaim_state(NULL)?
>
> > + return SWAP_CLUSTER_MAX;
> > + }
> > +
> > nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
> >
> > + if (nr_cpus)
> > + atomic_dec(&pgdat->nr_reclaimers);
> > +
> > trace_mm_vmscan_direct_reclaim_end(nr_reclaimed);
> > set_task_reclaim_state(current, NULL);
>
> BTW I think this approach needs to be more sophisticated. What if a
> direct reclaimer within the reclaim is scheduled away and is out of
> CPU quota?
More sophisticated to what end?
We don't worry about similar scenarios in which a task runs out of CPU
quota while holding a resource such as a mutex, so why is this one
different, especially given that we already allow many reclaimers to
run concurrently?
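
For anyone following the quoted hunk, the admission pattern it relies on
(take a slot with atomic_add_unless(), release it with atomic_dec()) can
be sketched in userspace roughly as below. This is only an illustration
built on C11 atomics, not the kernel code: add_unless(),
reclaim_slot_try_get() and reclaim_slot_put() are hypothetical names, and
the limit is passed in by the caller rather than derived from
num_online_cpus().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's atomic_add_unless(): add `a` to *v
 * unless *v currently equals `u`. Returns true if the add happened.
 */
static bool add_unless(atomic_int *v, int a, int u)
{
	int cur = atomic_load(v);

	while (cur != u) {
		/* On failure, `cur` is refreshed with the latest value. */
		if (atomic_compare_exchange_weak(v, &cur, cur + a))
			return true;
	}
	return false;
}

/* Hypothetical helper: try to take one of `limit` reclaim slots. */
static bool reclaim_slot_try_get(atomic_int *nr_reclaimers, int limit)
{
	return add_unless(nr_reclaimers, 1, limit);
}

/* Hypothetical helper: give the slot back once reclaim is done. */
static void reclaim_slot_put(atomic_int *nr_reclaimers)
{
	int prev = atomic_fetch_sub(nr_reclaimers, 1);

	/* Releasing a slot that was never taken is a caller bug. */
	assert(prev > 0);
}
```

A caller that fails reclaim_slot_try_get() would sleep and retry, which
is the role schedule_timeout_killable(HZ / 10) plays in the hunk. A task
that is descheduled between get and put keeps its slot for that whole
time, which is the scenario being asked about above.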