Message-ID: <202202261045.FAsMZlyD-lkp@intel.com>
Date: Sat, 26 Feb 2022 10:32:55 +0800
From: kernel test robot <lkp@...el.com>
To: Shakeel Butt <shakeelb@...gle.com>,
Michal Koutný <mkoutny@...e.com>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>
Cc: kbuild-all@...ts.01.org, Ivan Babrou <ivan@...udflare.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Memory Management List <linux-mm@...ck.org>,
cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
Shakeel Butt <shakeelb@...gle.com>,
Daniel Dao <dqminh@...udflare.com>, stable@...r.kernel.org
Subject: Re: [PATCH] memcg: async flush memcg stats from perf sensitive
codepaths
Hi Shakeel,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on hnaz-mm/master]
[also build test ERROR on linus/master v5.17-rc5 next-20220224]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Shakeel-Butt/memcg-async-flush-memcg-stats-from-perf-sensitive-codepaths/20220226-082444
base: https://github.com/hnaz/linux-mm master
config: um-i386_defconfig (https://download.01.org/0day-ci/archive/20220226/202202261045.FAsMZlyD-lkp@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/5dffeb24975bc4cbe99af650d833eb0183a4882f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Shakeel-Butt/memcg-async-flush-memcg-stats-from-perf-sensitive-codepaths/20220226-082444
        git checkout 5dffeb24975bc4cbe99af650d833eb0183a4882f
        # save the config file to linux build tree
        mkdir build_dir
        make W=1 O=build_dir ARCH=um SUBARCH=i386 SHELL=/bin/bash
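[Editor's note: one possible way to carry out the "save the config file"
comment above, if reproducing by hand. This is an assumed step, not part
of the robot's template; the URL is the config link quoted earlier, and
olddefconfig simply refreshes it for the checked-out tree. Run these
between the mkdir and the final make:

        wget -O build_dir/.config https://download.01.org/0day-ci/archive/20220226/202202261045.FAsMZlyD-lkp@intel.com/config
        make O=build_dir ARCH=um SUBARCH=i386 olddefconfig
]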
If you fix the issue, kindly add the following tag as appropriate
Reported-by: kernel test robot <lkp@...el.com>
All errors (new ones prefixed by >>):
   mm/vmscan.c: In function 'shrink_node':
>> mm/vmscan.c:3191:2: error: implicit declaration of function 'mem_cgroup_flush_stats_async'; did you mean 'mem_cgroup_flush_stats'? [-Werror=implicit-function-declaration]
    3191 |  mem_cgroup_flush_stats_async();
         |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |  mem_cgroup_flush_stats
   cc1: some warnings being treated as errors
--
   mm/workingset.c: In function 'workingset_refault':
>> mm/workingset.c:358:2: error: implicit declaration of function 'mem_cgroup_flush_stats_async'; did you mean 'mem_cgroup_flush_stats'? [-Werror=implicit-function-declaration]
     358 |  mem_cgroup_flush_stats_async();
         |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |  mem_cgroup_flush_stats
   cc1: some warnings being treated as errors
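[Editor's note: the following is an inference from the errors above, not
something the robot states. um-i386_defconfig builds without CONFIG_MEMCG,
and an implicit-declaration error at both call sites suggests the patch
declares mem_cgroup_flush_stats_async() only under CONFIG_MEMCG and omits
a no-op stub for the !CONFIG_MEMCG case. A minimal sketch of such a stub
in include/linux/memcontrol.h, mirroring the existing
mem_cgroup_flush_stats() pattern (hypothetical, not the actual patch):

        /*
         * Hypothetical sketch: provide a no-op fallback so the call
         * sites in mm/vmscan.c and mm/workingset.c compile whether or
         * not CONFIG_MEMCG is enabled.
         */
        #ifdef CONFIG_MEMCG
        void mem_cgroup_flush_stats_async(void);
        #else
        static inline void mem_cgroup_flush_stats_async(void)
        {
        }
        #endif
]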
vim +3191 mm/vmscan.c
  3175	
  3176	static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
  3177	{
  3178		struct reclaim_state *reclaim_state = current->reclaim_state;
  3179		unsigned long nr_reclaimed, nr_scanned;
  3180		struct lruvec *target_lruvec;
  3181		bool reclaimable = false;
  3182		unsigned long file;
  3183	
  3184		target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
  3185	
  3186	again:
  3187		/*
  3188		 * Flush the memory cgroup stats, so that we read accurate per-memcg
  3189		 * lruvec stats for heuristics.
  3190		 */
> 3191		mem_cgroup_flush_stats_async();
  3192	
  3193		memset(&sc->nr, 0, sizeof(sc->nr));
  3194	
  3195		nr_reclaimed = sc->nr_reclaimed;
  3196		nr_scanned = sc->nr_scanned;
  3197	
  3198		/*
  3199		 * Determine the scan balance between anon and file LRUs.
  3200		 */
  3201		spin_lock_irq(&target_lruvec->lru_lock);
  3202		sc->anon_cost = target_lruvec->anon_cost;
  3203		sc->file_cost = target_lruvec->file_cost;
  3204		spin_unlock_irq(&target_lruvec->lru_lock);
  3205	
  3206		/*
  3207		 * Target desirable inactive:active list ratios for the anon
  3208		 * and file LRU lists.
  3209		 */
  3210		if (!sc->force_deactivate) {
  3211			unsigned long refaults;
  3212	
  3213			refaults = lruvec_page_state(target_lruvec,
  3214					WORKINGSET_ACTIVATE_ANON);
  3215			if (refaults != target_lruvec->refaults[0] ||
  3216				inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
  3217				sc->may_deactivate |= DEACTIVATE_ANON;
  3218			else
  3219				sc->may_deactivate &= ~DEACTIVATE_ANON;
  3220	
  3221			/*
  3222			 * When refaults are being observed, it means a new
  3223			 * workingset is being established. Deactivate to get
  3224			 * rid of any stale active pages quickly.
  3225			 */
  3226			refaults = lruvec_page_state(target_lruvec,
  3227					WORKINGSET_ACTIVATE_FILE);
  3228			if (refaults != target_lruvec->refaults[1] ||
  3229			    inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
  3230				sc->may_deactivate |= DEACTIVATE_FILE;
  3231			else
  3232				sc->may_deactivate &= ~DEACTIVATE_FILE;
  3233		} else
  3234			sc->may_deactivate = DEACTIVATE_ANON | DEACTIVATE_FILE;
  3235	
  3236		/*
  3237		 * If we have plenty of inactive file pages that aren't
  3238		 * thrashing, try to reclaim those first before touching
  3239		 * anonymous pages.
  3240		 */
  3241		file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
  3242		if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
  3243			sc->cache_trim_mode = 1;
  3244		else
  3245			sc->cache_trim_mode = 0;
  3246	
  3247		/*
  3248		 * Prevent the reclaimer from falling into the cache trap: as
  3249		 * cache pages start out inactive, every cache fault will tip
  3250		 * the scan balance towards the file LRU. And as the file LRU
  3251		 * shrinks, so does the window for rotation from references.
  3252		 * This means we have a runaway feedback loop where a tiny
  3253		 * thrashing file LRU becomes infinitely more attractive than
  3254		 * anon pages. Try to detect this based on file LRU size.
  3255		 */
  3256		if (!cgroup_reclaim(sc)) {
  3257			unsigned long total_high_wmark = 0;
  3258			unsigned long free, anon;
  3259			int z;
  3260	
  3261			free = sum_zone_node_page_state(pgdat->node_id, NR_FREE_PAGES);
  3262			file = node_page_state(pgdat, NR_ACTIVE_FILE) +
  3263				   node_page_state(pgdat, NR_INACTIVE_FILE);
  3264	
  3265			for (z = 0; z < MAX_NR_ZONES; z++) {
  3266				struct zone *zone = &pgdat->node_zones[z];
  3267				if (!managed_zone(zone))
  3268					continue;
  3269	
  3270				total_high_wmark += high_wmark_pages(zone);
  3271			}
  3272	
  3273			/*
  3274			 * Consider anon: if that's low too, this isn't a
  3275			 * runaway file reclaim problem, but rather just
  3276			 * extreme pressure. Reclaim as per usual then.
  3277			 */
  3278			anon = node_page_state(pgdat, NR_INACTIVE_ANON);
  3279	
  3280			sc->file_is_tiny =
  3281				file + free <= total_high_wmark &&
  3282				!(sc->may_deactivate & DEACTIVATE_ANON) &&
  3283				anon >> sc->priority;
  3284		}
  3285	
  3286		shrink_node_memcgs(pgdat, sc);
  3287	
  3288		if (reclaim_state) {
  3289			sc->nr_reclaimed += reclaim_state->reclaimed_slab;
  3290			reclaim_state->reclaimed_slab = 0;
  3291		}
  3292	
  3293		/* Record the subtree's reclaim efficiency */
  3294		vmpressure(sc->gfp_mask, sc->target_mem_cgroup, true,
  3295			   sc->nr_scanned - nr_scanned,
  3296			   sc->nr_reclaimed - nr_reclaimed);
  3297	
  3298		if (sc->nr_reclaimed - nr_reclaimed)
  3299			reclaimable = true;
  3300	
  3301		if (current_is_kswapd()) {
  3302			/*
  3303			 * If reclaim is isolating dirty pages under writeback,
  3304			 * it implies that the long-lived page allocation rate
  3305			 * is exceeding the page laundering rate. Either the
  3306			 * global limits are not being effective at throttling
  3307			 * processes due to the page distribution throughout
  3308			 * zones or there is heavy usage of a slow backing
  3309			 * device. The only option is to throttle from reclaim
  3310			 * context which is not ideal as there is no guarantee
  3311			 * the dirtying process is throttled in the same way
  3312			 * balance_dirty_pages() manages.
  3313			 *
  3314			 * Once a node is flagged PGDAT_WRITEBACK, kswapd will
  3315			 * count the number of pages under pages flagged for
  3316			 * immediate reclaim and stall if any are encountered
  3317			 * in the nr_immediate check below.
  3318			 */
  3319			if (sc->nr.writeback && sc->nr.writeback == sc->nr.taken)
  3320				set_bit(PGDAT_WRITEBACK, &pgdat->flags);
  3321	
  3322			/* Allow kswapd to start writing pages during reclaim.*/
  3323			if (sc->nr.unqueued_dirty == sc->nr.file_taken)
  3324				set_bit(PGDAT_DIRTY, &pgdat->flags);
  3325	
  3326			/*
  3327			 * If kswapd scans pages marked for immediate
  3328			 * reclaim and under writeback (nr_immediate), it
  3329			 * implies that pages are cycling through the LRU
  3330			 * faster than they are written so forcibly stall
  3331			 * until some pages complete writeback.
  3332			 */
  3333			if (sc->nr.immediate)
  3334				reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
  3335		}
  3336	
  3337		/*
  3338		 * Tag a node/memcg as congested if all the dirty pages were marked
  3339		 * for writeback and immediate reclaim (counted in nr.congested).
  3340		 *
  3341		 * Legacy memcg will stall in page writeback so avoid forcibly
  3342		 * stalling in reclaim_throttle().
  3343		 */
  3344		if ((current_is_kswapd() ||
  3345		     (cgroup_reclaim(sc) && writeback_throttling_sane(sc))) &&
  3346		    sc->nr.dirty && sc->nr.dirty == sc->nr.congested)
  3347			set_bit(LRUVEC_CONGESTED, &target_lruvec->flags);
  3348	
  3349		/*
  3350		 * Stall direct reclaim for IO completions if the lruvec is
  3351		 * node is congested. Allow kswapd to continue until it
  3352		 * starts encountering unqueued dirty pages or cycling through
  3353		 * the LRU too quickly.
  3354		 */
  3355		if (!current_is_kswapd() && current_may_throttle() &&
  3356		    !sc->hibernation_mode &&
  3357		    test_bit(LRUVEC_CONGESTED, &target_lruvec->flags))
  3358			reclaim_throttle(pgdat, VMSCAN_THROTTLE_CONGESTED);
  3359	
  3360		if (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
  3361					    sc))
  3362			goto again;
  3363	
  3364		/*
  3365		 * Kswapd gives up on balancing particular nodes after too
  3366		 * many failures to reclaim anything from them and goes to
  3367		 * sleep. On reclaim progress, reset the failure counter. A
  3368		 * successful direct reclaim run will revive a dormant kswapd.
  3369		 */
  3370		if (reclaimable)
  3371			pgdat->kswapd_failures = 0;
  3372	}
  3373	
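[Editor's note: for context on what the marked call at line 3191 is meant
to do, the subject line implies the stats flush is moved off the
performance-sensitive reclaim path. A minimal sketch of such an
asynchronous wrapper, assuming a workqueue-based deferral; the names
stats_flush_fn and stats_flush_work are illustrative, not taken from the
patch:

        #include <linux/memcontrol.h>
        #include <linux/workqueue.h>

        /*
         * Illustrative sketch only: defer the existing synchronous flush
         * to a workqueue so shrink_node() and workingset_refault() do
         * not pay the rstat flush cost inline.
         */
        static void stats_flush_fn(struct work_struct *work)
        {
        	mem_cgroup_flush_stats();	/* the existing synchronous flush */
        }
        static DECLARE_WORK(stats_flush_work, stats_flush_fn);

        void mem_cgroup_flush_stats_async(void)
        {
        	queue_work(system_unbound_wq, &stats_flush_work);
        }
]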
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org