Date:	Mon, 23 May 2011 18:42:25 +0200
From:	Andrea Arcangeli <aarcange@...hat.com>
To:	Minchan Kim <minchan.kim@...il.com>
Cc:	Andrew Lutomirski <luto@....edu>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	fengguang.wu@...el.com, andi@...stfloor.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, mgorman@...e.de, hannes@...xchg.org,
	riel@...hat.com
Subject: Re: Kernel falls apart under light memory pressure (i.e. linking
 vmlinux)

On Mon, May 23, 2011 at 08:12:50AM +0900, Minchan Kim wrote:
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 292582c..1663d24 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -231,8 +231,11 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>        if (scanned == 0)
>                scanned = SWAP_CLUSTER_MAX;
> 
> -       if (!down_read_trylock(&shrinker_rwsem))
> -               return 1;       /* Assume we'll be able to shrink next time */
> +       if (!down_read_trylock(&shrinker_rwsem)) {
> +               /* Assume we'll be able to shrink next time */
> +               ret = 1;
> +               goto out;
> +       }

It looks cleaner to return -1 here to differentiate the failure to
take the lock from the case where we take the lock and just 1 object
is freed. Callers seem to be ok with -1 already, and it's more
intuitive for the "while (nr > 10)" loops too (those loops could be
changed to "while (nr > 0)" if all shrinkers are accurate and not
doing something inaccurate like the above code did; I haven't checked
the shrinker retvals yet).
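
To make the caller side concrete, here is a minimal sketch of the
kind of "while (nr > 10)" loop I mean (the variable names and the
1000/1000 arguments are from memory, not a verbatim quote):

	struct shrink_control shrink = { .gfp_mask = GFP_KERNEL };
	int nr_objects;

	do {
		nr_objects = shrink_slab(&shrink, 1000, 1000);
	} while (nr_objects > 10);

A -1 return stored in a signed count falls out of that loop
immediately, which is what we want when the lock is contended.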

>        up_read(&shrinker_rwsem);
> +out:
> +       cond_resched();
>        return ret;
>  }

If we enter the loop, some of the shrinkers will reschedule, but it
looks good for the last iteration, which may still have run for some
time before returning. The actual failure to take shrinker_rwsem
seems only theoretical though (it's fine to cover it with the
cond_resched too, but in practice this should matter more for the
case where shrinker_rwsem doesn't fail).
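
Spelling it out, the function tail with your patch plus the -1
suggestion folded in would look roughly like this (untested sketch,
not a patch), so both exit paths go through the same cond_resched():

	if (!down_read_trylock(&shrinker_rwsem)) {
		/* lock contended: report failure instead of a fake count */
		ret = -1;
		goto out;
	}
	...
	up_read(&shrinker_rwsem);
out:
	cond_resched();
	return ret;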

> @@ -2331,7 +2336,7 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
>         * must be balanced
>         */
>        if (order)
> -               return pgdat_balanced(pgdat, balanced, classzone_idx);
> +               return !pgdat_balanced(pgdat, balanced, classzone_idx);
>        else
>                return !all_zones_ok;
>  }

I now wonder if this is why compaction in kswapd didn't work out well
and kswapd would spin at 100% load so much when compaction was added,
plus with kswapd-compaction patch I think this code should be changed
to:

 if (!COMPACTION_BUILD && order)
  return !pgdat_balanced(pgdat, balanced, classzone_idx);
 else
  return !all_zones_ok;

(but only with kswapd-compaction)

I should probably give kswapd-compaction another spin after fixing
this, because with compaction kswapd should be super successful at
satisfying zone_watermark_ok_safe(zone, _order_...) in the
sleeping_prematurely high watermark check, leading to pgdat_balanced
returning true most of the time (which would make kswapd spin like crazy
instead of stopping as it was supposed to). Mel, do you also think
it's worth another try with a fixed sleeping_prematurely like above?

Another thing: I'm not excited about the schedule_timeout(HZ/10) in
kswapd_try_to_sleep(); it seems to be all for the statistics.
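
For reference, the flow I'm referring to is roughly this (paraphrased
from memory of kswapd_try_to_sleep(), not a verbatim quote; the
comments are mine):

	prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);

	/*
	 * Short HZ/10 nap first, seemingly mostly so the premature-sleep
	 * statistics get accounted, then re-check before sleeping for real.
	 */
	if (!sleeping_prematurely(pgdat, order, remaining, classzone_idx)) {
		remaining = schedule_timeout(HZ/10);
		finish_wait(&pgdat->kswapd_wait, &wait);
		prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
	}

	if (!sleeping_prematurely(pgdat, order, remaining, classzone_idx))
		schedule();

	finish_wait(&pgdat->kswapd_wait, &wait);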

Thanks,
Andrea
