Message-ID: <AANLkTin38qJ-U3B7XwMh-3aR9zRs21LgR1yHfqYifxrn@mail.gmail.com>
Date: Tue, 19 Oct 2010 10:15:06 +0900
From: Minchan Kim <minchan.kim@...il.com>
To: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Neil Brown <neilb@...e.de>,
Wu Fengguang <fengguang.wu@...el.com>,
Rik van Riel <riel@...hat.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"Li, Shaohua" <shaohua.li@...el.com>
Subject: Re: Deadlock possibly caused by too_many_isolated.
On Tue, Oct 19, 2010 at 9:57 AM, KOSAKI Motohiro
<kosaki.motohiro@...fujitsu.com> wrote:
>> > I think there are two bugs here.
>> > The raid1 bug that Torsten mentions is certainly real (and has been around
>> > for an embarrassingly long time).
>> > The bug that I identified in too_many_isolated is also a real bug and can be
>> > triggered without md/raid1 in the mix.
>> > So this is not a 'full fix' for every bug in the kernel :-), but it could
>> > well be a full fix for this particular bug.
>> >
>>
>> Can we just delete the too_many_isolated() logic? (Crappy comment
>> describes what the code does but not why it does it).
>
> If my memory is correct, we got a bug report about 1-2 years ago that LTP could
> trigger mysterious OOM killer invocations: if too many processes are in the
> reclaim path, all of the reclaimable pages can be isolated, and the last
> reclaimer then finds that the system has no reclaimable pages left, which leads
> it to invoke the OOM killer. We had a strong motivation to avoid such false
> positive OOMs, and some discussion at the time produced this patch.
>
> If my memory is incorrect, I hope Wu or Rik will correct me.
AFAIR, that's right.
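(Roughly, for illustration only: a simplified userspace sketch of the scenario,
not kernel code. The counter values and the batch size are made up; the point is
the counter arithmetic that the isolated > inactive guard is protecting against.)

/*
 * Simplified model of the false-positive OOM: each direct reclaimer
 * isolates a batch of pages from the inactive LRU, so with enough
 * parallel reclaimers the LRU drains and the last one would see
 * "nothing reclaimable" even though the isolated pages will come back.
 * The guard (isolated > inactive) makes late arrivals back off instead.
 */
#include <stdio.h>

#define BATCH 32	/* pages per reclaimer, cf. SWAP_CLUSTER_MAX */

int main(void)
{
	unsigned long inactive = 128;	/* pages still on the inactive LRU */
	unsigned long isolated = 0;	/* pages pulled off by reclaimers */
	unsigned long take;
	int r;

	for (r = 1; r <= 6; r++) {
		if (isolated > inactive) {	/* the guard */
			printf("reclaimer %d: too many isolated, back off\n", r);
			continue;
		}
		if (inactive == 0) {		/* what the guard prevents */
			printf("reclaimer %d: nothing reclaimable -> false OOM\n", r);
			continue;
		}
		take = inactive < BATCH ? inactive : BATCH;
		inactive -= take;
		isolated += take;
		printf("reclaimer %d: isolated %lu pages (%lu left on LRU)\n",
		       r, take, inactive);
	}
	return 0;
}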
How about this?
It is more aggressive throttling than the old code (i.e. it works at zone
granularity rather than per LRU type), but I think it can prevent the
unnecessary OOM problem and solve the deadlock problem; a rough sketch of
the resulting slow-path behaviour follows the patch.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f12ad18..acd6a65 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1961,6 +1961,21 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
return alloc_flags;
}
+/*
+ * Are there way too many processes reclaiming this zone?
+ */
+static int too_many_isolated_zone(struct zone *zone)
+{
+ unsigned long inactive, isolated;
+
+ inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
+ zone_page_state(zone, NR_INACTIVE_ANON);
+ isolated = zone_page_state(zone, NR_ISOLATED_FILE) +
+ zone_page_state(zone, NR_ISOLATED_ANON);
+
+ return isolated > inactive;
+}
+
static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
struct zonelist *zonelist, enum zone_type high_zoneidx,
@@ -2054,10 +2069,11 @@ rebalance:
goto got_pg;
/*
- * If we failed to make any progress reclaiming, then we are
- * running out of options and have to consider going OOM
+ * If we failed to make any progress reclaiming and there aren't
+ * too many parallel reclaimers, then we are running out of options
+ * and have to consider going OOM
*/
- if (!did_some_progress) {
+ if (!did_some_progress && !too_many_isolated_zone(preferred_zone)) {
if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
if (oom_killer_disabled)
goto nopage;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c5dfabf..f2109af 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1129,31 +1129,6 @@ int isolate_lru_page(struct page *page)
}
/*
- * Are there way too many processes in the direct reclaim path already?
- */
-static int too_many_isolated(struct zone *zone, int file,
- struct scan_control *sc)
-{
- unsigned long inactive, isolated;
-
- if (current_is_kswapd())
- return 0;
-
- if (!scanning_global_lru(sc))
- return 0;
-
- if (file) {
- inactive = zone_page_state(zone, NR_INACTIVE_FILE);
- isolated = zone_page_state(zone, NR_ISOLATED_FILE);
- } else {
- inactive = zone_page_state(zone, NR_INACTIVE_ANON);
- isolated = zone_page_state(zone, NR_ISOLATED_ANON);
- }
-
- return isolated > inactive;
-}
-
-/*
* TODO: Try merging with migrations version of putback_lru_pages
*/
static noinline_for_stack void
@@ -1290,15 +1265,6 @@ shrink_inactive_list(unsigned long nr_to_scan,
struct zone *zone,
unsigned long nr_anon;
unsigned long nr_file;
- while (unlikely(too_many_isolated(zone, file, sc))) {
- congestion_wait(BLK_RW_ASYNC, HZ/10);
-
- /* We are about to die and free our memory. Return now. */
- if (fatal_signal_pending(current))
- return SWAP_CLUSTER_MAX;
- }
-
-
lru_add_drain();
spin_lock_irq(&zone->lru_lock);
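
For what it's worth, here is the rough sketch mentioned above: a simplified
userspace model of the resulting slow-path decision, not the real
__alloc_pages_slowpath (the struct and helper names just mirror the patch).
Reclaimers no longer sleep in congestion_wait(), so the blocking that led to
the deadlock should go away, and the OOM kill is simply deferred while the
preferred zone still has more isolated than inactive pages.

#include <stdbool.h>
#include <stdio.h>

/* stand-in for the per-zone NR_INACTIVE and NR_ISOLATED vmstat sums */
struct zone_counters {
	unsigned long inactive;
	unsigned long isolated;
};

static bool too_many_isolated_zone(const struct zone_counters *z)
{
	return z->isolated > z->inactive;
}

/* the decision point in the slow path, after direct reclaim has returned */
static const char *oom_decision(bool did_some_progress,
				const struct zone_counters *z)
{
	if (did_some_progress)
		return "made progress: retry the allocation";
	if (too_many_isolated_zone(z))
		return "no progress, but others hold isolated pages: retry, no OOM";
	return "no progress and nobody is mid-reclaim: consider the OOM killer";
}

int main(void)
{
	struct zone_counters busy = { .inactive = 10,  .isolated = 500 };
	struct zone_counters idle = { .inactive = 500, .isolated = 0 };

	printf("%s\n", oom_decision(false, &busy));	/* defer the OOM kill */
	printf("%s\n", oom_decision(false, &idle));	/* genuine OOM candidate */
	return 0;
}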
--
Kind regards,
Minchan Kim