linux-kernel - Re: dm bufio: Reduce dm_bufio

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.LRH.2.02.1806220845190.8072@file01.intranet.prod.int.rdu2.redhat.com>
Date:   Fri, 22 Jun 2018 08:52:09 -0400 (EDT)
From:   Mikulas Patocka <mpatocka@...hat.com>
To:     Michal Hocko <mhocko@...nel.org>
cc:     jing xia <jing.xia.mail@...il.com>,
        Mike Snitzer <snitzer@...hat.com>, agk@...hat.com,
        dm-devel@...hat.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: dm bufio: Reduce dm_bufio_lock contention



On Fri, 22 Jun 2018, Michal Hocko wrote:

> On Fri 22-06-18 11:01:51, Michal Hocko wrote:
> > On Thu 21-06-18 21:17:24, Mikulas Patocka wrote:
> [...]
> > > What about this patch? If __GFP_NORETRY and __GFP_FS is not set (i.e. the 
> > > request comes from a block device driver or a filesystem), we should not 
> > > sleep.
> > 
> > Why? How are you going to audit all the callers that the behavior makes
> > sense and moreover how are you going to ensure that future usage will
> > still make sense. The more subtle side effects gfp flags have the harder
> > they are to maintain.
> 
> So just as an excercise. Try to explain the above semantic to users. We
> currently have the following.
> 
>  * __GFP_NORETRY: The VM implementation will try only very lightweight
>  *   memory direct reclaim to get some memory under memory pressure (thus
>  *   it can sleep). It will avoid disruptive actions like OOM killer. The
>  *   caller must handle the failure which is quite likely to happen under
>  *   heavy memory pressure. The flag is suitable when failure can easily be
>  *   handled at small cost, such as reduced throughput
> 
>  * __GFP_FS can call down to the low-level FS. Clearing the flag avoids the
>  *   allocator recursing into the filesystem which might already be holding
>  *   locks.
> 
> So how are you going to explain gfp & (__GFP_NORETRY | ~__GFP_FS)? What
> is the actual semantic without explaining the whole reclaim or force
> users to look into the code to understand that? What about GFP_NOIO |
> __GFP_NORETRY? What does it mean to that "should not sleep". Do all
> shrinkers have to follow that as well?

My reasoning was that there is broken code that uses __GFP_NORETRY and 
assumes that it can't fail - so conditioning the change on !__GFP_FS would 
minimize the diruption to the broken code.

Anyway - if you want to test only on __GFP_NORETRY (and fix those 16 
broken cases that assume that __GFP_NORETRY can't fail), I'm OK with that.

Mikulas


Index: linux-2.6/mm/vmscan.c
===================================================================
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -2674,6 +2674,7 @@ static bool shrink_node(pg_data_t *pgdat
 		 * the LRU too quickly.
 		 */
 		if (!sc->hibernation_mode && !current_is_kswapd() &&
+		   !(sc->gfp_mask & __GFP_NORETRY) &&
 		   current_may_throttle() && pgdat_memcg_congested(pgdat, root))
 			wait_iff_congested(BLK_RW_ASYNC, HZ/10);