Message-ID: <54B95E41.5010305@suse.cz>
Date:	Fri, 16 Jan 2015 19:53:53 +0100
From:	Vlastimil Babka <vbabka@...e.cz>
To:	"Michael S. Tsirkin" <mst@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>
CC:	linux-kernel@...r.kernel.org, Johannes Weiner <hannes@...xchg.org>,
	Vladimir Davydov <vdavydov@...allels.com>,
	Rik van Riel <riel@...hat.com>,
	Michal Hocko <mhocko@...e.cz>, Mel Gorman <mgorman@...e.de>,
	Suleiman Souhlal <suleiman@...gle.com>, linux-mm@...ck.org
Subject: Re: [PATCH] mm/vmscan: fix highidx argument type

On 01/16/2015 08:07 AM, Michael S. Tsirkin wrote:
> On Thu, Jan 15, 2015 at 02:49:20PM -0800, Andrew Morton wrote:
>> On Fri, 16 Jan 2015 00:18:12 +0200 "Michael S. Tsirkin" <mst@...hat.com> wrote:
>> 
>> > for_each_zone_zonelist_nodemask wants an enum zone_type
>> > argument, but is passed gfp_t:
>> > 
>> > mm/vmscan.c:2658:9:    expected int enum zone_type [signed] highest_zoneidx
>> > mm/vmscan.c:2658:9:    got restricted gfp_t [usertype] gfp_mask
>> > mm/vmscan.c:2658:9: warning: incorrect type in argument 2 (different base types)
>> > mm/vmscan.c:2658:9:    expected int enum zone_type [signed] highest_zoneidx
>> > mm/vmscan.c:2658:9:    got restricted gfp_t [usertype] gfp_mask
>> 
>> Which tool emitted these warnings?
> 
> Oh, sorry.
> It's sparse.
> 
>> > convert argument to the correct type.
>> > 
>> > ...
>> >
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -2656,7 +2656,7 @@ static bool throttle_direct_reclaim(gfp_t gfp_mask, struct zonelist *zonelist,
>> >  	 * should make reasonable progress.
>> >  	 */
>> >  	for_each_zone_zonelist_nodemask(zone, z, zonelist,
>> > -					gfp_mask, nodemask) {
>> > +					gfp_zone(gfp_mask), nodemask) {
>> >  		if (zone_idx(zone) > ZONE_NORMAL)
>> >  			continue;
>> 
>> hm, I wonder what the runtime effects are.

So this was introduced by 675becce15f "mm: vmscan: do not throttle based on
pfmemalloc reserves if node has no ZONE_NORMAL" in 3.15. AFAICS gfp_mask >=
gfp_zone(gfp_mask), so the high_zoneidx will be higher than it should, and
next_zones_zonelist() won't filter the higher-than-wanted zones as it should.
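
To illustrate the filtering problem, here is a minimal user-space sketch. It is
a toy model and not kernel code: the TOY_GFP_* values, toy_gfp_zone() and
toy_zones_visited() are hypothetical names made up for this example, and only
the zone ordering follows enum zone_type. It compares a zonelist walk capped by
the zone index derived from the mask against one capped by the raw mask value.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, NOT kernel code: zone order mirrors enum zone_type, but the
 * flag values and helpers below are simplified assumptions. */
enum zone_type {
	ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE,
	MAX_NR_ZONES
};

/* Hypothetical zone-modifier bits plus one non-zone bit. */
#define TOY_GFP_DMA     0x01u
#define TOY_GFP_HIGHMEM 0x02u
#define TOY_GFP_DMA32   0x04u
#define TOY_GFP_WAIT    0x10u	/* stands in for any non-zone flag */

/* Simplified stand-in for gfp_zone(): highest zone the mask allows. */
static enum zone_type toy_gfp_zone(unsigned int gfp_mask)
{
	if (gfp_mask & TOY_GFP_DMA)
		return ZONE_DMA;
	if (gfp_mask & TOY_GFP_DMA32)
		return ZONE_DMA32;
	if (gfp_mask & TOY_GFP_HIGHMEM)
		return ZONE_HIGHMEM;
	return ZONE_NORMAL;
}

/* Count the zonelist entries a walk capped at highest_zoneidx would visit. */
static size_t toy_zones_visited(const enum zone_type *zonelist, size_t n,
				unsigned int highest_zoneidx)
{
	size_t visited = 0;

	for (size_t i = 0; i < n; i++) {
		assert(zonelist[i] < MAX_NR_ZONES);
		if ((unsigned int)zonelist[i] <= highest_zoneidx)
			visited++;
	}
	return visited;
}
```

With a zonelist of { ZONE_NORMAL, ZONE_DMA32, ZONE_DMA } and a mask of
TOY_GFP_DMA32 | TOY_GFP_WAIT, capping by toy_gfp_zone(mask) visits only the two
lower zones, while capping by the raw mask visits all three, ZONE_NORMAL
included.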

I guess the runtime effect is that allocations for zone_type < NORMAL, i.e.
DMA32 or DMA, can now wrongly choose a NUMA node without such zones, for
checking pfmemalloc reserves and throttling. Which means the throttling can be
ineffective, or it could also throttle without actually needing to, if the wrong
zone has lower reserves? Mel?

>> The throttle_direct_reclaim() comment isn't really accurate, is it? 
>> "Throttle direct reclaimers if backing storage is backed by the
>> network".  The code is applicable to all types of backing, but was
>> added to address problems which are mainly observed with network
>> backing?

I guess. I also don't see any code restricting this just for network.

> 
> 
> As far as I can tell, yes. It would seem that it can cause
> deadlocks in theory.  Cc stable on the grounds that it's obvious?

I don't think this mistake can introduce deadlocks on its own, but it also won't
prevent any problems that the throttling was supposed to prevent.
I agree it should go stable.

BTW, I wonder if the whole code couldn't be much simpler by capping high_zoneidx
by ZONE_NORMAL before traversing the zonelist, like this:

int high_zoneidx = min(gfp_zone(gfp_mask), ZONE_NORMAL);

first_zones_zonelist(zonelist, high_zoneidx, NULL, &zone);
pgdat = zone->zone_pgdat;

if (!pgdat || pfmemalloc_watermark_ok(pgdat))
	goto out;
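
The capping step alone can be checked in user space with a small sketch like the
one below. Again this is a toy model under stated assumptions:
toy_cap_at_normal() is a hypothetical helper that stands in for the min() above,
and only the zone ordering is taken from enum zone_type.

```c
#include <assert.h>

/* Toy model, NOT kernel code: only the zone ordering follows enum zone_type. */
enum zone_type { ZONE_DMA, ZONE_DMA32, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE };

/* Hypothetical helper: cap the requested zone index at ZONE_NORMAL. */
static enum zone_type toy_cap_at_normal(enum zone_type requested)
{
	/* Sanity-check that the input is a valid zone index. */
	assert(requested <= ZONE_MOVABLE);
	return requested < ZONE_NORMAL ? requested : ZONE_NORMAL;
}
```

A ZONE_HIGHMEM or ZONE_MOVABLE request is capped to ZONE_NORMAL, while a
ZONE_DMA32 or ZONE_DMA request passes through unchanged, so the walk never has
to skip zones above ZONE_NORMAL.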

