lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4931B5B1.8030601@redhat.com>
Date:	Sat, 29 Nov 2008 16:35:45 -0500
From:	Rik van Riel <riel@...hat.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Subject: Re: [PATCH] vmscan: skip freeing memory from zones with lots free

Andrew Morton wrote:
> On Sat, 29 Nov 2008 13:59:21 -0500 Rik van Riel <riel@...hat.com> wrote:
> 
>> Andrew Morton wrote:
>>> On Sat, 29 Nov 2008 13:41:34 -0500 Rik van Riel <riel@...hat.com> wrote:
>>>
>>>> Andrew Morton wrote:
>>>>> On Sat, 29 Nov 2008 12:58:32 -0500 Rik van Riel <riel@...hat.com> wrote:
>>>>>
>>>>>>> Will this new patch reintroduce the problem which
>>>>>>> 26e4931632352e3c95a61edac22d12ebb72038fe fixed?
>>>> No, that problem is already taken care of by the fact that
>>>> active pages always get deactivated in the current VM,
>>>> regardless of whether or not they were referenced.
>>> err, sorry, that was the wrong commit. 
>>> 26e4931632352e3c95a61edac22d12ebb72038fe _introduced_ the problem, as
>>> predicted in the changelog.
>>>
>>> 265b2b8cac1774f5f30c88e0ab8d0bcf794ef7b3 later fixed it up.
>> The patch I sent in this thread does not do any baling out,
>> it only skips zones where the number of free pages is more
>> than 4 times zone->pages_high.
> 
> But that will have the same effect as baling out.  Moreso, in fact.

Kswapd already does the same in balance_pgdat.

Unequal pressure is sometimes desired, because allocation
pressure is not equal between zones.  Having lots of
lowmem allocations should not lead to gigabytes of swapped
out highmem.  A numactl pinned application should not cause
memory on other NUMA nodes to be swapped out.

Equal pressure between the zones makes sense when allocation
pressure is similar.

When allocation pressure is different, we have a choice
between evicting potentially useful data from memory or
applying uneven pressure on zones.

>> Equal pressure is still applied to the other zones.
>>
>> This should not be a problem since we do not enter direct
>> reclaim unless the free pages in every zone in our zonelist
>> are below zone->pages_low.
>>
>> Zone skipping is only done by tasks that have been in the
>> direct reclaim code for a long time.
> 
>>>From 265b2b8cac1774f5f30c88e0ab8d0bcf794ef7b3:
> 
>     We currently have a problem with the balancing of reclaim
>     between zones: much more reclaim happens against highmem than
>     against lowmem.
> 
> This problem will be reintroduced, will it not?

We already have that behaviour in balance_pgdat().

We do not do any reclaim on zones higher than the first
zone where the zone_watermark_ok call returns true:

            if (!zone_watermark_ok(zone, order, zone->pages_high,
                                                0, 0)) {
                      end_zone = i;
                      break;
            }

Further down in balance_pgdat(), we skip reclaiming from zones
that have way too much memory free.

          /*
           * We put equal pressure on every zone, unless one
           * zone has way too many pages free already.
           */
          if (!zone_watermark_ok(zone, order, 8*zone->pages_high,
                                                 end_zone, 0))
                   shrink_zone(priority, zone, &sc);

All my patch does is add one of these sanity checks to the
direct reclaim path.

-- 
All rights reversed.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ