Message-ID: <8dca5a34-2c5c-bc49-b2ad-f3e5e0fdbba3@redhat.com>
Date: Fri, 3 Sep 2021 11:31:01 +0200
From: David Hildenbrand <david@...hat.com>
To: "George G. Davis" <george_davis@...tor.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
"open list:MEMORY MANAGEMENT" <linux-mm@...ck.org>,
open list <linux-kernel@...r.kernel.org>,
Eugeniu Rosca <erosca@...adit-jv.com>,
"George G. Davis" <davis.george@...mens.com>
Subject: Re: [RFC][PATCH] mm/page_isolation: tracing: trace all
test_pages_isolated failures
On 03.09.21 00:21, George G. Davis wrote:
> On Tue, Aug 31, 2021 at 04:53:31PM +0200, David Hildenbrand wrote:
>> On 23.08.21 22:28, George G. Davis wrote:
>>> From: "George G. Davis" <davis.george@...mens.com>
>>>
>>> Some test_pages_isolated failure conditions don't include trace points.
>>> For debugging issues caused by "pinned" pages, make sure to trace all
>>> calls whether they succeed or fail. In this case, a failure case did not
>>> result in a trace point. So add the missing failure case in
>>> test_pages_isolated traces.
>>
>> In which setups did you actually run into these cases?
>
> Good question!
>
> Although I'm not 100% certain that this specific failure condition has
> occurred in my recent testing, I'm able to reproduce cma_alloc -EBUSY
> failure conditions when testing latest/recent master on arm64 based
> Renesas R-Car Starter Kit [1] using defconfig with
> CONFIG_CMA_SIZE_MBYTES=384 while running the following test case:

Okay, I think you are not hitting the path you touched in this patch,
because I assume it will never ever really trigger ...

>
> trace-cmd record -N 192.168.1.87:12345 -b 4096 -e cma -e page_isolation -e compaction -e migrate &
> sleep 10
> while true; do a=$(( ( RANDOM % 10000 ) + 1 )); echo $a > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $a; echo $a > /sys/kernel/debug/cma/cma-reserved/free); done &
> while true; do b=$(( ( RANDOM % 10000 ) + 1 )); echo $b > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $b; echo $b > /sys/kernel/debug/cma/cma-reserved/free); done &
> while true; do c=$(( ( RANDOM % 10000 ) + 1 )); echo $c > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $c; echo $c > /sys/kernel/debug/cma/cma-reserved/free); done &
> while true; do d=$(( ( RANDOM % 10000 ) + 1 )); echo $d > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $d; echo $d > /sys/kernel/debug/cma/cma-reserved/free); done &
> while true; do e=$(( ( RANDOM % 10000 ) + 1 )); echo $e > /sys/kernel/debug/cma/cma-reserved/alloc && (usleep $e; echo $e > /sys/kernel/debug/cma/cma-reserved/free); done &
> /selftests/vm/transhuge-stress &
>
> The cma_alloc -EBUSY failures are caused by THP compound pages allocated
> from the CMA region where migration does not seem to work for compound
> THP pages. The workaround is to disable CONFIG_TRANSPARENT_HUGEPAGE
> since it seems incompatible with the intended use of the CMA region.

Oh, that sounds broken, THP should not block CMA allocation or page
migration for other purposes.

a) Are these temporary or permanent allocation errors? If they are
permanent, they will also break memory unplug.

b) Did you reproduce on other architectures as well?

c) Did it use to work but is now broken? IOW, did you try bisecting?

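For (a), one rough way to tell the two cases apart might be to retry a failed
allocation a few times through the same debugfs files used in the test case
above. This is only a sketch: the helper names (try_alloc, classify_alloc) and
the RETRIES/DELAY values are made up for illustration, and it assumes that a
failed cma_alloc shows up as a failed write to the "alloc" file, as the
"echo ... && ..." pattern in the test case suggests.

```shell
#!/bin/sh
# Sketch only: classify a CMA debugfs allocation failure as temporary or
# persistent by retrying it. CMA_DIR defaults to the path from the test case;
# RETRIES and DELAY are illustrative values, not recommendations.
CMA_DIR=${CMA_DIR:-/sys/kernel/debug/cma/cma-reserved}
RETRIES=${RETRIES:-5}
DELAY=${DELAY:-1}

# try_alloc PAGES: succeeds if writing PAGES to the "alloc" file succeeds.
# (Hypothetical helper; error output is suppressed on purpose.)
try_alloc() {
    { echo "$1" > "$CMA_DIR/alloc"; } 2>/dev/null
}

# classify_alloc PAGES: prints "ok" if the first attempt succeeds,
# "temporary" if a later retry succeeds, "persistent" if every attempt fails.
classify_alloc() {
    pages=$1
    attempt=0
    while [ "$attempt" -le "$RETRIES" ]; do
        if try_alloc "$pages"; then
            # Give the pages back so the probe does not shrink the region.
            echo "$pages" > "$CMA_DIR/free"
            if [ "$attempt" -eq 0 ]; then echo ok; else echo temporary; fi
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$DELAY"
    done
    echo persistent
    return 1
}
```

Something like "classify_alloc 1024" run while transhuge-stress is active would
then print one of the three words. "persistent" here only means "still failing
after RETRIES retries", so it is a hint and not proof of a permanent failure.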
--
Thanks,
David / dhildenb