[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dc349738-544c-34a1-748f-4e1a2c595a20@roeck-us.net>
Date: Fri, 2 Jul 2021 13:28:18 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Dennis Zhou <dennis@...nel.org>
Cc: Tejun Heo <tj@...nel.org>, Christoph Lameter <cl@...ux.com>,
Roman Gushchin <guro@...com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/4] percpu: implement partial chunk depopulation
On 7/2/21 12:45 PM, Dennis Zhou wrote:
> Hello,
>
> On Fri, Jul 02, 2021 at 12:11:40PM -0700, Guenter Roeck wrote:
>> Hi,
>>
>> On Mon, Apr 19, 2021 at 10:50:46PM +0000, Dennis Zhou wrote:
>>> From: Roman Gushchin <guro@...com>
>>>
>>> This patch implements partial depopulation of percpu chunks.
>>>
>>> As of now, a chunk can be depopulated only as a part of the final
>>> destruction, if there are no more outstanding allocations. However
>>> to minimize a memory waste it might be useful to depopulate a
>>> partially filed chunk, if a small number of outstanding allocations
>>> prevents the chunk from being fully reclaimed.
>>>
>>> This patch implements the following depopulation process: it scans
>>> over the chunk pages, looks for a range of empty and populated pages
>>> and performs the depopulation. To avoid races with new allocations,
>>> the chunk is previously isolated. After the depopulation the chunk is
>>> sidelined to a special list or freed. New allocations prefer using
>>> active chunks to sidelined chunks. If a sidelined chunk is used, it is
>>> reintegrated to the active lists.
>>>
>>> The depopulation is scheduled on the free path if the chunk is all of
>>> the following:
>>> 1) has more than 1/4 of total pages free and populated
>>> 2) the system has enough free percpu pages aside of this chunk
>>> 3) isn't the reserved chunk
>>> 4) isn't the first chunk
>>> If it's already depopulated but got free populated pages, it's a good
>>> target too. The chunk is moved to a special slot,
>>> pcpu_to_depopulate_slot, chunk->isolated is set, and the balance work
>>> item is scheduled. On isolation, these pages are removed from the
>>> pcpu_nr_empty_pop_pages. It is constantly replaced to the
>>> to_depopulate_slot when it meets these qualifications.
>>>
>>> pcpu_reclaim_populated() iterates over the to_depopulate_slot until it
>>> becomes empty. The depopulation is performed in the reverse direction to
>>> keep populated pages close to the beginning. Depopulated chunks are
>>> sidelined to preferentially avoid them for new allocations. When no
>>> active chunk can suffice a new allocation, sidelined chunks are first
>>> checked before creating a new chunk.
>>>
>>> Signed-off-by: Roman Gushchin <guro@...com>
>>> Co-developed-by: Dennis Zhou <dennis@...nel.org>
>>> Signed-off-by: Dennis Zhou <dennis@...nel.org>
>>
>> This patch results in a number of crashes and other odd behavior
>> when trying to boot mips images from Megasas controllers in qemu.
>> Sometimes the boot stalls, but I also see various crashes.
>> Some examples and bisect logs are attached.
>
> Ah, this doesn't look good.. Do you have a reproducer I could use to
> debug this?
>
I copied the relevant information to http://server.roeck-us.net/qemu/mips/.
run.sh - qemu command (I tried with qemu 6.0 and 4.2.1)
rootfs.ext2 - root file system
config - complete configuration
defconfig - shortened configuration
vmlinux - a crashing kernel image (v5.13-7637-g3dbdb38e2869, with above configuration)
Interestingly, the crash doesn't always happen at the same location, even
with the same image. Some memory corruption, maybe ?
Hope this helps. Please let me know if I can provide anything else.
Thanks,
Guenter
Powered by blists - more mailing lists