Message-ID: <OFC9FBB3AB.B34DAD21-ON00258789.005D2811-85258789.005DF5B3@ibm.com>
Date: Wed, 10 Nov 2021 13:06:17 -0400
From: "Harish Mara" <Harish.Mara@....com>
To: linux-kernel@...r.kernel.org
Cc: "Aniket Kulkarni" <aniket.kulkarni@...ibm.com>,
"Rajshekar Iyer" <iyerr@...ibm.com>,
"Pawan Powar" <ppowar@...ibm.com>, ask@...ux.vnet.ibm.com
Subject: [BUG: Bad page map in process XXXXX pte:8000000e72680867 pmd:ff48e6067]
An application gets bad data when it mmaps memory allocated by our kernel
device drivers.
We have kernel drivers that allocate memory using "alloc_pages_node"; the
size of each allocation is fixed at 128KB.
The pages are allocated with "GFP_KERNEL | __GFP_COMP", and the pages thus
allocated are marked as Reserved.
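For reference, the arithmetic behind a fixed 128KB allocation can be sketched
in user space. This is only a model of what the kernel's get_order() computes,
not the driver's code; the 4KB page size and all sketch_* names are
assumptions that do not appear in the report.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed base page size; 4 KiB is common but is not stated in the report. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Userspace model of the kernel's get_order(): the smallest order such
 * that (SKETCH_PAGE_SIZE << order) >= size.
 */
static unsigned int sketch_get_order(size_t size)
{
    unsigned int order = 0;
    size_t span = SKETCH_PAGE_SIZE;

    assert(size > 0);
    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Number of base pages covered by one allocation of the given order. */
static size_t sketch_pages_per_alloc(unsigned int order)
{
    return (size_t)1 << order;
}
```

Under the 4KB assumption, a 128KB request corresponds to order 5, that is,
32 physically contiguous base pages per allocation.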
The driver does multiple such allocations at the beginning. This memory is
reused for various requests the driver handles.
The drivers also create char devices. The char devices are initialized
with our own file_operations that set owner, read, poll, unlocked_ioctl,
mmap, open, release, and compat_ioctl.
The mmap handler also installs a vm_operations_struct on the
vm_area_struct and sets the VM_DONTEXPAND flag. The
vm_operations_struct.fault implementation finds the appropriate page,
increments its refcount, and sets vmf->page.
The user space processes open these device files and mmap the address
range. The size of the mmap can be a single allocation (128KB) or multiple
allocations.
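To make the lookup concrete, here is a user-space sketch of how a page
offset within the mapping could be translated to an allocation and a page
inside it. It is not the driver's actual fault handler, which is not shown
here. It assumes 4KB pages and assumes that consecutive 128KB allocations
back consecutive ranges of the mapping; all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE  4096UL            /* assumed, not stated in the report */
#define SKETCH_CHUNK_SIZE (128UL * 1024UL)  /* fixed allocation size from the report */
#define SKETCH_PAGES_PER_CHUNK (SKETCH_CHUNK_SIZE / SKETCH_PAGE_SIZE)

struct sketch_page_ref {
    size_t chunk;  /* index of the 128 KiB allocation */
    size_t page;   /* index of the base page inside that allocation */
};

/* Translate a page offset within the mapping to (allocation, page). */
static struct sketch_page_ref sketch_locate(size_t pgoff)
{
    struct sketch_page_ref ref;

    ref.chunk = pgoff / SKETCH_PAGES_PER_CHUNK;
    ref.page = pgoff % SKETCH_PAGES_PER_CHUNK;
    return ref;
}

/* Number of 128 KiB allocations needed to back a mapping of len bytes. */
static size_t sketch_chunks_for_len(size_t len)
{
    assert(len > 0);
    return (len + SKETCH_CHUNK_SIZE - 1) / SKETCH_CHUNK_SIZE;
}
```

Under these assumptions, a 2MB mapping spans 16 allocations and 512 base
pages, so the last page of the mapping is page 31 of allocation 15.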
The problem we are facing occurs when user space mmaps a size of 2MB.
Sometimes the memory that gets mapped contains garbage (incorrect data),
and we always notice "Bad page map" errors.
P.S.: To work around this issue, we removed the "__GFP_COMP" flag from the
allocation. However, this created a different problem: on some cloud
instances, we are seeing "Bad page" errors during the memory allocation
that happens in the initialization phase.