Date:   Wed, 10 Nov 2021 13:06:17 -0400
From:   "Harish Mara" <Harish.Mara@....com>
To:     linux-kernel@...r.kernel.org
Cc:     "Aniket Kulkarni" <aniket.kulkarni@...ibm.com>,
        "Rajshekar Iyer" <iyerr@...ibm.com>,
        "Pawan Powar" <ppowar@...ibm.com>, ask@...ux.vnet.ibm.com
Subject: [BUG: Bad page map in process XXXXX  pte:8000000e72680867 pmd:ff48e6067]
 Application is getting bad data when trying to mmap memory allocated by
 kernel device drivers

We have kernel drivers that allocate memory using alloc_pages_node(); the 
size of each allocation is fixed at 128KB.
The pages are allocated with GFP_KERNEL | __GFP_COMP, and the pages thus 
allocated are marked as Reserved.
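
For reference, the arithmetic behind the allocation order can be sketched 
in plain user-space C. This is only an illustration: order_for_size() is a 
hypothetical helper that mirrors what the kernel's get_order() computes, 
and the 4KB page size in the note below is an assumption about the 
platform, not something stated above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: smallest order such that
 * (page_size << order) >= bytes. The page size is passed in
 * explicitly because this sketch runs outside the kernel.
 */
unsigned int order_for_size(size_t bytes, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    assert(bytes != 0 && page_size != 0);

    /* Double the span until it covers the requested size. */
    while (span < bytes) {
        span <<= 1;
        order++;
    }
    return order;
}
```

Assuming 4KB pages, a 128KB allocation is order 5 (32 pages), and a 2MB 
mapping covers 512 pages, i.e. 16 such allocations.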
The driver does multiple such allocations at the beginning. This memory is 
reused for various requests the driver handles.
The drivers also create char devices. The char devices are initialized 
with our own file_operations that set owner, read, poll, unlocked_ioctl, 
mmap, open, release, and compat_ioctl.
The mmap handler also registers a vm_operations_struct on the 
vm_area_struct and sets the VM_DONTEXPAND flag. The 
vm_operations_struct.fault implementation finds the appropriate page, 
increments its refcount, and sets vmf->page.
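
To make the lookup step concrete, here is a user-space sketch of the index 
arithmetic such a fault handler would need. It is not the driver's code: 
locate_fault_page() and struct chunk_pos are hypothetical names, and the 
sketch assumes the 128KB allocations appear back to back in the mapping, 
in order, with the fault's page offset counted from the start of the 
mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Position of a faulting page: which allocation, and which page in it. */
struct chunk_pos {
    size_t chunk;  /* index of the fixed-size allocation */
    size_t page;   /* page index inside that allocation */
};

/*
 * Hypothetical helper: translate a page offset within the mapping into
 * (allocation index, page index), assuming the allocations are mapped
 * contiguously and in order.
 */
struct chunk_pos locate_fault_page(size_t pgoff, size_t chunk_bytes,
                                   size_t page_size)
{
    struct chunk_pos pos;
    size_t pages_per_chunk;

    assert(page_size != 0);
    assert(chunk_bytes >= page_size && chunk_bytes % page_size == 0);

    pages_per_chunk = chunk_bytes / page_size;
    pos.chunk = pgoff / pages_per_chunk;
    pos.page = pgoff % pages_per_chunk;
    return pos;
}
```

With 4KB pages (an assumption), page offset 32 of the mapping is page 0 of 
the second allocation, and the last page of a 2MB mapping (offset 511) is 
page 31 of the sixteenth. In a compound allocation, any page index other 
than 0 refers to a tail page of that allocation.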
The user space processes open these device files and mmap the address 
range. The size of the mmap could be a single allocation (128KB) or 
multiple allocations.

The problem we are facing occurs when user space mmaps a size of 2MB. 
Sometimes the memory that gets mapped is garbage (not correct), and we 
always notice "Bad page map" errors. 

P.S.: To solve this issue, we removed the __GFP_COMP flag from the 
allocation. However, this created a different problem: on some cloud 
instances, we are seeing "Bad page" errors during the memory allocation 
that happens at the initialization phase.
