Message-ID: <0adb508f-480d-4bfc-b861-3cf42e87bee1@gmail.com>
Date: Wed, 14 Jan 2026 14:10:42 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Yuhao Jiang <danisjiang@...il.com>, Jens Axboe <axboe@...nel.dk>
Cc: io-uring@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] io_uring/rsrc: fix RLIMIT_MEMLOCK bypass via compound
 page accounting

On 1/13/26 19:44, Pavel Begunkov wrote:
> On 1/9/26 03:02, Yuhao Jiang wrote:
>> Hi Jens, Pavel, and all,
>>
>> Just a gentle follow-up on this patch below.
>> Please let me know if there are any concerns or if changes are needed.
> 
> I'm pretty sure this will break with buffer sharing / cloning. I'd
> be tempted to remove all this cross-buffer accounting logic
> and overestimate instead; the current accounting is not sane.
> Otherwise, it'll likely need some proxy object shared between
> buffers or some other overly complicated solution.

Another way would be to double account cloned buffers and then
have your patch, which combines overaccounting with the ugliness
of full buffer table walks.

>> On Wed, Dec 17, 2025 at 9:00 PM Yuhao Jiang <danisjiang@...il.com> wrote:
>>>
>>> When multiple registered buffers share the same compound page, only the
>>> first buffer accounts for the memory via io_buffer_account_pin(). The
>>> subsequent buffers skip accounting since headpage_already_acct() returns
>>> true.
>>>
>>> When the first buffer is unregistered, the accounting is decremented,
>>> but the compound page remains pinned by the remaining buffers. This
>>> creates a state where pinned memory is not properly accounted against
>>> RLIMIT_MEMLOCK.
>>>
>>> On systems with HugeTLB pages pre-allocated, an unprivileged user can
>>> exploit this to pin memory beyond RLIMIT_MEMLOCK by cycling buffer
>>> registrations. The bypass amount is proportional to the number of
>>> available huge pages, potentially allowing gigabytes of memory to be
>>> pinned while the kernel accounting shows near-zero.
>>>
>>> Fix this by recalculating the actual pages to unaccount when unmapping
>>> a buffer. For regular pages, always unaccount. For compound pages, only
>>> unaccount if no other registered buffer references the same compound
>>> page. This ensures the accounting persists until the last buffer
>>> referencing the compound page is released.
>>>
>>> Reported-by: Yuhao Jiang <danisjiang@...il.com>
>>> Fixes: 57bebf807e2a ("io_uring/rsrc: optimise registered huge pages")
> 
> That's not the right commit, the accounting is ancient, should
> get blamed somewhere around first commits that added registered
> buffers.

Turns out it came just a bit later:

commit de2939388be564836b06f0f06b3787bdedaed822
Author: Jens Axboe <axboe@...nel.dk>
Date:   Thu Sep 17 16:19:16 2020 -0600

     io_uring: improve registered buffer accounting for huge pages
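
As a rough illustration of the accounting gap described in the quoted
patch, here is a toy Python model. It is not kernel code: ToyTable,
register(), unregister(), and the 512-pages-per-huge-page constant are
all assumptions made up for this sketch, and it only mirrors the
"skip accounting if the head page is already accounted" behaviour in
the simplest possible form. With fixed=False the charge taken at
registration is dropped at unregistration even if another buffer still
pins the same compound page; with fixed=True the amount to unaccount is
recomputed at unregistration, as the patch description proposes.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: one 2 MiB huge page = 512 base pages of 4 KiB.
HUGE_PAGE_PAGES = 512


@dataclass
class ToyTable:
    """Toy model of registered-buffer accounting (hypothetical, not kernel code)."""
    fixed: bool = False
    accounted: int = 0                            # pages charged against the limit
    buffers: dict = field(default_factory=dict)   # buffer id -> (head page, charged)
    next_id: int = 0

    def _head_in_use(self, head):
        # True if any currently registered buffer references this compound page.
        return any(h == head for h, _ in self.buffers.values())

    def register(self, head):
        # Only the first buffer on a compound page is charged.
        charged = 0 if self._head_in_use(head) else HUGE_PAGE_PAGES
        self.accounted += charged
        buf = self.next_id
        self.next_id += 1
        self.buffers[buf] = (head, charged)
        return buf

    def unregister(self, buf):
        head, charged = self.buffers.pop(buf)
        if self.fixed:
            # Recompute: unaccount only when no other buffer shares the page.
            charged = 0 if self._head_in_use(head) else HUGE_PAGE_PAGES
        self.accounted -= charged

    def pinned(self):
        # Pages that are still pinned by at least one registered buffer.
        return HUGE_PAGE_PAGES * len({h for h, _ in self.buffers.values()})


def demo(fixed, nr_huge_pages=4):
    """Register two buffers per huge page, then drop the first one."""
    table = ToyTable(fixed=fixed)
    for i in range(nr_huge_pages):
        first = table.register(f"hp{i}")
        table.register(f"hp{i}")
        table.unregister(first)
    return table.accounted, table.pinned()
```

In this model demo(False) ends with every huge page still pinned but
nothing accounted, while demo(True) keeps the accounted and pinned
counts equal. The per-unregister scan in _head_in_use() corresponds to
the full buffer table walk mentioned above, and the model ignores
buffer cloning entirely.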

-- 
Pavel Begunkov

