linux-kernel - Re: [RFC] Order 4 allocation failures in the MEI client driver

Message-Id: <25e56199-7af1-4235-8973-cbc351325b8c@app.fastmail.com>
Date: Wed, 21 Aug 2024 11:17:54 +0000
From: "Arnd Bergmann" <arnd@...db.de>
To: "Rohit Agarwal" <rohiagar@...omium.org>,
 "Tomas Winkler" <tomas.winkler@...el.com>,
 "Greg Kroah-Hartman" <gregkh@...uxfoundation.org>
Cc: linux-kernel@...r.kernel.org
Subject: Re: [RFC] Order 4 allocation failures in the MEI client driver

On Wed, Aug 21, 2024, at 05:20, Rohit Agarwal wrote:
> On 19/08/24 6:45 PM, Arnd Bergmann wrote:
>> On Tue, Aug 13, 2024, at 10:45, Rohit Agarwal wrote:
>> 
>> What is the call chain you see in the kernel messages? Is it
>> always the same?
> Yes, the call stack is the same every time. This is the call stack:
>
> <4>[ 2019.101352] dump_stack_lvl+0x69/0xa0
> <4>[ 2019.101359] warn_alloc+0x10d/0x180
> <4>[ 2019.101363] __alloc_pages_slowpath+0xe3d/0xe80
> <4>[ 2019.101366] __alloc_pages+0x22f/0x2b0
> <4>[ 2019.101369] __kmalloc_large_node+0x9d/0x120
> <4>[ 2019.101373] ? mei_cl_alloc_cb+0x34/0xa0
> <4>[ 2019.101377] ? mei_cl_alloc_cb+0x74/0xa0
> <4>[ 2019.101379] __kmalloc+0x86/0x130
> <4>[ 2019.101382] mei_cl_alloc_cb+0x74/0xa0
> <4>[ 2019.101385] mei_cl_enqueue_ctrl_wr_cb+0x38/0x90

Ok, so this might be a result of mei_cl_enqueue_ctrl_wr_cb()
doing

        /* for RX always allocate at least client's mtu */
        if (length)
                length = max_t(size_t, length, mei_cl_mtu(cl));

which was added in 3030dc056459 ("mei: add wrapper for queuing
control commands."). All the callers seem to be passing a short
"length" of just a few bytes, but this would always extend it to
the value returned by mei_cl_mtu(), which is
cl->me_cl->props.max_msg_length.

I'm not sure where that value is set.
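
A minimal userspace sketch of that effect, not the driver code:
rx_buf_len() and its explicit "mtu" parameter are hypothetical
stand-ins for the check above and for mei_cl_mtu(cl), and the 64KB
MTU used in the note below is only an assumption taken from the
allocation size discussed in this thread.

```c
#include <stddef.h>

/*
 * Hypothetical helper mirroring the check quoted above: a non-zero
 * request is raised to at least the client's MTU, a zero-length
 * request is left alone. In the driver the MTU would come from
 * mei_cl_mtu(cl); here it is passed in explicitly.
 */
static size_t rx_buf_len(size_t length, size_t mtu)
{
	/* for RX always allocate at least client's mtu */
	if (length)
		length = length > mtu ? length : mtu;

	return length;
}
```

With an assumed MTU of 65536, rx_buf_len(8, 65536) yields 65536, so
even a request for a few bytes turns into a 64KB buffer, while
rx_buf_len(0, 65536) stays at 0.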

> <4>[ 2019.101388] mei_cl_read_start+0xb8/0x230
> <4>[ 2019.101391] __mei_cl_recv+0xd3/0x400
> <4>[ 2019.101396] ? __pfx_autoremove_wake_function+0x10/0x10
> <4>[ 2019.101399] mei_pxp_receive_message+0x39/0x60
> <4>[ 2019.101402] intel_pxp_tee_io_message+0x112/0x1e0
> <4>[ 2019.101407] i915_pxp_ops_ioctl+0x536/0x6c0

Curiously, I don't see any evidence of i915_pxp_ops_ioctl()
ever making it into mainline kernels, though I see some
discussion about it on the mailing lists [1].

Do you see the same problem with a mainline kernel?

The only reference I could find to the DRM_IOCTL_I915_PXP_OPS
ioctl in userspace seems to be in
https://chromium.googlesource.com/chromium/src/+/a4de986102a45e29c3ef596f22704bdca244c26c/media/gpu/vaapi/vaapi_wrapper.cc#2004

>> Allocating 64KB of consecutive pages repeatedly is clearly
>> a problem at runtime, but having a single allocation during
>> probe time is not as bad.
> What if the length of the message is greater than 64KB, wouldn't that 
> be an issue?

That would make it an order-5 allocation (or higher, once the
length exceeds 128KB with 4KB pages).
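
For reference, a rough userspace sketch of how the order follows
from the buffer size, assuming 4KB pages. alloc_order() and
SKETCH_PAGE_SHIFT are made-up names for illustration; the kernel has
its own get_order() helper for this.

```c
#include <stddef.h>

/* Assumption: 4KB pages, i.e. a page shift of 12. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: smallest order such that (1 << order) pages
 * cover "size" bytes. The page allocator hands out power-of-two
 * runs of contiguous pages, so sizes are rounded up accordingly.
 */
static unsigned int alloc_order(size_t size)
{
	size_t page_size = (size_t)1 << SKETCH_PAGE_SHIFT;
	size_t pages = (size + page_size - 1) >> SKETCH_PAGE_SHIFT;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;

	return order;
}
```

Under that assumption alloc_order(65536) is 4, alloc_order(65537)
is 5, and anything above 131072 bytes needs order 6 or more.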

     Arnd


[1] https://patchwork.kernel.org/project/intel-gfx/patch/20201114014537.25495-5-sean.z.huang@intel.com/#23762967