Message-ID: <CABRLg09mnZSvoXUJEJyvXNpBq76UsWtP8rTcKk98GkK68P1CVA@mail.gmail.com>
Date: Tue, 19 Aug 2025 15:45:01 +0200
From: Bartel Eerdekens <bartel.eerdekens@...stell8.be>
To: netdev@...r.kernel.org
Cc: nbd@....name, Sean Wang <sean.wang@...iatek.com>, lorenzo@...nel.org
Subject: mediatek ethernet driver: 4MB buffer allocation

Hi,

I recently upgraded from kernel 6.1 to 6.12 on my MT7621-based
device and started seeing a kernel memory allocation failure.
I am using an MT7621DAT (i.e. MT7621AT + 128MB embedded RAM) with quite
a heavy build (buildroot + systemd), so my memory usage is
already quite high.
Even so, the 4MB allocation that is happening seems like a very
large amount of contiguous memory to require?

The error log:

[ 3011.709726] systemd-network: page allocation failure: order:10,
mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
[ 3011.725868] CPU: 3 UID: 101 PID: 4191 Comm: systemd-network Not
tainted 6.12.39 #8
[ 3011.725931] Hardware name: MT7621DAT
[ 3011.725948] Stack : 00000001 8009888c 00000001 00000004 00000001
83c41750 83c417c4 00000000
[ 3011.726039]         01000000 80097b10 00000000 00000000 00000000
00000001 83c41770 814c8000
[ 3011.726099]         00000000 00000000 80a493d8 83c41620 ffffefff
00000000 80b0c00c 000001ef
[ 3011.726155]         00000000 000001f1 80b0c038 fffffff9 00000001
00000000 80a493d8 00000001
[ 3011.726211]         00000001 80b04574 00000000 00000400 00000003
fffc7fb3 0000000c 80cd000c
[ 3011.726269]         ...
[ 3011.726284] Call Trace:
[ 3011.726291] [<80008430>] show_stack+0x28/0xf0
[ 3011.726352] [<8091ce10>] dump_stack_lvl+0x70/0xb0
[ 3011.726387] [<801e3e6c>] warn_alloc+0xb8/0x148
[ 3011.726439] [<801e406c>] __alloc_pages_noprof+0x170/0xd04
[ 3011.726466] [<801e9ba4>] ___kmalloc_large_node+0x64/0xf8
[ 3011.726496] [<801ee0a0>] __kmalloc_noprof+0x22c/0x3c0
[ 3011.726520] [<805d93d4>] mtk_open+0xb20/0xcb8
[ 3011.726542] [<806dfe48>] __dev_open+0xd8/0x198
[ 3011.726569] [<806e0338>] __dev_change_flags+0x1c0/0x208
[ 3011.726591] [<806e03a4>] dev_change_flags+0x24/0x70
[ 3011.726610] [<806f4aa4>] do_setlink+0x2d4/0x102c
[ 3011.726638] [<806f58d4>] rtnl_setlink+0xd8/0x154
[ 3011.726658] [<806f2890>] rtnetlink_rcv_msg+0x350/0x47c
[ 3011.726679] [<80746eb0>] netlink_rcv_skb+0x94/0x130
[ 3011.726711] [<80746578>] netlink_unicast+0x284/0x448
[ 3011.726733] [<807469d0>] netlink_sendmsg+0x294/0x460
[ 3011.726755] [<806a76c4>] __sys_sendto+0xbc/0x120
[ 3011.726792] [<800138cc>] syscall_common+0x34/0x58
[ 3011.726828]
[ 3011.726842] Mem-Info:
[ 3011.878151] active_anon:51 inactive_anon:2644 isolated_anon:0
[ 3011.878151]  active_file:2379 inactive_file:4276 isolated_file:32
[ 3011.878151]  unevictable:0 dirty:135 writeback:0
[ 3011.878151]  slab_reclaimable:680 slab_unreclaimable:4961
[ 3011.878151]  mapped:3287 shmem:553 pagetables:142
[ 3011.878151]  sec_pagetables:0 bounce:0
[ 3011.878151]  kernel_misc_reclaimable:0
[ 3011.878151]  free:11393 free_pcp:103 free_cma:0
[ 3011.916762] Node 0 active_anon:204kB inactive_anon:10828kB
active_file:9516kB inactive_file:17132kB unevictable:0kB
isolated(anon):0kB isolated(file):128kB mapped:13344kB dirty:540kB
writeback:0kB shmem:2212kB writeback_tmp:0kB kernel_stack:984kB
pagetables:568kB sec_pagetables:0kB all_unreclaimable? no
[ 3011.916846] Normal free:45132kB boost:0kB min:1360kB low:1700kB
high:2040kB reserved_highatomic:0KB active_anon:204kB
inactive_anon:10740kB active_file:9584kB inactive_file:17168kB
unevictable:0kB writepending:524kB present:131072kB managed:117500kB
mlocked:0kB bounce:0kB free_pcp:520kB local_pcp:0kB free_cma:0kB
[ 3011.916904] lowmem_reserve[]: 0 0 0
[ 3011.916957] Normal: 145*4kB (UE) 155*8kB (UME) 238*16kB (UME)
231*32kB (UME) 106*64kB (UME) 51*128kB (UME) 27*256kB (M) 15*512kB (M)
2*1024kB (M) 1*2048kB (M) 0*4096kB = 45020kB
[ 3011.917229] 7274 total pagecache pages
[ 3011.917247] 0 pages in swap cache
[ 3011.917260] Free swap  = 0kB
[ 3011.917272] Total swap = 0kB
[ 3011.917284] 32768 pages RAM
[ 3011.917296] 0 pages HighMem/MovableOnly
[ 3011.917309] 3393 pages reserved

As the log shows, a 4096kB block of contiguous memory (order 10) is
being requested, and no free block of that size exists at that point
because memory is fragmented.

The error happens in mtk_open, more specifically in the
mtk_init_fq_dma call.
There, this [1] kcalloc call allocates a buffer for eth->scratch_head.

This part of the code was changed last year [2], when the original
single kcalloc was wrapped in a for loop. The patch was taken from
the upstream mediatek repo [3].
For the MT7621 this for loop runs just once, but with a large
amount of requested memory, resulting in the call:
eth->scratch_head[0] = kcalloc(2048, 2048, GFP_KERNEL);
because MTK_DMA_SIZE(2K) = 2048 for the MT7621.

The old code only allocated 512kB:
eth->scratch_head = kcalloc(cnt, MTK_QDMA_PAGE_SIZE, GFP_KERNEL);
where cnt came from #define MTK_DMA_SIZE 256 and the element size from
#define MTK_QDMA_PAGE_SIZE 2048


My interpretation of the change is that it is intended to split up the
fq_dma_size into chunks of MTK_FQ_DMA_LENGTH.
In that case, only 1 element of size 2048 should have been allocated.
Am I correct in this assumption?


I traced some kcalloc calls in the mtk_eth_soc.c code to pinpoint
where this big allocation happens, and this was the output of that
manual tracing:

Allocating 2048 × 16 = 32768 bytes <-- `dma_alloc_coherent` for
scratch_ring [mtk_init_fq_dma]
Allocating 2048 × 2048 = 4194304 bytes <-- `kcalloc` for scratch_head
[mtk_init_fq_dma]
Allocating TX 2048 × 28 = 57344 bytes <-- `dma_alloc_coherent` [mtk_tx_alloc]
Allocating RX 512 × 4 = 2048 bytes <-- `dma_alloc_coherent` for
MTK_RX_FLAGS_QDMA [mtk_rx_alloc]
Allocating RX 512 × 4 = 2048 bytes <-- `dma_alloc_coherent` for
MTK_RX_FLAGS_NORMAL [mtk_rx_alloc]

I found a bug report for this from 2021 [4], but it was posted in the
wrong project. It led me to a reproducible test case that triggers
the memory allocation failure:

ip link set down eth0
ip link del dev br0

nice -n -10 stress --vm 1 --vm-bytes 10000000 &
nice -n -10 stress --vm 1 --vm-bytes 50000000 &

Wait for a few seconds, and then run:

killall stress
systemctl restart systemd-networkd
(Or /etc/init.d/network restart if you are running init.d; I am using systemd.)

The trace above is then logged, and bringing up the eth0 device fails:

[   51.912288] mt7530-mdio mdio-bus:1f lan3: failed to open conduit eth0
[   51.920335] mt7530-mdio mdio-bus:1f lan5: failed to open conduit eth0
[   51.928359] mt7530-mdio mdio-bus:1f lan2: failed to open conduit eth0
[   51.936288] mt7530-mdio mdio-bus:1f lan1: failed to open conduit eth0
[   51.943893] mt7530-mdio mdio-bus:1f lan4: failed to open conduit eth0


Could a maintainer help clarify this memory allocation?

Thanks a lot.

[1] https://github.com/torvalds/linux/blob/v6.12/drivers/net/ethernet/mediatek/mtk_eth_soc.c#L1162
[2] https://github.com/torvalds/linux/commit/c57e558194430d10d5e5f4acd8a8655b68dade13
[3] https://git01.mediatek.com/plugins/gitiles/openwrt/feeds/mtk-openwrt-feeds/+/ba89e7797868ba793e1bf2a468bbae68ab8d311a
[4] https://github.com/openwrt/mt76/issues/592


Kind regards.
Bartel
