[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87iknw9duu.fsf@waldekranz.com>
Date: Wed, 26 Mar 2025 13:27:05 +0100
From: Tobias Waldekranz <tobias@...dekranz.com>
To: Maxime Chevallier <maxime.chevallier@...tlin.com>
Cc: davem@...emloft.net, kuba@...nel.org, marcin.s.wojtas@...il.com,
linux@...linux.org.uk, andrew@...n.ch, edumazet@...gle.com,
pabeni@...hat.com, netdev@...r.kernel.org
Subject: Re: [PATCH v3 net] net: mvpp2: Prevent parser TCAM memory corruption
On ons, mar 26, 2025 at 12:07, Maxime Chevallier <maxime.chevallier@...tlin.com> wrote:
> Hi Tobias,
>
> On Wed, 26 Mar 2025 11:37:33 +0100
> Tobias Waldekranz <tobias@...dekranz.com> wrote:
>
>> Protect the parser TCAM/SRAM memory, and the cached (shadow) SRAM
>> information, from concurrent modifications.
>>
>> Both the TCAM and SRAM tables are indirectly accessed by configuring
>> an index register that selects the row to read or write to. This means
>> that operations must be atomic in order to, e.g., avoid spreading
>> writes across multiple rows. Since the shadow SRAM array is used to
>> find free rows in the hardware table, it must also be protected in
>> order to avoid TOCTOU errors where multiple cores allocate the same
>> row.
>>
>> This issue was detected in a situation where `mvpp2_set_rx_mode()` ran
>> concurrently on two CPUs. In this particular case the
>> MVPP2_PE_MAC_UC_PROMISCUOUS entry was corrupted, causing the
>> classifier unit to drop all incoming unicast - indicated by the
>> `rx_classifier_drops` counter.
>>
>> Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
>> Signed-off-by: Tobias Waldekranz <tobias@...dekranz.com>
>> ---
>>
>> @Andrew: I did finally manage to trigger sparse warnings that could be
>> silenced with __must_hold() annotations, but I still do not understand
>> how they work. I went back to the change that pulled this in:
>>
>> https://lore.kernel.org/all/C5833F40-2EA6-43DA-B69C-AFF59E76E0C9@coraid.com/T/
>>
>> The referenced function (tx()), still exists in aoenet.c. Using that
>> as a template, I could construct an unlock+lock sequence that
>> triggered a warning without __must_hold(). For example...
>>
>> spin_unlock_bh(&priv->prs_spinlock);
>> if (net_ratelimit())
>> schedule();
>> spin_lock_bh(&priv->prs_spinlock);
>>
>> ...would generate a warning. But this...
>>
>> spin_unlock_bh(&priv->prs_spinlock);
>> net_ratelimit();
>> schedule();
>> spin_lock_bh(&priv->prs_spinlock);
>>
>> ...would not.
>>
>> Reading through the sparse validation suite, it does not seem to have
>> any tests that covers this either:
>>
>> https://web.git.kernel.org/pub/scm/devel/sparse/sparse.git/tree/validation/context.c
>>
>> Therefore, I decided to take Jakub's advise and add lockdep assertions
>> instead. That necessitated some more changes, since tables are updated
>> in the init phase (where I originally omitted locking).
>>
>> @Maxime: There was enough of a diff between v2->v3 that I did not feel
>> comfortable including your signoff/testing tags. Would it be possible
>> for you to run your tests again on this version?
>
> Sure thing, although I do have some comments :)
>
> [...]
>
>> /* Parser default initialization */
>> @@ -2118,6 +2163,8 @@ int mvpp2_prs_default_init(struct platform_device *pdev, struct mvpp2 *priv)
>> {
>> int err, index, i;
>>
>> + spin_lock_bh(&priv->prs_spinlock);
>> +
>> /* Enable tcam table */
>> mvpp2_write(priv, MVPP2_PRS_TCAM_CTRL_REG, MVPP2_PRS_TCAM_EN_MASK);
>>
>> @@ -2139,8 +2186,10 @@ int mvpp2_prs_default_init(struct platform_device *pdev, struct mvpp2 *priv)
>> priv->prs_shadow = devm_kcalloc(&pdev->dev, MVPP2_PRS_TCAM_SRAM_SIZE,
>> sizeof(*priv->prs_shadow),
>> GFP_KERNEL);
>
> GFP_KERNEL alloc while holding a spinlock isn't correct and triggers a
> splat when building when CONFIG_DEBUG_ATOMIC_SLEEP :
I think I had pretty much every other debug flag enabled in my config :)
Thanks for catching this!
> [ 4.380325] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:321
> [ 4.389217] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
> [ 4.397120] preempt_count: 201, expected: 0
> [ 4.401358] RCU nest depth: 0, expected: 0
> [ 4.405507] 2 locks held by swapper/0/1:
> [ 4.409488] #0: ffff000100e168f8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0x8c/0x1ac
> [ 4.417971] #1: ffff00010ae15368 (&priv->prs_spinlock){+...}-{3:3}, at: mvpp2_prs_default_init+0x50/0x1570
> [ 4.427843] CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc7-01963-g02bf787e4750 #68
> [ 4.427851] Hardware name: Marvell 8040 MACCHIATOBin Double-shot (DT)
> [ 4.427855] Call trace:
> [ 4.427858] show_stack+0x18/0x24 (C)
> [ 4.427867] dump_stack_lvl+0xd8/0xf0
> [ 4.427875] dump_stack+0x18/0x24
> [ 4.427880] __might_resched+0x148/0x24c
> [ 4.427890] __might_sleep+0x48/0x7c
> [ 4.427897] __kmalloc_node_track_caller_noprof+0x200/0x480
> [ 4.427903] devm_kmalloc+0x54/0x118
> [ 4.427910] mvpp2_prs_default_init+0x138/0x1570
> [ 4.427919] mvpp2_probe+0x904/0xfa4
> [ 4.427926] platform_probe+0x68/0xc8
> [...]
>
> I suggest you move that alloc and associated error handling outside of
> the spinlock.
Will do. Sorry for the noise. I have fixed this locally - will send v4
as soon as the rules permit.
Powered by blists - more mailing lists