linux-kernel - Re: [PATCH AUTOSEL 6.7 021/108] r8169: improve RTL8411b phy-down fixup


Message-ID: <4523ad21-d06a-4ba2-9b46-974a6093b189@alu.unizg.hr>
Date: Wed, 17 Jan 2024 11:30:53 +0100
From: Mirsad Todorovac <mirsad.todorovac@....hr>
To: Jakub Kicinski <kuba@...nel.org>, Sasha Levin <sashal@...nel.org>
Cc: linux-kernel@...r.kernel.org, stable@...r.kernel.org,
 Heiner Kallweit <hkallweit1@...il.com>,
 Mirsad Todorovac <mirsad.todorovac@....unizg.hr>,
 Simon Horman <horms@...nel.org>, "David S . Miller" <davem@...emloft.net>,
 nic_swsd@...ltek.com, edumazet@...gle.com, pabeni@...hat.com,
 netdev@...r.kernel.org
Subject: Re: [PATCH AUTOSEL 6.7 021/108] r8169: improve RTL8411b phy-down
 fixup

On 1/17/24 02:43, Jakub Kicinski wrote:
> On Tue, 16 Jan 2024 14:38:47 -0500 Sasha Levin wrote:
>> Mirsad proposed a patch to reduce the number of spinlock lock/unlock
>> operations and the function code size. This can be further improved
>> because the function sets a consecutive register block.
> 
> Clearly a noop and a lot of LoC changed. I vote to drop this from
> the backport.

Dear Jakub,

I will not argue with a senior developer, but please let me plead for the
cause.

There are a couple of issues here:

1. Heiner's patch generates smaller and faster code, with 100+ fewer
spin_lock_irqsave()/spin_unlock_irqrestore() pairs.

According to this table:

[1] https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook-1c.2023.06.11a.pdf#table.3.1

The cost of a single lock operation can be 15.4 - 101.9 ns (for the example
CPU), so the total savings would be 1709 - 11310 ns. But as a PHY power-down
event is not frequent, this might indeed be an insignificant saving.
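
For reference, the estimate above is a simple product of the number of
lock operations saved and the per-operation cost. The sketch below redoes
that arithmetic. Note that the count of 111 is an assumption: the text only
says "100+", and 111 is inferred from the quoted totals (1709 / 15.4 is
roughly 111). The function and macro names are hypothetical.

```c
#include <assert.h>

/*
 * Assumed number of lock operations saved. The email only says "100+";
 * 111 is inferred from its totals and is not stated anywhere in the thread.
 */
#define ASSUMED_LOCKS_SAVED 111

/* Estimated time saved in ns, given the cost of one lock operation in ns. */
double lock_savings_ns(int locks_saved, double cost_per_lock_ns)
{
    assert(locks_saved >= 0 && cost_per_lock_ns >= 0.0);
    return locks_saved * cost_per_lock_ns;
}
```

Calling lock_savings_ns(ASSUMED_LOCKS_SAVED, 15.4) and
lock_savings_ns(ASSUMED_LOCKS_SAVED, 101.9) gives the lower and upper bounds
of the range quoted above, to within rounding.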

2. Why did I advocate atomic programming of the RTL registers in the first
place?

The mac_ocp_lock was introduced recently:

commit 91c8643578a21e435c412ffbe902bb4b4773e262
Author: Heiner Kallweit <hkallweit1@...il.com>
Date:   Mon Mar 6 22:23:15 2023 +0100

     r8169: use spinlock to protect mac ocp register access

     For disabling ASPM during NAPI poll we'll have to access mac ocp
     registers in atomic context. This could result in races because
     a mac ocp read consists of a write to register OCPDR, followed
     by a read from the same register. Therefore add a spinlock to
     protect access to mac ocp registers.

     Reviewed-by: Simon Horman <simon.horman@...igine.com>
     Tested-by: Kai-Heng Feng <kai.heng.feng@...onical.com>
     Tested-by: Holger Hoffstätte <holger@...lied-asynchrony.com>
     Signed-off-by: Heiner Kallweit <hkallweit1@...il.com>
     Signed-off-by: David S. Miller <davem@...emloft.net>

Well, the answer is in the question - the very need for protecting the access
to RTL_W(8|16|32) with locks comes from the fact that something was accessing
the RTL card asynchronously.

Forgive me if this is a stupid question ...

Now - do we have a guarantee that, in that case, nothing else will access the
card asynchronously while it is only half-programmed, leading to another
spurious lockup?

IMHO, shouldn't the entire reprogramming for the PHY-down recovery of the
RTL8411b be done atomically, under a single
spin_lock_irqsave()/spin_unlock_irqrestore() pair?
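
To illustrate the difference I have in mind, here is a minimal user-space
sketch. It is not the r8169 code: the struct, the mock lock and all function
names are hypothetical stand-ins, and the "lock" is only a counter with
sanity checks. It contrasts taking the lock once per register write with
writing a consecutive register block under a single lock/unlock pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_NREGS 256

/* Hypothetical stand-in for the device state; not the driver's structures. */
struct mock_dev {
    int lock_held;              /* 1 while the mock lock is held */
    unsigned int lock_count;    /* number of lock acquisitions so far */
    uint16_t regs[MOCK_NREGS];  /* mock register file */
};

/* Mock lock: models the lock/unlock pair, counts acquisitions. */
void mock_lock(struct mock_dev *d)
{
    assert(!d->lock_held);
    d->lock_held = 1;
    d->lock_count++;
}

void mock_unlock(struct mock_dev *d)
{
    assert(d->lock_held);
    d->lock_held = 0;
}

/* Raw register write; the caller must already hold the lock. */
void reg_write_unlocked(struct mock_dev *d, size_t reg, uint16_t val)
{
    assert(d->lock_held && reg < MOCK_NREGS);
    d->regs[reg] = val;
}

/* Variant A: one lock/unlock pair per register write. */
void reg_write(struct mock_dev *d, size_t reg, uint16_t val)
{
    mock_lock(d);
    reg_write_unlocked(d, reg, val);
    mock_unlock(d);
}

/* Variant B: a consecutive register block under a single lock/unlock pair. */
void reg_write_block(struct mock_dev *d, size_t first,
                     const uint16_t *vals, size_t n)
{
    mock_lock(d);
    for (size_t i = 0; i < n; i++)
        reg_write_unlocked(d, first + i, vals[i]);
    mock_unlock(d);
}
```

With variant A another context could in principle take the lock between two
writes and see the block half-programmed; with variant B the whole block is
written before the lock is released, and n writes cost one lock acquisition
instead of n.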

Best regards,
Mirsad Todorovac

-- 
CARNet system engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
