Message-ID: <7fd8c40d-5209-4f7c-8c69-5138d0eb0bc5@gmail.com>
Date: Mon, 30 Oct 2023 15:02:17 +0100
From: Heiner Kallweit <hkallweit1@...il.com>
To: Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>,
Jason Gunthorpe <jgg@...pe.ca>, Joerg Roedel <jroedel@...e.de>,
Lu Baolu <baolu.lu@...ux.intel.com>, iommu@...ts.linux.dev,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Cc: Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
Robin Murphy <robin.murphy@....com>, nic_swsd@...ltek.com,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Marco Elver <elver@...gle.com>
Subject: Re: [PATCH v5 6/7] r8169: Coalesce mac ocp write and modify for 8125
and 8125B start to reduce spinlocks
On 29.10.2023 19:36, Mirsad Goran Todorovac wrote:
> Repeated calls to r8168_mac_ocp_write() and r8168_mac_ocp_modify() in
> the startup of 8125 and 8125B involve implicit spin_lock_irqsave() and
> spin_unlock_irqrestore() on each invocation.
>
> Coalesced with the corresponding helpers r8168_mac_ocp_write_seq() and
> r8168_mac_ocp_modify_seq() into a sequential write or modify with a single
> pair of spin_lock_irqsave() and spin_unlock_irqrestore(), these calls
> reduce overall lock contention.
>
> Fixes: f1bce4ad2f1ce ("r8169: add support for RTL8125")
> Fixes: 0439297be9511 ("r8169: add support for RTL8125B")
> Cc: Heiner Kallweit <hkallweit1@...il.com>
> Cc: Marco Elver <elver@...gle.com>
> Cc: nic_swsd@...ltek.com
> Cc: "David S. Miller" <davem@...emloft.net>
> Cc: Eric Dumazet <edumazet@...gle.com>
> Cc: Jakub Kicinski <kuba@...nel.org>
> Cc: Paolo Abeni <pabeni@...hat.com>
> Cc: netdev@...r.kernel.org
> Cc: linux-kernel@...r.kernel.org
> Link: https://lore.kernel.org/lkml/20231028005153.2180411-1-mirsad.todorovac@alu.unizg.hr/
> Link: https://lore.kernel.org/lkml/20231028110459.2644926-1-mirsad.todorovac@alu.unizg.hr/
> Signed-off-by: Mirsad Goran Todorovac <mirsad.todorovac@....unizg.hr>
> ---
> v5:
> added unlocked primitives to allow mac ocp modify grouping
> applied coalescing of mac ocp writes/modifies for 8168ep and 8117
> some formatting fixes to please checkpatch.pl
>
> v4:
> fixed complaints as advised by Heiner and checkpatch.pl
> split the patch into five sections to be more easily manipulated and reviewed
> introduced r8168_mac_ocp_write_seq()
> applied coalescing of mac ocp writes/modifies for 8168H, 8125 and 8125B
>
> v3:
> removed register/mask pair array sentinels, so using ARRAY_SIZE().
> avoided duplication of RTL_W32() call code as advised by Heiner.
>
> drivers/net/ethernet/realtek/r8169_main.c | 75 +++++++++++++----------
> 1 file changed, 44 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 50fbacb05953..0778cd0ba2e0 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -3553,6 +3553,28 @@ DECLARE_RTL_COND(rtl_mac_ocp_e00e_cond)
>
> static void rtl_hw_start_8125_common(struct rtl8169_private *tp)
> {
> + static const struct e_info_regmaskset e_info_8125_common_1[] = {
> + { 0xd3e2, 0x0fff, 0x03a9 },
> + { 0xd3e4, 0x00ff, 0x0000 },
> + { 0xe860, 0x0000, 0x0080 },
> + };
> +
> + static const struct e_info_regmaskset e_info_8125_common_2[] = {
> + { 0xc0b4, 0x0000, 0x000c },
> + { 0xeb6a, 0x00ff, 0x0033 },
> + { 0xeb50, 0x03e0, 0x0040 },
> + { 0xe056, 0x00f0, 0x0030 },
> + { 0xe040, 0x1000, 0x0000 },
> + { 0xea1c, 0x0003, 0x0001 },
> + { 0xe0c0, 0x4f0f, 0x4403 },
> + { 0xe052, 0x0080, 0x0068 },
> + { 0xd430, 0x0fff, 0x047f },
> + { 0xea1c, 0x0004, 0x0000 },
> + { 0xeb54, 0x0000, 0x0001 },
> + };
> +
> + unsigned long flags;
> +
> rtl_pcie_state_l2l3_disable(tp);
>
> RTL_W16(tp, 0x382, 0x221b);
> @@ -3560,47 +3582,38 @@ static void rtl_hw_start_8125_common(struct rtl8169_private *tp)
> RTL_W16(tp, 0x4800, 0);
>
> /* disable UPS */
> - r8168_mac_ocp_modify(tp, 0xd40a, 0x0010, 0x0000);
> +
> + raw_spin_lock_irqsave(&tp->mac_ocp_lock, flags);
> + __r8168_mac_ocp_modify(tp, 0xd40a, 0x0010, 0x0000);
>
> RTL_W8(tp, Config1, RTL_R8(tp, Config1) & ~0x10);
>
> - r8168_mac_ocp_write(tp, 0xc140, 0xffff);
> - r8168_mac_ocp_write(tp, 0xc142, 0xffff);
> + __r8168_mac_ocp_write(tp, 0xc140, 0xffff);
> + __r8168_mac_ocp_write(tp, 0xc142, 0xffff);
>
> - r8168_mac_ocp_modify(tp, 0xd3e2, 0x0fff, 0x03a9);
> - r8168_mac_ocp_modify(tp, 0xd3e4, 0x00ff, 0x0000);
> - r8168_mac_ocp_modify(tp, 0xe860, 0x0000, 0x0080);
> + __r8168_mac_ocp_modify_seq(tp, e_info_8125_common_1);
>
> /* disable new tx descriptor format */
> - r8168_mac_ocp_modify(tp, 0xeb58, 0x0001, 0x0000);
> + __r8168_mac_ocp_modify(tp, 0xeb58, 0x0001, 0x0000);
>
> - if (tp->mac_version == RTL_GIGA_MAC_VER_63)
> - r8168_mac_ocp_modify(tp, 0xe614, 0x0700, 0x0200);
> - else
> - r8168_mac_ocp_modify(tp, 0xe614, 0x0700, 0x0400);
> + if (tp->mac_version == RTL_GIGA_MAC_VER_63) {
> + __r8168_mac_ocp_modify(tp, 0xe614, 0x0700, 0x0200);
> + __r8168_mac_ocp_modify(tp, 0xe63e, 0x0c30, 0x0000);
> + } else {
> + __r8168_mac_ocp_modify(tp, 0xe614, 0x0700, 0x0400);
> + __r8168_mac_ocp_modify(tp, 0xe63e, 0x0c30, 0x0020);
> + }
> +
> + __r8168_mac_ocp_modify_seq(tp, e_info_8125_common_2);
> + raw_spin_unlock_irqrestore(&tp->mac_ocp_lock, flags);
>
> - if (tp->mac_version == RTL_GIGA_MAC_VER_63)
> - r8168_mac_ocp_modify(tp, 0xe63e, 0x0c30, 0x0000);
> - else
> - r8168_mac_ocp_modify(tp, 0xe63e, 0x0c30, 0x0020);
> -
> - r8168_mac_ocp_modify(tp, 0xc0b4, 0x0000, 0x000c);
> - r8168_mac_ocp_modify(tp, 0xeb6a, 0x00ff, 0x0033);
> - r8168_mac_ocp_modify(tp, 0xeb50, 0x03e0, 0x0040);
> - r8168_mac_ocp_modify(tp, 0xe056, 0x00f0, 0x0030);
> - r8168_mac_ocp_modify(tp, 0xe040, 0x1000, 0x0000);
> - r8168_mac_ocp_modify(tp, 0xea1c, 0x0003, 0x0001);
> - r8168_mac_ocp_modify(tp, 0xe0c0, 0x4f0f, 0x4403);
> - r8168_mac_ocp_modify(tp, 0xe052, 0x0080, 0x0068);
> - r8168_mac_ocp_modify(tp, 0xd430, 0x0fff, 0x047f);
> -
> - r8168_mac_ocp_modify(tp, 0xea1c, 0x0004, 0x0000);
> - r8168_mac_ocp_modify(tp, 0xeb54, 0x0000, 0x0001);
> udelay(1);
> - r8168_mac_ocp_modify(tp, 0xeb54, 0x0001, 0x0000);
> - RTL_W16(tp, 0x1880, RTL_R16(tp, 0x1880) & ~0x0030);
>
> - r8168_mac_ocp_write(tp, 0xe098, 0xc302);
> + raw_spin_lock_irqsave(&tp->mac_ocp_lock, flags);
> + __r8168_mac_ocp_modify(tp, 0xeb54, 0x0001, 0x0000);
> + RTL_W16(tp, 0x1880, RTL_R16(tp, 0x1880) & ~0x0030);
> + __r8168_mac_ocp_write(tp, 0xe098, 0xc302);
> + raw_spin_unlock_irqrestore(&tp->mac_ocp_lock, flags);
>
> rtl_loop_wait_low(tp, &rtl_mac_ocp_e00e_cond, 1000, 10);
>
All this manual locking and unlocking makes the code harder
to read and more error-prone. As a rule of thumb: if the change
replaces a block of more than 10 mac ocp ops, it's fine with me.
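
To make the trade-off concrete, here is a minimal userspace sketch (not
driver code) of the table-driven pattern discussed above: the lock is
taken inside one helper instead of being open-coded at each call site.
Everything in it is an assumption for illustration: struct regmaskset is
a hypothetical stand-in for the patch's e_info_regmaskset (whose layout is
not shown here), fake_ocp[] and fake_lock()/fake_unlock() stand in for the
hardware and for tp->mac_ocp_lock, and the read-modify-write semantics
(data & ~mask) | set are assumed from the register/mask/set triples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the patch's struct e_info_regmaskset is not shown. */
struct regmaskset {
	uint16_t reg;
	uint16_t mask;
	uint16_t set;
};

/* Stand-ins for the device registers and for tp->mac_ocp_lock. */
static uint16_t fake_ocp[0x10000 / 2];
static int lock_held;
static unsigned int lock_acquisitions;

static void fake_lock(void)
{
	assert(!lock_held);	/* catches a double lock */
	lock_held = 1;
	lock_acquisitions++;
}

static void fake_unlock(void)
{
	assert(lock_held);	/* catches an unlock without a lock */
	lock_held = 0;
}

/* Unlocked primitive: the caller must already hold the lock. */
static void ocp_modify_unlocked(uint16_t reg, uint16_t mask, uint16_t set)
{
	uint16_t data;

	assert(lock_held);
	data = fake_ocp[reg / 2];
	fake_ocp[reg / 2] = (uint16_t)((data & ~mask) | set);
}

/* Locked single operation: one lock/unlock pair per call. */
static void ocp_modify(uint16_t reg, uint16_t mask, uint16_t set)
{
	fake_lock();
	ocp_modify_unlocked(reg, mask, set);
	fake_unlock();
}

/* Locked sequence: one lock/unlock pair for the whole table. */
static void ocp_modify_seq(const struct regmaskset *tbl, size_t n)
{
	size_t i;

	fake_lock();
	for (i = 0; i < n; i++)
		ocp_modify_unlocked(tbl[i].reg, tbl[i].mask, tbl[i].set);
	fake_unlock();
}

/* Values copied from e_info_8125_common_1 in the quoted patch. */
static const struct regmaskset common_1[] = {
	{ 0xd3e2, 0x0fff, 0x03a9 },
	{ 0xd3e4, 0x00ff, 0x0000 },
	{ 0xe860, 0x0000, 0x0080 },
};

#define COMMON_1_LEN (sizeof(common_1) / sizeof(common_1[0]))

/* Returns the number of lock acquisitions when each op locks on its own. */
unsigned int run_one_by_one(void)
{
	unsigned int before = lock_acquisitions;
	size_t i;

	for (i = 0; i < COMMON_1_LEN; i++)
		ocp_modify(common_1[i].reg, common_1[i].mask, common_1[i].set);
	return lock_acquisitions - before;
}

/* Returns the number of lock acquisitions for the coalesced sequence. */
unsigned int run_coalesced(void)
{
	unsigned int before = lock_acquisitions;

	ocp_modify_seq(common_1, COMMON_1_LEN);
	return lock_acquisitions - before;
}
```

In this sketch the locking stays inside ocp_modify() and ocp_modify_seq(),
so call sites never pair lock and unlock by hand. The saving grows with the
table length, which is the reasoning behind preferring coalescing only for
longer blocks of ops.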