linux-kernel - Re: [Intel-wired-lan] [PATCH v2 1/2] e1000e: let the sleep codes run every time

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d2d9c0a8-6d4f-4aff-84f3-35fc2bff49b7@intel.com>
Date: Tue, 7 May 2024 09:21:53 +0300
From: "Ruinskiy, Dima" <dima.ruinskiy@...el.com>
To: En-Wei WU <en-wei.wu@...onical.com>, Sasha Neftin <sasha.neftin@...el.com>
CC: <netdev@...r.kernel.org>, <rickywu0421@...il.com>,
	<linux-kernel@...r.kernel.org>, <edumazet@...gle.com>,
	<intel-wired-lan@...ts.osuosl.org>, <kuba@...nel.org>,
	<anthony.l.nguyen@...el.com>, <pabeni@...hat.com>, <davem@...emloft.net>,
	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>, "Lifshits, Vitaly"
	<vitaly.lifshits@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH v2 1/2] e1000e: let the sleep codes run
 every time

On 06/05/2024 19:46, En-Wei WU wrote:
> Thank you for your time.
> 
> Originally, sleep codes would only be executed if the first read fails
> or the link status that is read is down. Some circumstances like the
> [v2,2/2] "e1000e: fix link fluctuations problem" would need a delay
> before first reading/accessing the PHY IEEE register, so that it won't
> read the instability of the link status bit in the PHY status
> register.
> 
> I've realized that this approach isn't good enough since the purpose
> is only to fix the problem in another patch and it also changes the
> behavior.
> 
> Here is the modification of the patch [v2,2/2] "e1000e: fix link
> fluctuations problem":
> --- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
> +++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
> @@ -1428,7 +1428,17 @@  static s32
> e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
> - ret_val = e1000e_phy_has_link_generic(hw, 1, 0, &link);
> /* comments */
> + ret_val = e1000e_phy_has_link_generic(hw, COPPER_LINK_UP_LIMIT,
> 100000, &link);
> 
> Do you think we can just add a msleep/usleep_range in front of the
> e1000e_phy_has_link_generic() instead of modifying the sleep codes in
> e1000e_phy_has_link_generic()?
> 
> Thanks.
> 
> On Mon, 6 May 2024 at 23:53, Sasha Neftin <sasha.neftin@...el.com> wrote:
>>
>> On 03/05/2024 13:18, Ricky Wu wrote:
>>> Originally, the sleep codes being moved forward only
>>> ran if we met some conditions (e.g. BMSR_LSTATUS bit
>>> not set in phy_status). Moving these sleep codes forward
>>> makes the usec_interval take effect every time.
>>>
>>> Signed-off-by: Ricky Wu <en-wei.wu@...onical.com>
>>> ---
>>>
>>> In v2:
>>> * Split the sleep codes into this patch
>>>
>>>    drivers/net/ethernet/intel/e1000e/phy.c | 9 +++++----
>>>    1 file changed, 5 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/e1000e/phy.c b/drivers/net/ethernet/intel/e1000e/phy.c
>>> index 93544f1cc2a5..4a58d56679c9 100644
>>> --- a/drivers/net/ethernet/intel/e1000e/phy.c
>>> +++ b/drivers/net/ethernet/intel/e1000e/phy.c
>>> @@ -1777,6 +1777,11 @@ s32 e1000e_phy_has_link_generic(struct e1000_hw *hw, u32 iterations,
>>>
>>>        *success = false;
>>>        for (i = 0; i < iterations; i++) {
>>> +             if (usec_interval >= 1000)
>>> +                     msleep(usec_interval / 1000);
>>> +             else
>>> +                     udelay(usec_interval);
>>> +
>>
>> I do not understand this approach. Why wait before first
>> reading/accessing the PHY IEEE register?
>>
>> For further discussion, I would like to introduce Dima Ruinskiy (architect)
>>
>>>                /* Some PHYs require the MII_BMSR register to be read
>>>                 * twice due to the link bit being sticky.  No harm doing
>>>                 * it across the board.
>>> @@ -1799,10 +1804,6 @@ s32 e1000e_phy_has_link_generic(struct e1000_hw *hw, u32 iterations,
>>>                        *success = true;
>>>                        break;
>>>                }
>>> -             if (usec_interval >= 1000)
>>> -                     msleep(usec_interval / 1000);
>>> -             else
>>> -                     udelay(usec_interval);
>>>        }
>>>
>>>        return ret_val;
>>

Regarding the usage of sleep/sleep_range functions - they can only be 
used if this code will never be called from an atomic context. I do not 
know if such a guarantee exists.

In general I have quite a few questions and concerns regarding this 
patch series. The comment in patch 2/2 states that it is designed to 
work around a link flap issue with the average time between link up and 
down is 3-4ms, and yet the code waits a whole 100ms before reading the 
PHY bit the first time. Why so long?

Furthermore, if I am reading this right, it appears that, with the 
proposed change, e1000e_phy_has_link_generic will poll the PHY link up 
to 10 times, with 100ms delay between each iteration - until the link is 
up. Won't it lead to wasting all this time, if the link is actually down?

Looking at https://bugzilla.kernel.org/show_bug.cgi?id=218642, at the 
problem this commit series is trying to solve - I wonder:
(1) How serious this problem is. It is normal for link establishment to 
take a few seconds from plugging the cable (due to PHY 
auto-negotiation), and I can accept some link instability during that time.
(2) Assuming the problem is considered serious - have we root-caused it 
correctly.