lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Fri, 17 May 2024 12:07:36 +0200
From: Paul Menzel <pmenzel@...gen.mpg.de>
To: Hui Wang <hui.wang@...onical.com>
Cc: intel-wired-lan@...ts.osuosl.org, netdev@...r.kernel.org,
 kuba@...nel.org, anthony.l.nguyen@...el.com, vitaly.lifshits@...el.com,
 dima.ruinskiy@...el.com, davem@...emloft.net, pabeni@...hat.com,
 edumazet@...gle.com, sasha.neftin@...el.com, naamax.meir@...ux.intel.com,
 Jacob Keller <jacob.e.keller@...el.com>, Oliver Sang
 <oliver.sang@...el.com>, Zhang Rui <rui.zhang@...el.com>,
 regressions@...ts.linux.dev
Subject: Re: [Intel-wired-lan] [PATCH v2] e1000e: move force SMBUS near the
 end of enable_ulp function

Dear Hui,


Thank you for your response.


Am 17.05.24 um 11:45 schrieb Hui Wang:
> 
> On 5/17/24 13:45, Paul Menzel wrote:

>> Am 08.05.24 um 14:06 schrieb Hui Wang:
>>> The commit 861e8086029e ("e1000e: move force SMBUS from enable ulp
>>> function to avoid PHY loss issue") introduces a regression on
>>> CH_MTP_I219_LM18 (PCIID: 0x8086550A). Without the referred commit, the
>>
>> *P*CH
>>
>>> ethernet works well after suspend and resume, but after applying the
>>> commit, the ethernet couldn't work anymore after the resume and the
>>> dmesg shows that the NIC Link changes to 10Mbps (1000Mbps originally):
>>
>> 1.  s/Link/link/
>> 2.  “couldn’t work” means the reduced bandwidth?
> 
> On my side, once the link changes to 10Mbps, I couldn't ping the machine 
> anymore. And as you said, it probably has sth to do with switch/router 
> configuration.
> 
>> 3. Please add a blank line and maybe indent the past with four spaces.
>>
>>> [   43.305084] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx
>>>
>>> Without the commit, the force SMBUS code will not be executed if
>>> "return 0" or "goto out" is executed in the enable_ulp(), and in my
>>> case, the "goto out" is executed since FWSM_FW_VALID is set. But after
>>> applying the commit, the force SMBUS code will be ran unconditionally.
>>>
>>> Here move the force SMBUS code back to enable_ulp() and put it
>>> immediate ahead of hw->phy.ops.release(hw), this could allow the
>>
>> immediate*l*?

Sorry, I meant immediate*ly*.

>>> longest settling time as possible for interface in this function and
>>> doesn't change the original code logic.
>>
>> Re-ordering code to achieve some waiting time sounds like, it’s not 
>> 100 % sure, that the problem won’t occur again?
> 
> Actually this patch not only adds the waiting time, but also restore the 
> original code logic:
> 
>   original: On a machine with the CSME, the SMBUS will not be forced, 
> accordingly the SMBUS will not be unforced after resume.
> 
>   wrong: On a machine with the CSME, the SMBUS is forced, but the SMBUS 
> is not unforced after resume, there is an unbalance. My patch could fix 
> this case.

Thank you for elaborating. In my opinion, then two commits would be 
better. One revert with a description of the problem and documentation 
of the test systems. Then the second patch with the fix.

>> Could you please document your test system?
> Lenovo Thinkpad P16Gen2 with ethernet card:
> 
> 00:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:550a] (rev 20)
> 
>> Just a side note: Booting Linux 6.9-rc5+ *with kexec* on Supermicro 
>> Super Server/X13SAE, BIOS 2.0 10/17/2022 with the network device 
>> below, it also came up only with 10 Mbps and Ethernet did not work, 
>> for example `ping`. I conjectured though, that the non-working part 
>> was due to the switch configuration not allowing 10 Mbps.
>>
>>     00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (17) I219-LM [8086:1a1c] (rev 11)
>>
> My test and result are same as yours.
> 
> Thanks.

Thank you for the confirmation.

>> I didn’t find the time to further analyze and report the issue.
>>
>> Also could this also be related to the regression reported by the 
>> kernel test robot [1]?
>>
>>     00:19.0 Ethernet controller: Intel Corporation Ethernet Connection (3) I218-V (rev 03)
>>
>>> Fixes: 861e8086029e ("e1000e: move force SMBUS from enable ulp 
>>> function to avoid PHY loss issue")
>>> Signed-off-by: Hui Wang <hui.wang@...onical.com>
>>> Acked-by: Vitaly Lifshits <vitaly.lifshits@...el.com>
>>> Tested-by: Naama Meir <naamax.meir@...ux.intel.com>
>>> Signed-off-by: Tony Nguyen <anthony.l.nguyen@...el.com>
>>> ---
>>> In the v2:
>>>   Change "this commit" to "the referred commit" in the commit header
>>>   Fix a potential infinite loop if ret_val is not zero
>>>   drivers/net/ethernet/intel/e1000e/ich8lan.c | 22 +++++++++++++++++++++
>>>   drivers/net/ethernet/intel/e1000e/netdev.c  | 18 -----------------
>>>   2 files changed, 22 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c 
>>> b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>> index f9e94be36e97..2e98a2a0bead 100644
>>> --- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>> +++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
>>> @@ -1225,6 +1225,28 @@ s32 e1000_enable_ulp_lpt_lp(struct e1000_hw 
>>> *hw, bool to_sx)
>>>       }
>>>     release:
>>> +    /* Switching PHY interface always returns MDI error
>>> +     * so disable retry mechanism to avoid wasting time
>>> +     */
>>> +    e1000e_disable_phy_retry(hw);
>>> +
>>> +    /* Force SMBus mode in PHY */
>>> +    ret_val = e1000_read_phy_reg_hv_locked(hw, CV_SMB_CTRL, &phy_reg);
>>> +    if (ret_val) {
>>> +        e1000e_enable_phy_retry(hw);
>>> +        hw->phy.ops.release(hw);
>>> +        goto out;
>>> +    }
>>> +    phy_reg |= CV_SMB_CTRL_FORCE_SMBUS;
>>> +    e1000_write_phy_reg_hv_locked(hw, CV_SMB_CTRL, phy_reg);
>>> +
>>> +    e1000e_enable_phy_retry(hw);
>>> +
>>> +    /* Force SMBus mode in MAC */
>>> +    mac_reg = er32(CTRL_EXT);
>>> +    mac_reg |= E1000_CTRL_EXT_FORCE_SMBUS;
>>> +    ew32(CTRL_EXT, mac_reg);
>>> +
>>>       hw->phy.ops.release(hw);
>>>   out:
>>>       if (ret_val)
>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> index 3692fce20195..cc8c531ec3df 100644
>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> @@ -6623,7 +6623,6 @@ static int __e1000_shutdown(struct pci_dev 
>>> *pdev, bool runtime)
>>>       struct e1000_hw *hw = &adapter->hw;
>>>       u32 ctrl, ctrl_ext, rctl, status, wufc;
>>>       int retval = 0;
>>> -    u16 smb_ctrl;
>>>         /* Runtime suspend should only enable wakeup for link changes */
>>>       if (runtime)
>>> @@ -6697,23 +6696,6 @@ static int __e1000_shutdown(struct pci_dev 
>>> *pdev, bool runtime)
>>>               if (retval)
>>>                   return retval;
>>>           }
>>> -
>>> -        /* Force SMBUS to allow WOL */
>>> -        /* Switching PHY interface always returns MDI error
>>> -         * so disable retry mechanism to avoid wasting time
>>> -         */
>>> -        e1000e_disable_phy_retry(hw);
>>> -
>>> -        e1e_rphy(hw, CV_SMB_CTRL, &smb_ctrl);
>>> -        smb_ctrl |= CV_SMB_CTRL_FORCE_SMBUS;
>>> -        e1e_wphy(hw, CV_SMB_CTRL, smb_ctrl);
>>> -
>>> -        e1000e_enable_phy_retry(hw);
>>> -
>>> -        /* Force SMBus mode in MAC */
>>> -        ctrl_ext = er32(CTRL_EXT);
>>> -        ctrl_ext |= E1000_CTRL_EXT_FORCE_SMBUS;
>>> -        ew32(CTRL_EXT, ctrl_ext);
>>>       }
>>>         /* Ensure that the appropriate bits are set in LPI_CTRL


Kind regards,

Paul


>> [1]: https://lore.kernel.org/intel-wired-lan/202405150942.f9b873b1-oliver.sang@intel.com/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ