Date:   Tue, 20 Nov 2018 23:29:42 +0100
From:   Heiner Kallweit <hkallweit1@...il.com>
To:     Paul Menzel <pmenzel@...gen.mpg.de>, Andrew Lunn <andrew@...n.ch>
Cc:     Realtek linux nic maintainers <nic_swsd@...ltek.com>,
        "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: Realtek NIC uses over 1 Watt with no traffic

On 20.11.2018 23:25, Paul Menzel wrote:
> Dear Heiner,
> 
> 
> On 20.11.18 at 22:06, Heiner Kallweit wrote:
>> On 20.11.2018 21:31, Paul Menzel wrote:
> 
> […]
> 
>>> On 20.11.18 at 21:14, Heiner Kallweit wrote:
>>>> On 20.11.2018 15:45, Andrew Lunn wrote:
>>>>> On Tue, Nov 20, 2018 at 09:40:25AM +0100, Paul Menzel wrote:
>>>
>>>>>> Using Ubuntu 18.10, Linux 4.18.0-11-generic, PowerTOP 2.9 shows that the NIC
>>>>>> uses 1.77 Watts. A network cable is plugged in, but there is no real traffic
>>>>>> according to `iftop`. Only an email program is running.
>>>>>>
>>>>>>       $ lspci -nn -s 3:00.1
>>>>>>       03:00.1 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 12)
>>>>>>
>>>>>> Is that a measurement error, or does the NIC really need that much power?
>>>
>>>>> This sounds like Energy Efficient Ethernet, EEE, is not enabled.
>>>>>
>>>>> What does ethtool --show-eee ethX say?
>>>
>>>      $ sudo ethtool --show-eee enp3s0f1
>>>      Cannot get EEE settings: Operation not supported
>>>
>>>> The r8169 driver doesn't support the get_eee ethtool_ops callback.
>>>> For certain chip versions EEE gets enabled in the PHY init, for others
>>>> it doesn't, and some don't seem to support EEE at all.
>>>>
>>>> Apart from EEE one important factor affecting power consumption is ASPM.
>>>> This was recently enabled for certain chip versions.
>>>>
>>>> Information that would help:
>>>>
>>>> whether Wake-on-LAN is enabled ("Wake-on:" line from ethtool output)
>>>
>>> ```
>>> $ sudo ethtool enp3s0f1
>>> Settings for enp3s0f1:
>>>      Supported ports: [ TP AUI BNC MII FIBRE ]
>>>      Supported link modes:   10baseT/Half 10baseT/Full
>>>                              100baseT/Half 100baseT/Full
>>>                              1000baseT/Full
>>>      Supported pause frame use: Symmetric Receive-only
>>>      Supports auto-negotiation: Yes
>>>      Supported FEC modes: Not reported
>>>      Advertised link modes:  10baseT/Half 10baseT/Full
>>>                              100baseT/Half 100baseT/Full
>>>                              1000baseT/Full
>>>      Advertised pause frame use: Symmetric Receive-only
>>>      Advertised auto-negotiation: Yes
>>>      Advertised FEC modes: Not reported
>>>      Link partner advertised link modes:  10baseT/Half 10baseT/Full
>>>                                           100baseT/Half 100baseT/Full
>>>                                           1000baseT/Full
>>>      Link partner advertised pause frame use: Symmetric
>>>      Link partner advertised auto-negotiation: Yes
>>>      Link partner advertised FEC modes: Not reported
>>>      Speed: 1000Mb/s
>>>      Duplex: Full
>>>      Port: MII
>>>      PHYAD: 0
>>>      Transceiver: internal
>>>      Auto-negotiation: on
>>>      Supports Wake-on: pumbg
>>>      Wake-on: g
>>>      Current message level: 0x00000033 (51)
>>>                     drv probe ifdown ifup
>>>      Link detected: yes
>>> ```
>>>
>>> So, it’s enabled (g  Wake on MagicPacket(tm)).
>>>
>>> Running `sudo ethtool -s enp3s0f1 wol d;` doesn’t change anything though.
>>>
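
For reading the "Wake-on:" line programmatically, here is a minimal Python sketch. The letter meanings follow the ethtool man page; the function name wake_on() and the handling of mail quote prefixes are assumptions made for illustration only.

```python
# Wake-on-LAN flag letters as documented in the ethtool man page.
WOL_FLAGS = {
    "p": "PHY activity",
    "u": "unicast messages",
    "m": "multicast messages",
    "b": "broadcast messages",
    "a": "ARP",
    "g": "MagicPacket",
    "s": "SecureOn password for MagicPacket",
    "d": "disabled",
}


def wake_on(ethtool_text):
    """Decode the active 'Wake-on:' flags from `ethtool IFACE` output.

    Hypothetical helper: leading '>' quote markers and whitespace are
    stripped so that output pasted from a mail also parses.
    """
    for line in ethtool_text.splitlines():
        line = line.lstrip("> \t")
        # "Supports Wake-on:" does not start with "Wake-on:", so it is skipped.
        if line.startswith("Wake-on:"):
            flags = line.split(":", 1)[1].strip()
            return [WOL_FLAGS.get(c, "unknown flag " + c) for c in flags]
    return []
```

Fed the output above, this should report MagicPacket as the only active wake-up source; after `wol d` it should report "disabled".
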
>>>> lspci -vv output for the Realtek NIC
>>>
>>> Here is the output (quoted, so that Thunderbird does not wrap the line).
>>>
>>>> $ sudo lspci -vv -s 3:00.1
>>>> 03:00.1 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 12)
>>>>      Subsystem: CLEVO/KAPOK Computer RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
>>>>      Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
>>>>      Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>      Latency: 0, Cache Line Size: 64 bytes
>>>>      Interrupt: pin A routed to IRQ 19
>>>>      Region 0: I/O ports at e000 [size=256]
>>>>      Region 2: Memory at df114000 (64-bit, non-prefetchable) [size=4K]
>>>>      Region 4: Memory at df110000 (64-bit, non-prefetchable) [size=16K]
>>>>      Capabilities: [40] Power Management version 3
>>>>          Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
>>>>          Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>>>>      Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
>>>>          Address: 0000000000000000  Data: 0000
>>>>      Capabilities: [70] Express (v2) Endpoint, MSI 01
>>>>          DevCap:    MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>>>>              ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 10.000W
>>>>          DevCtl:    Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>>>>              RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
>>>>              MaxPayload 128 bytes, MaxReadReq 4096 bytes
>>>>          DevSta:    CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
>>>>          LnkCap:    Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
>>>>              ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>>>>          LnkCtl:    ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
>>
>> L0s is missing here, no idea why.
> 
> Indeed. I’ll forward that to TUXEDO.
> 
>>>>              ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
>>>>          LnkSta:    Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>>>          DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message/WAKE#
>>>>          DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>>>>          LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>>>>               EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>>>      Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
>>>>          Vector table: BAR=4 offset=00000000
>>>>          PBA: BAR=4 offset=00000800
>>>>      Capabilities: [d0] Vital Product Data
>>>> pcilib: sysfs_read_vpd: read failed: Input/output error
>>>>          Not readable
>>>>      Capabilities: [100 v2] Advanced Error Reporting
>>>>          UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>          UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>>>          UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>>>          CESta:    RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ NonFatalErr+
>>>>          CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>>>>          AERCap:    First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
>>>>      Capabilities: [160 v1] Device Serial Number 01-00-00-00-68-4c-e0-00
>>>>      Capabilities: [170 v1] Latency Tolerance Reporting
>>>>          Max snoop latency: 3145728ns
>>>>          Max no snoop latency: 3145728ns
>>>>      Capabilities: [178 v1] L1 PM Substates
>>>>          L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>>>>                PortCommonModeRestoreTime=150us PortTPowerOnTime=150us
>>>>          L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>>>>                 T_CommonMode=0us LTR1.2_Threshold=0ns
>>>>          L1SubCtl2: T_PwrOn=10us
>>>>      Kernel driver in use: r8169
>>>>      Kernel modules: r8169
>>>
>>> Some Active State Power Management levels seem to be enabled.
>>>
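
To spot a gap like the missing L0s without reading the dump by eye, the LnkCap and LnkCtl lines can be compared. This is a rough Python sketch that assumes the `lspci -vv` wording shown above ("ASPM L0s L1" in LnkCap, "ASPM L1 Enabled" in LnkCtl); the function names are made up for this example.

```python
import re

# One or more ASPM state names separated by single spaces, e.g. "L0s L1".
_STATES = r"((?:L0s|L1)(?: (?:L0s|L1))*)"


def aspm_states(lspci_text):
    """Return (supported, enabled) ASPM state sets from `lspci -vv` text."""
    supported, enabled = set(), set()
    for line in lspci_text.splitlines():
        # Drop mail quote markers and indentation (an assumption for pasted text).
        line = line.lstrip("> \t")
        if line.startswith("LnkCap:"):
            m = re.search(r"ASPM " + _STATES, line)
            if m:
                supported = set(m.group(1).split())
        elif line.startswith("LnkCtl:"):
            m = re.search(r"ASPM " + _STATES + r" Enabled", line)
            if m:
                enabled = set(m.group(1).split())
    return supported, enabled


def missing_aspm(lspci_text):
    """List ASPM states the link advertises but has not enabled."""
    supported, enabled = aspm_states(lspci_text)
    return sorted(supported - enabled)
```

For the output quoted above, missing_aspm() should return only L0s. It says nothing about why the state is off (BIOS, kernel policy or driver).
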
>>>> Info from powertop about package C states. With ASPM my system reaches
>>>> 50% PC7 + 50% PC10.
>>>
>>> That seems to be the case on my TUXEDO Book BU1406 too.
>>>
>>>>            Package   |             Core    |            CPU 0       CPU 2
>>>>                      |                     | C0 active   1,7%        1,1%
>>>>                      |                     | POLL        0,0%    0,0 ms  0,0%    0,0 ms
>>>>                      |                     | C1E         0,2%    0,8 ms  0,1%    0,2 ms
>>>> C2 (pc2)    5,2%    |                     |
>>>> C3 (pc3)   82,1%    | C3 (cc3)    0,0%    | C3          0,0%    0,2 ms  0,1%    0,2 ms
>>
>> What matters are the package states, and your system only reaches pc3. The "Tunables" section
>> in powertop may provide hints on how to save more power.
> 
> Thank you for the hint. As it’s unrelated, I’ll just paste the tunables below, but will try to forward them to the right people.
> 
>     Bad           Enable Audio codec power management
>     Bad           VM writeback timeout
> 
>>>> C6 (pc6)    0,0%    | C6 (cc6)    1,3%    | C6          0,8%    0,5 ms  1,4%    0,6 ms
>>>> C7 (pc7)    0,0%    | C7 (cc7)   90,8%    | C7s         0,0%    1,6 ms  0,0%    0,0 ms
>>>> C8 (pc8)    0,0%    |                     | C8          6,0%    1,8 ms 10,1%    2,0 ms
>>>> C9 (pc9)    0,0%    |                     | C9          0,2%    2,8 ms  0,2%    2,9 ms
>>>> C10 (pc10)  0,0%    |                     | C10        88,7%   12,7 ms 84,4%   14,9 ms
>>>>
>>>>                      |             Core    |            CPU 1       CPU 3
>>>>                      |                     | C0 active   1,0%        0,8%
>>>>                      |                     | POLL        0,0%    0,0 ms  0,0%    0,0 ms
>>>>                      |                     | C1E         0,1%    0,3 ms  0,1%    0,3 ms
>>>>                      |                     |
>>>>                      | C3 (cc3)    0,0%    | C3          0,0%    0,2 ms  0,0%    0,2 ms
>>>>                      | C6 (cc6)    1,1%    | C6          0,9%    0,6 ms  0,8%    0,5 ms
>>>>                      | C7 (cc7)   92,2%    | C7s         0,0%    1,7 ms  0,0%    0,0 ms
>>>>                      |                     | C8          6,2%    1,7 ms  5,4%    1,7 ms
>>>>                      |                     | C9          0,3%    1,7 ms  0,1%    1,9 ms
>>>>                      |                     | C10        88,8%   12,1 ms 90,7%   14,8 ms
>>>>
>>>>                      |             GPU     |
>>>>                      |                     |
>>>>                      | Powered On  2,2%    |
>>>>                      | RC6        97,8%    |
>>>>                      | RC6p        0,0%    |
>>>>                      | RC6pp       0,0%    |
>>>
>>>> dmesg output filtered for "r8169". Primarily relevant is the line with
>>>> the chip name and XID.
>>>
>>> Please find them below.
>>>
>>>> $ sudo dmesg | grep r8169
>>>> [    5.318442] calling  rtl8169_pci_driver_init+0x0/0x1000 [r8169] @ 418
>>>> [    5.318470] r8169 0000:03:00.1: enabling device (0000 -> 0003)
>>>> [    5.340324] libphy: r8169: probed
>>>> [    5.340630] r8169 0000:03:00.1 eth0: RTL8411, 80:fa:5b:3b:dd:f0, XID 5c800800, IRQ 136
>>
>> Good to know. For this chip version rtl8168g_2_hw_phy_config() is used to configure the PHY,
>> but this function just loads the firmware. So we don't know whether EEE is enabled.
>>
>> What you could do to test further is limit the speed to 100 Mbit/s or 10 Mbit/s via ethtool.
>> If this reduces power consumption significantly, it's a hint that the PHY is indeed
>> to blame.
> 
> With `sudo ethtool -s enp3s0f1 speed 10 duplex full` the power usage drops to 800 mW and even to 0, so it’s much less than with 1 Gbit/s.
> 
OK, so Andrew was right and the issue seems to be the disabled EEE.
I'll put this on my agenda. Most likely, as a first step, you'll have to
use ethtool to switch on EEE; in a second step, EEE will be enabled by
default for this chip version.
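
Once the driver implements the EEE callbacks, a check along these lines could tell the states apart. It is only a sketch: the "Operation not supported" message is the one quoted earlier in this thread, the "EEE status:" line is what I'd expect ethtool to print on drivers that do support it (an assumption here), and the helper names are invented.

```python
import subprocess


def classify_eee(stdout, stderr):
    """Map `ethtool --show-eee` output to a coarse state string."""
    text = (stdout + "\n" + stderr).lower()
    if "operation not supported" in text or "eee status: not supported" in text:
        return "unsupported"
    for line in stdout.splitlines():
        line = line.strip().lower()
        # Assumed format: "EEE status: enabled - active" or "EEE status: disabled".
        if line.startswith("eee status:"):
            status = line.split(":", 1)[1].strip()
            if status.startswith("enabled"):
                return "enabled"
            if status.startswith("disabled"):
                return "disabled"
    return "unknown"


def eee_state(iface):
    """Run ethtool for one interface (needs ethtool installed, usually root)."""
    proc = subprocess.run(
        ["ethtool", "--show-eee", iface], capture_output=True, text=True
    )
    return classify_eee(proc.stdout, proc.stderr)


def enable_eee_command(iface):
    """Build (but do not run) the ethtool command that switches EEE on."""
    return ["ethtool", "--set-eee", iface, "eee", "on"]
```

With the current r8169, eee_state("enp3s0f1") should come back as "unsupported", matching the error Paul saw, and the --set-eee command would be rejected for the same reason.
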

>>>> [    5.340632] r8169 0000:03:00.1 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
>>>> [    5.340673] initcall rtl8169_pci_driver_init+0x0/0x1000 [r8169] returned 0 after 9217 usecs
>>>> [    5.799967] r8169 0000:03:00.1 enp3s0f1: renamed from eth0
>>>> [   10.036968] Generic PHY r8169-301:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=r8169-301:00, irq=IGNORE)
>>>> [  676.940934] calling  rtl8169_pci_driver_init+0x0/0x1000 [r8169] @ 22235
>>>> [  676.952411] libphy: r8169: probed
>>>> [  676.952701] r8169 0000:03:00.1 eth0: RTL8411, 80:fa:5b:3b:dd:f0, XID 5c800800, IRQ 139
>>>> [  676.952702] r8169 0000:03:00.1 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
>>>> [  676.952736] initcall rtl8169_pci_driver_init+0x0/0x1000 [r8169] returned 0 after 11518 usecs
>>>> [  676.954420] r8169 0000:03:00.1 enp3s0f1: renamed from eth0
>>>> [  676.975161] Generic PHY r8169-301:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=r8169-301:00, irq=IGNORE)
>>>> [  680.518923] r8169 0000:03:00.1 enp3s0f1: Link is Up - 1Gbps/Full - flow control rx/tx
>>>> [ 1751.285899] r8169 0000:03:00.1: invalid short VPD tag 00 at offset 1
> 
> 
> Kind regards,
> 
> Paul
> 
