Message-ID: <d4e9e118-b4c0-4917-b9f0-39ac52229d30@gmail.com>
Date: Mon, 13 May 2024 01:27:24 +0100
From: Ken Milmore <ken.milmore@...il.com>
To: Heiner Kallweit <hkallweit1@...il.com>, netdev@...r.kernel.org
Cc: nic_swsd@...ltek.com
Subject: Re: r8169: transmit queue timeouts and IRQ masking

On 12/05/2024 23:08, Heiner Kallweit wrote:
> On 12.05.2024 21:49, Ken Milmore wrote:
>>
>> I had started out with the assumption that an interrupt acknowledgement coinciding with some part of the work being done in rtl8169_poll() might be the cause of the problem.
>> So it seemed natural to try guarding the whole block by disabling interrupts at the beginning.
>> But this seems to work just as well:
>>
>> diff --git linux-source-6.1~/drivers/net/ethernet/realtek/r8169_main.c linux-source-6.1/drivers/net/ethernet/realtek/r8169_main.c
>> index 6e34177..353ce99 100644
>> --- linux-source-6.1~/drivers/net/ethernet/realtek/r8169_main.c
>> +++ linux-source-6.1/drivers/net/ethernet/realtek/r8169_main.c
>> @@ -4659,8 +4659,10 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
>>  
>>  	work_done = rtl_rx(dev, tp, budget);
>>  
>> -	if (work_done < budget && napi_complete_done(napi, work_done))
>> +	if (work_done < budget && napi_complete_done(napi, work_done)) {
>> +		rtl_irq_disable(tp);
>>  		rtl_irq_enable(tp);
>> +	}
>>  
>>  	return work_done;
>>  }
>>
>> On this basis, I assume the problem may actually involve some subtlety with the behaviour of the interrupt mask and status registers.
>>
> In the register dump in your original report the interrupt mask is set.
> So it seems rtl_irq_enable() was executed. I don't have an explanation
> why a previous rtl_irq_disable() makes a difference.
> Interesting would be whether it has to be a write to the interrupt mask
> register, or whether a write to any register is sufficient.
> 

In place of calling rtl_irq_disable(), I tried poking at the doorbell and at some of the unused timer registers. These had no effect.

I tried writing various different values to the mask register:

RTL_W32(tp, IntrMask_8125, 0x00); // worked, naturally
RTL_W32(tp, IntrMask_8125, 0x3f); // no effect
RTL_W32(tp, IntrMask_8125, 0x3b); // no effect
RTL_W32(tp, IntrMask_8125, 0x3a); // worked!

So masking both TxOK and RxOK before unmasking seemed to work (0x3a clears bits 0 and 2, i.e. RxOK and TxOK), but masking either of them individually didn't.

Also, masking just TxOK and then just RxOK in sequence, or vice versa, didn't seem to work; they both had to be masked together in the same write.
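
For anyone decoding the values above, here is a minimal standalone sketch (not driver code) that works out which interrupt sources each test value leaves masked. The bit names follow the interrupt status/mask bit definitions in r8169_main.c; FULL_MASK, masked_sources() and masks_both_ok() are hypothetical names for illustration only, and 0x3f being the "everything enabled" value is an assumption based on the values tried above.

```c
#include <stdint.h>

/* Low six interrupt bits, named as in r8169_main.c. */
enum {
	RxOK       = 0x0001,
	RxErr      = 0x0002,
	TxOK       = 0x0004,
	TxErr      = 0x0008,
	RxOverflow = 0x0010,
	LinkChg    = 0x0020,
};

/* Assumption: 0x3f is the "all six sources enabled" value in these tests. */
#define FULL_MASK 0x3fu

/* Sources left masked (disabled) when `val` is written to IntrMask_8125. */
static uint32_t masked_sources(uint32_t val)
{
	return FULL_MASK & ~val;
}

/* Non-zero if `val` masks RxOK and TxOK together in the same write. */
static int masks_both_ok(uint32_t val)
{
	uint32_t both = RxOK | TxOK;

	return (masked_sources(val) & both) == both;
}
```

With these helpers, masked_sources(0x3b) gives TxOK alone and masked_sources(0x3a) gives RxOK | TxOK, while masks_both_ok() is true only for 0x00 and 0x3a among the four values tried, which matches the two writes that worked.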

YMMV! :-)
