Message-Id: <20241029034657.6937-1-jinjian.song@fibocom.com>
Date: Tue, 29 Oct 2024 11:46:57 +0800
From: Jinjian Song <jinjian.song@...ocom.com>
To: ryazanov.s.a@...il.com,
	Jinjian Song <jinjian.song@...ocom.com>,
	chandrashekar.devegowda@...el.com,
	chiranjeevi.rapolu@...ux.intel.com,
	haijun.liu@...iatek.com,
	m.chetan.kumar@...ux.intel.com,
	ricardo.martinez@...ux.intel.com,
	loic.poulain@...aro.org,
	johannes@...solutions.net,
	davem@...emloft.net,
	edumazet@...gle.com,
	kuba@...nel.org,
	pabeni@...hat.com
Cc: angelogioacchino.delregno@...labora.com,
	corbet@....net,
	danielwinkler@...gle.com,
	helgaas@...nel.org,
	korneld@...gle.com,
	linux-arm-kernel@...ts.infradead.org,
	linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	linux-mediatek@...ts.infradead.org,
	matthias.bgg@...il.com,
	netdev@...r.kernel.org
Subject: Re: [net-next v2] net: wwan: t7xx: reset device if suspend fails
From: Sergey Ryazanov <ryazanov.s.a@...il.com>
>Hello Jinjian,
>
>On 22.10.2024 11:43, Jinjian Song wrote:
>> If driver fails to set the device to suspend, it means that the
>> device is abnormal. In this case, reset the device to recover
>> when PCIe device is offline.
>
>Is it a reproducible or a speculative issue? Does the fix recover modem 
>from a problematic state?
>
>Anyway we need someone more familiar with this hardware (Intel or 
>MediaTek engineer) to Ack the change to make sure we are not going to 
>put a system in a more complicated state.
Hi Sergey,
This issue is very difficult to reproduce now that the cases where it
occurred have been fixed. It occurred when the driver and the device
lost their connection. I have encountered this problem twice so far:
1. During a suspend/resume stress test, an intermittent D3L2 timing
sequence issue in the BIOS brought the PCIe link down. The driver's
reads and writes of the device registers became invalid, so suspend
failed. This issue was eventually fixed in the BIOS, and after
reproducing the problem I was able to recover by resetting the module.
2. During an idle test, the modem intermittently hung, which brought
the PCIe link down. The driver's reads and writes of the device
registers became invalid, so suspend failed. This issue was eventually
fixed in the modem firmware by adjusting a power supply voltage. As a
workaround, the modem was reset to recover when MBIM port commands
timed out in userspace applications.
Recovering the modem with a hardware reset was discussed with MTK, and
they said that if we don't need to preserve the failure scene for
debugging when suspend fails, we can use the recovery solution.
Both occurrences resulted in a PCIe link problem in which the driver
could not read or write the registers of the WWAN device, so I want to
add this recovery path. A hardware reset can recover the modem, but
using pci_channel_offline() as the criterion is my own inference.
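To illustrate the intended behavior, here is a minimal user-space
sketch of the recovery decision. It is only a model under stated
assumptions: mock_modem, mock_reset_device() and mock_pm_suspend() are
hypothetical stand-ins, not t7xx driver code. In the patch itself the
check is pci_channel_offline(pdev) and the reset is
t7xx_reset_device(t7xx_dev, PLDR). The sketch also assumes the link
comes back after a reset, which real hardware may not guarantee.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical user-space stand-ins; none of these are kernel or t7xx APIs. */
enum mock_channel_state { MOCK_CHANNEL_NORMAL, MOCK_CHANNEL_OFFLINE };

struct mock_modem {
	enum mock_channel_state channel; /* models what pci_channel_offline() would report */
	bool suspend_ok;                 /* models whether the suspend handshake succeeds */
	int reset_count;                 /* number of hardware resets requested */
};

/* Models a hardware reset: counts it and assumes the link recovers. */
static void mock_reset_device(struct mock_modem *m)
{
	m->reset_count++;
	m->channel = MOCK_CHANNEL_NORMAL;
}

/*
 * Returns 0 on success and -1 when suspend fails. A reset is requested
 * only when suspend fails AND the channel is offline, so an ordinary
 * suspend failure with a healthy link leaves the device untouched.
 */
static int mock_pm_suspend(struct mock_modem *m)
{
	if (m->suspend_ok)
		return 0;

	if (m->channel == MOCK_CHANNEL_OFFLINE) {
		fprintf(stderr, "Device offline, reset to recover\n");
		mock_reset_device(m);
	}
	return -1;
}
```

The point of gating on the offline state is that a suspend failure with
a working link does not trigger a reset, so the failure scene is kept
for debugging in that case.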
Thanks.
>> Signed-off-by: Jinjian Song <jinjian.song@...ocom.com>
>> ---
>> V2:
>>   * Add judgment, reset when device is offline
>> ---
>>   drivers/net/wwan/t7xx/t7xx_pci.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>> 
>> diff --git a/drivers/net/wwan/t7xx/t7xx_pci.c b/drivers/net/wwan/t7xx/t7xx_pci.c
>> index e556e5bd49ab..4f89a353588b 100644
>> --- a/drivers/net/wwan/t7xx/t7xx_pci.c
>> +++ b/drivers/net/wwan/t7xx/t7xx_pci.c
>> @@ -427,6 +427,10 @@ static int __t7xx_pci_pm_suspend(struct pci_dev *pdev)
>>   	iowrite32(T7XX_L1_BIT(0), IREG_BASE(t7xx_dev) + ENABLE_ASPM_LOWPWR);
>>   	atomic_set(&t7xx_dev->md_pm_state, MTK_PM_RESUMED);
>>   	t7xx_pcie_mac_set_int(t7xx_dev, SAP_RGU_INT);
>> +	if (pci_channel_offline(pdev)) {
>> +		dev_err(&pdev->dev, "Device offline, reset to recover\n");
>> +		t7xx_reset_device(t7xx_dev, PLDR);
>> +	}
>>   	return ret;
>>   }
>
>--
>Sergey
>
Best Regards,
Jinjian.