[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d7593559-060e-e8e6-5746-6dc63c487027@huawei.com>
Date: Tue, 22 Nov 2022 11:42:09 +0800
From: "lihuisong (C)" <lihuisong@...wei.com>
To: Robbie King <robbiek@...ghtlabs.com>,
Sudeep Holla <sudeep.holla@....com>
CC: <linux-acpi@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<rafael@...nel.org>, <rafael.j.wysocki@...el.com>,
<wanghuiqiang@...wei.com>, <huangdaode@...wei.com>,
<tanxiaofei@...wei.com>
Subject: Re: [RFC] ACPI: PCC: Support shared interrupt for multiple subspaces
在 2022/11/22 0:59, Robbie King 写道:
> On 11/19/2022 2:32 AM, lihuisong (C) wrote:
>> 在 2022/11/18 2:09, Robbie King 写道:
>>> On 11/7/2022 1:24 AM, lihuisong (C) wrote:
>>>> 在 2022/11/4 23:39, Robbie King 写道:
>>>>> On 11/4/2022 11:15 AM, Sudeep Holla wrote:
>>>>>> On Fri, Nov 04, 2022 at 11:04:22AM -0400, Robbie King wrote:
>>>>>>> Hello Huisong, your raising of the shared interrupt issue is very timely, I
>>>>>>> am working to implement "Extended PCC subspaces (types 3 and 4)" using PCC
>>>>>>> on the ARM RDN2 reference platform as a proof of concept, and encountered
>>>>>>> this issue as well. FWIW, I am currently testing using Sudeep's patch with
>>>>>>> the "chan_in_use" flag removed, and so far have not encountered any issues.
>>>>>>>
>>>>>> Interesting, do you mean the patch I post in this thread but without the
>>>>>> whole chan_in_use flag ?
>>>>> That's right, diff I'm running with is attached to end of message.
>>>> Hello Robbie, In multiple subspaces scenario, there is a problem
>>>> that OS doesn't know which channel should respond to the interrupt
>>>> if no this chan_in_use flag. If you have not not encountered any
>>>> issues in this case, it may be related to your register settings.
>>>>
>>> Hi Huisong, apologies, I see your point now concerning multiple subspaces.
>>>
>>> I have started stress testing where I continuously generate both requests
>>> and notifications as quickly as possible, and unfortunately found an issue
>>> even with the original chan_in_use patch. I first had to modify the patch
>>> to get the type 4 channel notifications to function at all, essentially
>>> ignoring the chan_in_use flag for that channel. With that change, I still
>>> hit my original stress issue, where the pcc_mbox_irq function did not
>>> correctly ignore an interrupt for the type 3 channel.
>>>
>>> The issue occurs when a request from AP to SCP over the type 3 channel is
>>> outstanding, and simultaneously the SCP initiates a notification over the
>>> type 4 channel. Since the two channels share an interrupt, both handlers
>>> are invoked.
>>>
>>> I've tried to draw out the state of the channel status "free" bits along
>>> with the AP and SCP function calls involved.
>>>
>>> type 3
>>> ------
>>>
>>> (1)pcc.c:pcc_send_data()
>>> | (5) mailbox.c:mbox_chan_receive_data()
>>> _______v (4)pcc.c:pcc_mbox_irq()
>>> free \_________________________________________
>>>
>>> ^
>>> type 4 ^
>>> ------ ^
>>> _____________________
>>> free \_____________________________
>>> ^ ^
>>> | |
>>> (2)mod_smt.c:smt_transmit() |
>>> |
>>> (3)mod_mhu2.c:raise_interrupt()
>>>
>>> The sequence of events are:
>>>
>>> 1) OS initiates request to SCP by clearing FREE in status and ringing SCP doorbell
>>> 2) SCP initiates notification by filling shared memory and clearing FREE in status
>>> 3) SCP notifies OS by ringing OS doorbell
>>> 4) OS first invokes interrupt handler for type 3 channel
>>>
>>> At this step, the issue is that "val" from reading status (i.e. CommandCompleteCheck)
>>> is zero (SCP has not responded yet) so the code below falls through and continues
>>> to processes the interrupt as if the request has been acknowledged by the SCP.
>>>
>>> if (val) { /* Ensure GAS exists and value is non-zero */
>>> val &= pchan->cmd_complete.status_mask;
>>> if (!val)
>>> return IRQ_NONE;
>>> }
>>>
>>> The chan_in_use flag does not address this because the channel is indeed in use.
>>>
>>> 5) ACPI:PCC client kernel module is incorrectly notified that response data is
>>> available
>> Indeed, chan_in_use flag is invalid for type4.
> Thinking about this some more, I believe there is a need for the chan_in_use flag
> for the type 4 channels. If there are multiple SCP to AP channels sharing an
> interrupt, and the PCC client code chooses to acknowledge the transfer from
> process level (i.e. call mbox_send outside of the mbox_chan_received_data callback),
> then I believe a window exists where the callback could be invoked twice for the
> same SCP to AP channel. I've attached a diff.
I don't understand your concern. type4 need to set command complete bit and
ring doorbell after processing mbox_chan_received_data callback. I think it
is ok without chan_in_use flag.
Here's what I think:
For type4, set the command complete bit to 1 by default in platform.
Clear the command complete when do platform notification.
If a given channel detects the command complete bit is 0, it should respond
the interrupt, otherwise there is nothing to do.
I put all points we discussed into the RFC V2 patch. Let's go to V2 to
discuss.
>
>>> I added the following fix (applied on top of Sudeep's original patch for clarity)
>>> for the issue above which solved the stress test issue. I've changed the interrupt
>>> handler to explicitly verify that the status value matches the mask for type 3
>>> interrupts before acknowledging them. Conversely, a type 4 channel verifies that
>>> the status value does *not* match the mask, since in this case we are functioning
>>> as the recipient, not the initiator.
>>>
>>> One concern is that since this fundamentally changes handling of the channel status,
>>> that existing platforms could be impacted.
>> [snip]
>>> + /*
>>> + * When we control data flow on the channel, we expect
>>> + * to see the mask bit(s) set by the remote to indicate
>>> + * the presence of a valid response. When we do not
>>> + * control the flow (i.e. type 4) the opposite is true.
>>> + */
>>> + if (pchan->is_controller)
>>> + cmp = pchan->cmd_complete.status_mask;
>>> + else
>>> + cmp = 0;
>>> +
>>> + val &= pchan->cmd_complete.status_mask;
>>> + if (cmp != val)
>>> + return IRQ_NONE;
>>>
>> We don't have to use the pchan->cmd_complete.status_mask as above.
>>
>> For the communication from AP to SCP, if this channel is in use, command
>> complete bit is 1 indicates that the command being executed has been
>> completed.
>> For the communication from SCP to AP, if this channel is in use, command
>> complete bit is 0 indicates that the bit has been cleared and OS should
>> response the interrupt.
>>
>> So you concern should be gone if we do as following approach:
>> "
>> val &= pchan->cmd_complete.status_mask;
>> need_rsp_irq = pchan->is_controller ? val != 0 : val == 0;
>> if (!need_rsp_irq)
>> return IRQ_NONE
>> "
>>
>> This may depend on the default value of the command complete register
>> for each channel(must be 1, means that the channel is free for use).
>> It is ok for type3 because of chan_in_use flag, while something needs
>> to do in platform or OS for type4.
>>> ret = pcc_chan_reg_read(&pchan->error, &val);
>>> if (ret)
>>> @@ -704,6 +717,9 @@ static int pcc_mbox_probe(struct platform_device *pdev)
>>> pcc_mbox_channels[i].con_priv = pchan;
>>> pchan->chan.mchan = &pcc_mbox_channels[i];
>>>
>>> + pchan->is_controller =
>>> + (pcct_entry->type != ACPI_PCCT_TYPE_EXT_PCC_SLAVE_SUBSPACE);
>>> +
>> This definition does not apply to all types because type1 and type2
>> support bidirectional communication.
>>
>>> if (pcct_entry->type == ACPI_PCCT_TYPE_EXT_PCC_SLAVE_SUBSPACE &&
>>> !pcc_mbox_ctrl->txdone_irq) {
>>> pr_err("Plaform Interrupt flag must be set to 1");
>>>
>> I put all points we discussed into the following modifcation.
>> Robbie, can you try it again for type 4 and see what else needs to be
>> done?
>>
>> Regards,
>> Huisong
>>
> Thanks Huisong, I ran my current stress test scenario against your diff
> with no issues (I did have to manually merge due to a tabs to spaces issue
> which may be totally on my end, still investigating).
>
> Here is the proposed change to support chan_in_use for type 4 (which I've
> also successfully tested with). I think I have solved the tabs to spaces
> issue for my sent messages, apologies if that's not the case.
>
> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> index 057e00ee83c8..d4fcc913a9a8 100644
> --- a/drivers/mailbox/pcc.c
> +++ b/drivers/mailbox/pcc.c
> @@ -292,7 +292,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> int ret;
>
> pchan = chan->con_priv;
> - if (pchan->mesg_dir == PCC_ONLY_AP_TO_SCP && !pchan->chan_in_use)
> + if (!pchan->chan_in_use)
> return IRQ_NONE;
>
> ret = pcc_chan_reg_read(&pchan->cmd_complete, &val);
> @@ -320,8 +320,16 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> goto out;
> }
>
> + /*
> + * Clearing in_use before RX callback allows calls to mbox_send
> + * (which sets in_use) within the callback so SCP to AP channels
> + * can acknowledge transfer within IRQ context
> + */
> + if (pchan->cmd_complete.gas)
> + pchan->chan_in_use = false;
> +
> mbox_chan_received_data(chan, NULL);
> - rc = IRQ_HANDLED;
> + return IRQ_HANDLED;
>
> out:
> if (pchan->cmd_complete.gas)
> @@ -772,6 +780,8 @@ static int pcc_mbox_probe(struct platform_device *pdev)
> goto err;
> }
> pcc_set_chan_mesg_dir(pchan, pcct_entry->type);
> + if (pchan->mesg_dir == PCC_ONLY_SCP_TO_AP)
> + pchan->chan_in_use = true;
>
> if (pcc_mbox_ctrl->txdone_irq) {
> rc = pcc_parse_subspace_irq(pchan, pcct_entry);
>
> .
Powered by blists - more mailing lists