linux-kernel - Re: [PATCH V4 2/5] mailbox: Add support for QTI CPUCP mailbox controller

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <d113adc9-2192-44af-a5df-7bbaa6907ac8@quicinc.com>
Date: Tue, 14 May 2024 16:49:06 +0530
From: Sibi Sankar <quic_sibis@...cinc.com>
To: Konrad Dybcio <konrad.dybcio@...aro.org>, <sudeep.holla@....com>,
        <cristian.marussi@....com>
CC: <linux-kernel@...r.kernel.org>, <linux-arm-msm@...r.kernel.org>,
        <devicetree@...r.kernel.org>, <quic_rgottimu@...cinc.com>,
        <quic_kshivnan@...cinc.com>
Subject: Re: [PATCH V4 2/5] mailbox: Add support for QTI CPUCP mailbox
 controller



On 4/23/24 22:40, Sibi Sankar wrote:
> 
> 
> On 4/23/24 04:47, Konrad Dybcio wrote:
>>
>>
>> On 4/22/24 18:40, Sibi Sankar wrote:
>>> Add support for CPUSS Control Processor (CPUCP) mailbox controller,
>>> this driver enables communication between AP and CPUCP by acting as
>>> a doorbell between them.
>>>
>>> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
>>> Signed-off-by: Sibi Sankar <quic_sibis@...cinc.com>
>>> ---
>>
>> [...]
>>
>>> +
>>> +static int qcom_cpucp_mbox_send_data(struct mbox_chan *chan, void 
>>> *data)
>>> +{
>>> +    struct qcom_cpucp_mbox *cpucp = container_of(chan->mbox, struct 
>>> qcom_cpucp_mbox, mbox);
>>> +    unsigned long chan_id = channel_number(chan);
>>> +    u32 *val = data;
>>> +
>>> +    writel(*val, cpucp->tx_base + APSS_CPUCP_TX_MBOX_CMD(chan_id) + 
>>> APSS_CPUCP_MBOX_CMD_OFF);
>>
> 
> Hey Konrad,
> 
> Thanks for taking time to review the series.
> 
>> Just checking in, is *this access only* supposed to be 32b instead of 
>> 64 like others?
> 
> yeah, the readl and writely in the driver were used intentionally.
> 
>>
>> [...]
>>
>>> +
>>> +    writeq(0, cpucp->rx_base + APSS_CPUCP_RX_MBOX_EN);
>>> +    writeq(0, cpucp->rx_base + APSS_CPUCP_RX_MBOX_CLEAR);
>>> +    writeq(0, cpucp->rx_base + APSS_CPUCP_RX_MBOX_MAP);
>>
>> If these writes are here to prevent a possible interrupt storm type 
>> tragedy,
>> you need to read back these registers to ensure the writes have left 
>> the CPU
>> complex and reached the observer at the other end of the bus (not to be
>> confused with barriers which only ensure that such accesses are ordered
>> *when still possibly within the CPU complex*).
> 
> I couldn't find anything alluding to ^^. This sequence was just
> meant to reset the mailbox. Looks like we do need to preserve the
> ordering so relaxed read/writes aren't an option.
> 
> -Sibi
> 
>>
>> Moreover, if the order of them arriving (en/clear/mask) doesn't 
>> matter, you
>> can add _relaxed for a possible nanosecond-order perf gain
>>
>>> +
>>> +    irq = platform_get_irq(pdev, 0);
>>> +    if (irq < 0)
>>> +        return irq;
>>> +
>>> +    ret = devm_request_irq(dev, irq, qcom_cpucp_mbox_irq_fn,
>>> +                   IRQF_TRIGGER_HIGH, "apss_cpucp_mbox", cpucp);
>>> +    if (ret < 0)
>>> +        return dev_err_probe(dev, ret, "Failed to register irq: 
>>> %d\n", irq);
>>> +
>>> +    writeq(APSS_CPUCP_RX_MBOX_CMD_MASK, cpucp->rx_base + 
>>> APSS_CPUCP_RX_MBOX_MAP);
>>
>> Similarly here, unless read back, we may potentially miss some 
>> interrupts if
>> e.g. a channel is opened and that write "is decided" (by the silicon) 
>> to leave
>> the internal buffer first
> 
> At this point in time we don't expect any interrupts. They are expected
> only after channel activation. Also there were no recommendations for
> reading it back here as well.
> 
> -Sibi
> 
>>
>>
>>> +
>>> +    mbox = &cpucp->mbox;
>>> +    mbox->dev = dev;
>>> +    mbox->num_chans = APSS_CPUCP_IPC_CHAN_SUPPORTED;
>>> +    mbox->chans = cpucp->chans;
>>> +    mbox->ops = &qcom_cpucp_mbox_chan_ops;
>>> +    mbox->txdone_irq = false;
>>> +    mbox->txdone_poll = false;
>>
>> "false" == 0 is the default value (as you're using k*z*alloc)
>>
>>
>>> +
>>> +    ret = devm_mbox_controller_register(dev, mbox);
>>> +    if (ret)
>>> +        return dev_err_probe(dev, ret, "Failed to create mailbox\n");
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static const struct of_device_id qcom_cpucp_mbox_of_match[] = {
>>> +    { .compatible = "qcom,x1e80100-cpucp-mbox" },
>>> +    {}
>>> +};
>>> +MODULE_DEVICE_TABLE(of, qcom_cpucp_mbox_of_match);
>>> +
>>> +static struct platform_driver qcom_cpucp_mbox_driver = {
>>> +    .probe = qcom_cpucp_mbox_probe,
>>> +    .driver = {
>>> +        .name = "qcom_cpucp_mbox",
>>> +        .of_match_table = qcom_cpucp_mbox_of_match,
>>> +    },
>>> +};
>>> +module_platform_driver(qcom_cpucp_mbox_driver);
>>
>> That's turbo late. Go core_initcall.

Christian/Sudeep,

Looks like making the cpucp mbox as part of the core initcall and having
the vendor protocol as a module_scmi_driver causes a race as follows:

scmi_core: SCMI protocol bus registered
scmi_core: Requesting SCMI device (clocks) for protocol 14
scmi_core: Registered new scmi driver scmi-clocks
scmi_core: Requesting SCMI device (qcom_scmi_vendor_protocol) for 
protocol 80
scmi_core: Registered new scmi driver qcom-scmi-driver
scmi_core: Requesting SCMI device (perf) for protocol 13
scmi_core: Registered new scmi driver scmi-perf-domain
scmi_core: Requesting SCMI device (genpd) for protocol 11
scmi_core: Registered new scmi driver scmi-power-domain
scmi_core: Requesting SCMI device (reset) for protocol 16
scmi_core: Registered new scmi driver scmi-reset
scmi_core: Requesting SCMI device (hwmon) for protocol 15
scmi_core: Registered new scmi driver scmi-hwmon
scmi_core: Requesting SCMI device (cpufreq) for protocol 13
scmi_core: Registered new scmi driver scmi-cpufreq
scmi_module: Registered SCMI Protocol 0x10
scmi_module: Registered SCMI Protocol 0x14
scmi_module: Registered SCMI Protocol 0x13
scmi_module: Registered SCMI Protocol 0x11
scmi_module: Registered SCMI Protocol 0x16
scmi_module: Registered SCMI Protocol 0x15
scmi_module: Registered SCMI Protocol 0x17
scmi_module: Registered SCMI Protocol 0x12
scmi_module: Registered SCMI Protocol 0x18
scmi_module: Registered SCMI Protocol 0x19
scmi_core: (scmi) Created SCMI device 'scmi_dev.1' for protocol 0x10 
(__scmi_transport_device_tx_10)
scmi_core: (scmi) Created SCMI device 'scmi_dev.2' for protocol 0x10 
(__scmi_transport_device_rx_10)
arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
scmi_module: Found SCMI Protocol 0x10
arm-scmi firmware:scmi: SCMI Protocol v2.0 'Qualcomm:' Firmware version 
0x20000
scmi_module: Found SCMI Protocol 0x13
scmi_core: (scmi) Created SCMI device 'scmi_dev.3' for protocol 0x13 
(cpufreq)
scmi-perf-domain scmi_dev.4: Initialized 3 performance domains
scmi_core: (scmi) Created SCMI device 'scmi_dev.4' for protocol 0x13 (perf)
scmi_module: SCMI Protocol 0x80 not found!
scmi_core: (scmi) Created SCMI device 'scmi_dev.5' for protocol 0x80 
(qcom_scmi_vendor_protocol)
scmi_module: Registered SCMI Protocol 0x80

By the time the vendor protocol get's registered it's already reported
as not found.

-Sibi

>>
>> Konrad
>