linux-kernel - Re: [PATCH v5 09/13] mailbox: Add Gunyah message queue mailbox

Message-ID: <a4a8557e-3fe7-356c-9434-01263f6d9771@linaro.org>
Date:   Mon, 17 Oct 2022 11:43:41 +0300
From:   Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
To:     Elliot Berman <quic_eberman@...cinc.com>,
        Bjorn Andersson <quic_bjorande@...cinc.com>,
        Jassi Brar <jassisinghbrar@...il.com>
Cc:     Murali Nalajala <quic_mnalajal@...cinc.com>,
        Trilok Soni <quic_tsoni@...cinc.com>,
        Srivatsa Vaddagiri <quic_svaddagi@...cinc.com>,
        Carl van Schaik <quic_cvanscha@...cinc.com>,
        Prakruthi Deepak Heragu <quic_pheragu@...cinc.com>,
        Andy Gross <agross@...nel.org>,
        linux-arm-kernel@...ts.infradead.org,
        Mark Rutland <mark.rutland@....com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Sudeep Holla <sudeep.holla@....com>,
        Marc Zyngier <maz@...nel.org>,
        Rob Herring <robh+dt@...nel.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Jonathan Corbet <corbet@....net>,
        Will Deacon <will@...nel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Arnd Bergmann <arnd@...db.de>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        devicetree@...r.kernel.org, linux-doc@...r.kernel.org,
        linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 09/13] mailbox: Add Gunyah message queue mailbox

On 14/10/2022 01:32, Elliot Berman wrote:
> 
> 
> On 10/12/2022 2:47 PM, Dmitry Baryshkov wrote:
>> On 11/10/2022 03:08, Elliot Berman wrote:
>>> +
>>> +static irqreturn_t gh_msgq_tx_irq_handler(int irq, void *data)
>>> +{
>>> +    struct gunyah_msgq *msgq = data;
>>> +
>>> +    mbox_chan_txdone(gunyah_msgq_chan(msgq), 0);
>>> +
>>> +    return IRQ_HANDLED;
>>> +}
>>> +
>>> +static void gh_msgq_txdone_tasklet(unsigned long data)
>>> +{
>>> +    struct gunyah_msgq *msgq = (struct gunyah_msgq *)data;
>>> +
>>> +    mbox_chan_txdone(gunyah_msgq_chan(msgq), msgq->last_status);
>>
>> I don't quite get this. Why do you need both an IRQ and a tasklet?
>>
> 
> I've now tweaked the code comments as well to explain a bit better.
> 
> Gunyah tells us in the hypercall itself whether the message queue is 
> full. Once the message queue is full, Gunyah will let us know via the 
> tx_irq when the reader starts draining the queue and we can start 
> adding more messages.
> 
> One point to note: the last message to be sent into the message queue 
> that makes the queue full can be detected. The hypercall reports that 
> the message was sent (GH_ERROR_OK) and the "ready" return value is 
> false. In its current form, the msgq mailbox driver should never make a 
> send hypercall and get GH_ERROR_MSGQUEUE_FULL because the driver 
> properly tracks when the message queue is full.
> 
> When mailbox driver reports txdone, the implication is that more 
> messages can be sent (not just that the message was transmitted). In 
> typical operation, the msgq mailbox driver can immediately report that 
> the message was sent and no tx_irq happens because the hypercall returns 
> GH_ERROR_OK and ready=true. The mailbox framework doesn't allow txdone 
> directly from the send_data callback. To work around that, Jassi 
> recommended we use tasklet [1]. In the "atypical" case where message 
> queue becomes full, we get GH_ERROR_OK and ready=false. In that case, we 
> don't report txdone right away with the tasklet and instead wait for the 
> tx_irq to know when more messages can be sent.

Can we please get this information, in some form, into the comments in 
the source file?
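
To make the two completion paths described above concrete, here is a minimal 
userspace sketch of the decision, not the driver code itself. Everything 
prefixed with model_ and the txdone_action enum are hypothetical names 
invented for illustration, and the numeric values given to the GH_ERROR_* 
codes are placeholders. Only the behaviour (GH_ERROR_OK with ready=true 
means txdone via the tasklet, GH_ERROR_OK with ready=false means wait for 
tx_irq) is taken from the explanation quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values; only the names come from the thread. */
enum gh_error { GH_ERROR_OK = 0, GH_ERROR_MSGQUEUE_FULL = 1 };

/* How txdone gets reported for the message that was just sent. */
enum txdone_action {
	TXDONE_VIA_TASKLET,	/* queue still has room: report from the tasklet */
	TXDONE_VIA_TX_IRQ,	/* queue just became full: wait for tx_irq */
};

/* Userspace model of a message queue; the real one lives in the hypervisor. */
struct model_msgq {
	unsigned int depth;	/* capacity in messages */
	unsigned int used;	/* messages currently queued */
};

/* Model of the send hypercall: *ready says whether more can be sent. */
static enum gh_error model_msgq_send(struct model_msgq *q, bool *ready)
{
	if (q->used == q->depth)
		return GH_ERROR_MSGQUEUE_FULL;
	q->used++;
	*ready = q->used < q->depth;
	return GH_ERROR_OK;
}

/* Model of the send_data step: send once, then pick the txdone path. */
static enum txdone_action model_send_data(struct model_msgq *q)
{
	bool ready = false;
	enum gh_error err = model_msgq_send(q, &ready);

	/* The sender tracks fullness, so it never sends into a full queue. */
	assert(err == GH_ERROR_OK);

	return ready ? TXDONE_VIA_TASKLET : TXDONE_VIA_TX_IRQ;
}

/* Model of the reader taking one message, the event that precedes tx_irq. */
static void model_msgq_drain_one(struct model_msgq *q)
{
	assert(q->used > 0);
	q->used--;
}
```

With a queue of depth 2, the first send takes the tasklet path and the 
second, which fills the queue, takes the tx_irq path. A comment block 
along these lines next to the two handlers would cover the request.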

> 
> [1]: Tasklet works because send_data is called from mailbox framework 
> with interrupts disabled. Once interrupts are re-enabled, the txdone is 
> allowed to happen which is also when tasklet runs.
> 
>>> +
>>> +    /**
>>> +     * EAGAIN: message didn't send.
>>> +     * ret = 1: message sent, but now the message queue is full and 
>>> we can't send any more msgs.
>>> +     * Either way, don't report that this message is done.
>>> +     */
>>> +    if (ret == -EAGAIN || ret == 1)
>>> +        return ret;
>>
>> '1' doesn't seem to be a valid return code for _send_data.
>>
>> Also it would be logical to return any error here, not just -EAGAIN.
>>
> 
> 
> If I return error to mailbox framework, then the message is stuck: 
> clients don't know that there was some underlying transport failure. It 
> would be retried if the client sends another message, but there is no 
> guarantee either that retrying later would work (what would have 
> changed?) or that the client would send another message to trigger a retry. 
> If the message is malformed or message queue not correctly set up, 
> client would never know. Client should be told that the message wasn't 
> sent.

I see. msg_submit() doesn't propagate the error.
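
For readers following along, the effect can be shown with a small 
userspace model. This is a simplified sketch under assumptions, not the 
mailbox core: struct model_chan, model_msg_submit() and the two 
model_send_* callbacks are hypothetical names, and the only behaviour 
modelled is the one stated above, namely that the submit step returns 
nothing and a failing send_data leaves the message queued without 
telling the client.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily reduced channel: one pending slot, one in flight. */
struct model_chan {
	int (*send_data)(struct model_chan *chan, void *data);
	void *pending;		/* queued message, NULL if none */
	void *active_req;	/* message in flight, NULL if none */
};

/* Model of the submit step: returns void, so an error has nowhere to go. */
static void model_msg_submit(struct model_chan *chan)
{
	int err;

	assert(chan->send_data != NULL);

	if (!chan->pending || chan->active_req)
		return;

	err = chan->send_data(chan, chan->pending);
	if (!err) {
		chan->active_req = chan->pending;
		chan->pending = NULL;
	}
	/* On error the message stays in pending; err is dropped here. */
}

/* Controller callback that always fails, e.g. a broken transport. */
static int model_send_fail(struct model_chan *chan, void *data)
{
	(void)chan;
	(void)data;
	return -EIO;
}

/* Controller callback that always succeeds. */
static int model_send_ok(struct model_chan *chan, void *data)
{
	(void)chan;
	(void)data;
	return 0;
}
```

In this model a failed send leaves the message in pending until 
something calls the submit step again, which matches the "stuck 
message" concern raised in the quoted text.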

> 
> 
>>> +int gunyah_msgq_init(struct device *parent, struct gunyah_msgq 
>>> *msgq, struct mbox_client *cl,
>>> +             struct gunyah_resource *tx_ghrsc, struct 
>>> gunyah_resource *rx_ghrsc)
>>
>> Are the message queues allocated/created dynamically or statically? If 
>> the latter is true, please use devm_request(_threaded)_irq and 
>> devm_kzalloc.
>>
> 
> With the exception of resource manager, message queues are created 
> dynamically.
> 
> P.S. Thanks for all the other suggestions in this and the other patches, 
> I've applied them.
> 
> Thanks,
> Elliot

-- 
With best wishes
Dmitry