linux-kernel - Re: [PATCH v2 RESEND 3/3] i2c: i2c-qcom-geni: Add Block event interrupt support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <499b32c0-480f-422d-8a2d-3972409a187f@quicinc.com>
Date: Mon, 2 Dec 2024 19:11:02 +0530
From: Jyothi Kumar Seerapu <quic_jseerapu@...cinc.com>
To: Bjorn Andersson <andersson@...nel.org>
CC: Vinod Koul <vkoul@...nel.org>, Andi Shyti <andi.shyti@...nel.org>,
        "Sumit
 Semwal" <sumit.semwal@...aro.org>,
        Christian König
	<christian.koenig@....com>,
        <linux-arm-msm@...r.kernel.org>, <dmaengine@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <linux-i2c@...r.kernel.org>,
        <linux-media@...r.kernel.org>, <dri-devel@...ts.freedesktop.org>,
        <linaro-mm-sig@...ts.linaro.org>, <quic_msavaliy@...cinc.com>,
        <quic_vtanuku@...cinc.com>
Subject: Re: [PATCH v2 RESEND 3/3] i2c: i2c-qcom-geni: Add Block event
 interrupt support



On 11/21/2024 6:28 PM, Jyothi Kumar Seerapu wrote:
> 
> 
> On 11/12/2024 10:03 AM, Bjorn Andersson wrote:
>> On Mon, Nov 11, 2024 at 07:32:44PM +0530, Jyothi Kumar Seerapu wrote:
>>> The I2C driver gets an interrupt upon transfer completion.
>>> For multiple messages in a single transfer, N interrupts will be
>>> received for N messages, leading to significant software interrupt
>>> latency. To mitigate this latency, utilize Block Event Interrupt (BEI)
>>
>> Please rewrite this to the tone that the reader doesn't know what Block
>> Event Interrupt is, or that it exists.
> Sure, done.
>>
>>> only when an interrupt is necessary. This means large transfers can be
>>> split into multiple chunks of 8 messages internally, without expecting
>>> interrupts for the first 7 message completions, only the last one will
>>> trigger an interrupt indicating 8 messages completed.
>>>
>>> By implementing BEI, multi-message transfers can be divided into
>>> chunks of 8 messages, improving overall transfer time.
>>
>> You already wrote this in the paragraph above.
> yeah removed it.
>>
>>
>> Where is this number 8 coming from btw?
> Its documented in "qcom-gpi-dma.h" file.
> Trigger an interrupt, after the completion of 8 messages.
>>
>>> This optimization reduces transfer time from 168 ms to 48 ms for a
>>> series of 200 I2C write messages in a single transfer, with a
>>> clock frequency support of 100 kHz.
>>>
>>> BEI optimizations are currently implemented for I2C write transfers 
>>> only,
>>> as there is no use case for multiple I2C read messages in a single 
>>> transfer
>>> at this time.
>>>
>>> Signed-off-by: Jyothi Kumar Seerapu <quic_jseerapu@...cinc.com>
>>> ---
>>>
>>> v1 -> v2:
>>>     - Moved gi2c_gpi_xfer->msg_idx_cnt to separate local variable.
>>>     - Updated goto labels for error scenarios in geni_i2c_gpi function
>>>     - memset tx_multi_xfer to 0.
>>>     - Removed passing current msg index to geni_i2c_gpi.
>>>     - Fixed kernel test robot reported compilation issues.
>>>
>>>   drivers/i2c/busses/i2c-qcom-geni.c | 203 +++++++++++++++++++++++++----
>>>   1 file changed, 178 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/drivers/i2c/busses/i2c-qcom-geni.c 
>>> b/drivers/i2c/busses/i2c-qcom-geni.c
>>> index 7a22e1f46e60..04a7d926dadc 100644
>>> --- a/drivers/i2c/busses/i2c-qcom-geni.c
>>> +++ b/drivers/i2c/busses/i2c-qcom-geni.c
>>> @@ -100,6 +100,10 @@ struct geni_i2c_dev {
>>>       struct dma_chan *rx_c;
>>>       bool gpi_mode;
>>>       bool abort_done;
>>> +    bool is_tx_multi_xfer;
>>> +    u32 num_msgs;
>>> +    u32 tx_irq_cnt;
>>> +    struct gpi_i2c_config *gpi_config;
>>>   };
>>>   struct geni_i2c_desc {
>>> @@ -500,6 +504,7 @@ static int geni_i2c_tx_one_msg(struct 
>>> geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>>   static void i2c_gpi_cb_result(void *cb, const struct 
>>> dmaengine_result *result)
>>>   {
>>>       struct geni_i2c_dev *gi2c = cb;
>>> +    struct gpi_multi_xfer *tx_multi_xfer;
>>>       if (result->result != DMA_TRANS_NOERROR) {
>>>           dev_err(gi2c->se.dev, "DMA txn failed:%d\n", result->result);
>>> @@ -508,7 +513,21 @@ static void i2c_gpi_cb_result(void *cb, const 
>>> struct dmaengine_result *result)
>>>           dev_dbg(gi2c->se.dev, "DMA xfer has pending: %d\n", 
>>> result->residue);
>>>       }
>>> -    complete(&gi2c->done);
>>> +    if (gi2c->is_tx_multi_xfer) {
>>
>> Wouldn't it be cleaner to treat the !is_tx_multi_xfer case as a
>> multi-xfer of length 1?
> Sure, addressed the change in V3 patch.
>>
>>> +        tx_multi_xfer = &gi2c->gpi_config->multi_xfer;
>>> +
>>> +        /*
>>> +         * Send Completion for last message or multiple of 
>>> NUM_MSGS_PER_IRQ.
>>> +         */
>>> +        if ((tx_multi_xfer->irq_msg_cnt == gi2c->num_msgs - 1) ||
>>> +            (!((tx_multi_xfer->irq_msg_cnt + 1) % NUM_MSGS_PER_IRQ))) {
>>> +            tx_multi_xfer->irq_cnt++;
>>> +            complete(&gi2c->done);
>>
>> Why? You're removing the wait_for_completion_timeout() from
>> geni_i2c_gpi_xfer() when is_tx_multi_xfer is set.
> For (!is_tx_multi_xfer) case, need to wait for every message.
> But whereas for multi-message (when is_tx_multi_xfer is set) cases, 
> "wait_for_completion_timeout" will trigger after queuing messages till 
> QCOM_GPI_MAX_NUM_MSGS (32) or total number of i2c msgs and 
> "wait_for_completion_timeout" for this case is handled in GPI driver.
>>
>>> +        }
>>> +        tx_multi_xfer->irq_msg_cnt++;
>>> +    } else {
>>> +        complete(&gi2c->done);
>>> +    }
>>>   }
>>>   static void geni_i2c_gpi_unmap(struct geni_i2c_dev *gi2c, struct 
>>> i2c_msg *msg,
>>> @@ -526,7 +545,42 @@ static void geni_i2c_gpi_unmap(struct 
>>> geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>>       }
>>>   }
>>> -static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg *msg,
>>> +/**
>>> + * gpi_i2c_multi_desc_unmap() - unmaps the buffers post multi 
>>> message TX transfers
>>> + * @dev: pointer to the corresponding dev node
>>> + * @gi2c: i2c dev handle
>>> + * @msgs: i2c messages array
>>> + * @peripheral: pointer to the gpi_i2c_config
>>> + */
>>> +static void gpi_i2c_multi_desc_unmap(struct geni_i2c_dev *gi2c, 
>>> struct i2c_msg msgs[],
>>> +                     struct gpi_i2c_config *peripheral)
>>> +{
>>> +    u32 msg_xfer_cnt, wr_idx = 0;
>>> +    struct gpi_multi_xfer *tx_multi_xfer = &peripheral->multi_xfer;
>>> +
>>> +    /*
>>> +     * In error case, need to unmap all messages based on the 
>>> msg_idx_cnt.
>>> +     * Non-error case unmap all the processed messages.
>>
>> What is the benefit of this optimization, compared to keeping things
>> simple and just unmap all buffers at the end of geni_i2c_gpi_xfer()?
> 
> The maximum number of messages can allocate and submit to GSI hardware 
> is 16 (QCOM_GPI_MAX_NUM_MSGS) and to handle more messages beyond this 
> need to unmap the processed messages.
> If there is 200 messages or more in a transfer then we need to unmap the 
> processed messages for handling all messages in a transfer.
> So, instead of Unmap all messages together, here unmapping the chunk of 
> messages.
> 
>>
>>> +     */
>>> +    if (gi2c->err)
>>> +        msg_xfer_cnt = tx_multi_xfer->msg_idx_cnt;
>>> +    else
>>> +        msg_xfer_cnt = tx_multi_xfer->irq_cnt * NUM_MSGS_PER_IRQ;
>>> +
>>> +    /* Unmap the processed DMA buffers based on the received 
>>> interrupt count */
>>> +    for (; tx_multi_xfer->unmap_msg_cnt < msg_xfer_cnt; 
>>> tx_multi_xfer->unmap_msg_cnt++) {
>>> +        if (tx_multi_xfer->unmap_msg_cnt == gi2c->num_msgs)
>>> +            break;
>>> +        wr_idx = tx_multi_xfer->unmap_msg_cnt % QCOM_GPI_MAX_NUM_MSGS;
>>> +        geni_i2c_gpi_unmap(gi2c, &msgs[tx_multi_xfer->unmap_msg_cnt],
>>> +                   tx_multi_xfer->dma_buf[wr_idx],
>>> +                   tx_multi_xfer->dma_addr[wr_idx],
>>> +                   NULL, (dma_addr_t)NULL);
>>> +        tx_multi_xfer->freed_msg_cnt++;
>>> +    }
>>> +}
>>> +
>>> +static int geni_i2c_gpi(struct geni_i2c_dev *gi2c, struct i2c_msg 
>>> msgs[],
>>>               struct dma_slave_config *config, dma_addr_t *dma_addr_p,
>>>               void **buf, unsigned int op, struct dma_chan *dma_chan)
>>>   {
>>> @@ -538,26 +592,48 @@ static int geni_i2c_gpi(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg *msg,
>>>       enum dma_transfer_direction dma_dirn;
>>>       struct dma_async_tx_descriptor *desc;
>>>       int ret;
>>> +    struct gpi_multi_xfer *gi2c_gpi_xfer;
>>> +    dma_cookie_t cookie;
>>> +    u32 msg_idx;
>>>       peripheral = config->peripheral_config;
>>> -
>>> -    dma_buf = i2c_get_dma_safe_msg_buf(msg, 1);
>>> -    if (!dma_buf)
>>> -        return -ENOMEM;
>>> +    gi2c_gpi_xfer = &peripheral->multi_xfer;
>>> +    dma_buf = gi2c_gpi_xfer->dma_buf[gi2c_gpi_xfer->buf_idx];
>>> +    addr = gi2c_gpi_xfer->dma_addr[gi2c_gpi_xfer->buf_idx];
>>> +    msg_idx = gi2c_gpi_xfer->msg_idx_cnt;
>>> +
>>> +    dma_buf = i2c_get_dma_safe_msg_buf(&msgs[msg_idx], 1);
>>> +    if (!dma_buf) {
>>> +        ret = -ENOMEM;
>>> +        goto out;
>>> +    }
>>>       if (op == I2C_WRITE)
>>>           map_dirn = DMA_TO_DEVICE;
>>>       else
>>>           map_dirn = DMA_FROM_DEVICE;
>>> -    addr = dma_map_single(gi2c->se.dev->parent, dma_buf, msg->len, 
>>> map_dirn);
>>> +    addr = dma_map_single(gi2c->se.dev->parent, dma_buf,
>>> +                  msgs[msg_idx].len, map_dirn);
>>>       if (dma_mapping_error(gi2c->se.dev->parent, addr)) {
>>> -        i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>>> -        return -ENOMEM;
>>> +        i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
>>> +        ret = -ENOMEM;
>>> +        goto out;
>>> +    }
>>> +
>>> +    if (gi2c->is_tx_multi_xfer) {
>>> +        if (((msg_idx + 1) % NUM_MSGS_PER_IRQ))
>>> +            peripheral->flags |= QCOM_GPI_BLOCK_EVENT_IRQ;
>>> +        else
>>> +            peripheral->flags &= ~QCOM_GPI_BLOCK_EVENT_IRQ;
>>> +
>>> +        /* BEI bit to be cleared for last TRE */
>>> +        if (msg_idx == gi2c->num_msgs - 1)
>>> +            peripheral->flags &= ~QCOM_GPI_BLOCK_EVENT_IRQ;
>>>       }
>>>       /* set the length as message for rx txn */
>>> -    peripheral->rx_len = msg->len;
>>> +    peripheral->rx_len = msgs[msg_idx].len;
>>>       peripheral->op = op;
>>>       ret = dmaengine_slave_config(dma_chan, config);
>>> @@ -575,7 +651,8 @@ static int geni_i2c_gpi(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg *msg,
>>>       else
>>>           dma_dirn = DMA_DEV_TO_MEM;
>>> -    desc = dmaengine_prep_slave_single(dma_chan, addr, msg->len, 
>>> dma_dirn, flags);
>>> +    desc = dmaengine_prep_slave_single(dma_chan, addr, 
>>> msgs[msg_idx].len,
>>> +                       dma_dirn, flags);
>>>       if (!desc) {
>>>           dev_err(gi2c->se.dev, "prep_slave_sg failed\n");
>>>           ret = -EIO;
>>> @@ -585,15 +662,48 @@ static int geni_i2c_gpi(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg *msg,
>>>       desc->callback_result = i2c_gpi_cb_result;
>>>       desc->callback_param = gi2c;
>>> -    dmaengine_submit(desc);
>>> -    *buf = dma_buf;
>>> -    *dma_addr_p = addr;
>>> +    if (!((msgs[msg_idx].flags & I2C_M_RD) && op == I2C_WRITE)) {
>>> +        gi2c_gpi_xfer->msg_idx_cnt++;
>>> +        gi2c_gpi_xfer->buf_idx = (msg_idx + 1) % QCOM_GPI_MAX_NUM_MSGS;
>>> +    }
>>> +    cookie = dmaengine_submit(desc);
>>> +    if (dma_submit_error(cookie)) {
>>> +        dev_err(gi2c->se.dev,
>>> +            "%s: dmaengine_submit failed (%d)\n", __func__, cookie);
>>> +        ret = -EINVAL;
>>> +        goto err_config;
>>> +    }
>>> +    if (gi2c->is_tx_multi_xfer) {
>>> +        dma_async_issue_pending(gi2c->tx_c);
>>> +        if ((msg_idx == (gi2c->num_msgs - 1)) ||
>>> +            (gi2c_gpi_xfer->msg_idx_cnt >=
>>> +             QCOM_GPI_MAX_NUM_MSGS + gi2c_gpi_xfer->freed_msg_cnt)) {
>>> +            ret = gpi_multi_desc_process(gi2c->se.dev, gi2c_gpi_xfer,
>>
>> A function call straight into the GPI driver? I'm not entirely familiar
>> with the details of the dmaengine API, but this doesn't look correct.
> 
> "gpi_multi_desc_process" can be used for the other protocols as well and 
> so defined here. Please let me know if its not a good idea to make this 
> as common function for all required protocols and keep in GPI driver.
> 
> Also, gpi_multi_desc_process can't be fit into dmaengine API and so 
> invoked a function call to GPI driver.

Hi Bjorn, this function(gpi_multi_desc_process) does not fit into any 
DMA engine API.
So, I am considering moving this function to the I2C driver from GPI. 
Please let me know if this is acceptable or if you have any suggestions.
>>
>>> +                             gi2c->num_msgs, XFER_TIMEOUT,
>>> +                             &gi2c->done);
>>> +            if (ret) {
>>> +                dev_err(gi2c->se.dev,
>>> +                    "I2C multi write msg transfer timeout: %d\n",
>>> +                    ret);
>>> +                gi2c->err = ret;
>>> +                goto err_config;
>>> +            }
>>> +        }
>>> +    } else {
>>> +        /* Non multi descriptor message transfer */
>>> +        *buf = dma_buf;
>>> +        *dma_addr_p = addr;
>>> +    }
>>>       return 0;
>>>   err_config:
>>> -    dma_unmap_single(gi2c->se.dev->parent, addr, msg->len, map_dirn);
>>> -    i2c_put_dma_safe_msg_buf(dma_buf, msg, false);
>>> +    dma_unmap_single(gi2c->se.dev->parent, addr,
>>> +             msgs[msg_idx].len, map_dirn);
>>> +    i2c_put_dma_safe_msg_buf(dma_buf, &msgs[msg_idx], false);
>>> +
>>> +out:
>>> +    gi2c->err = ret;
>>>       return ret;
>>>   }
>>> @@ -605,6 +715,7 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg msgs[], i
>>>       unsigned long time_left;
>>>       dma_addr_t tx_addr, rx_addr;
>>>       void *tx_buf = NULL, *rx_buf = NULL;
>>> +    struct gpi_multi_xfer *tx_multi_xfer;
>>>       const struct geni_i2c_clk_fld *itr = gi2c->clk_fld;
>>>       config.peripheral_config = &peripheral;
>>> @@ -618,6 +729,34 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg msgs[], i
>>>       peripheral.set_config = 1;
>>>       peripheral.multi_msg = false;
>>> +    gi2c->gpi_config = &peripheral;
>>> +    gi2c->num_msgs = num;
>>> +    gi2c->is_tx_multi_xfer = false;
>>> +    gi2c->tx_irq_cnt = 0;
>>> +
>>> +    tx_multi_xfer = &peripheral.multi_xfer;
>>> +    memset(tx_multi_xfer, 0, sizeof(struct gpi_multi_xfer));
>>> +
>>> +    /*
>>> +     * If number of write messages are four and higher then
>>
>> Why four?
> It changed to 2 in V3, so that if the number of messages in a transfer 
> are greter than 1, then "is_tx_multi_xfer" is set.
>>
>>> +     * configure hardware for multi descriptor transfers with BEI.
>>> +     */
>>> +    if (num >= MIN_NUM_OF_MSGS_MULTI_DESC) {
>>> +        gi2c->is_tx_multi_xfer = true;
>>> +        for (i = 0; i < num; i++) {
>>> +            if (msgs[i].flags & I2C_M_RD) {
>>> +                /*
>>> +                 * Multi descriptor transfer with BEI
>>> +                 * support is enabled for write transfers.
>>> +                 * Add BEI optimization support for read
>>> +                 * transfers later.
>>
>> Prefix this comment with "TODO:"
> Done
>>
>>> +                 */
>>> +                gi2c->is_tx_multi_xfer = false;
>>> +                break;
>>> +            }
>>> +        }
>>> +    }
>>> +
>>>       for (i = 0; i < num; i++) {
>>>           gi2c->cur = &msgs[i];
>>>           gi2c->err = 0;
>>> @@ -628,14 +767,16 @@ static int geni_i2c_gpi_xfer(struct 
>>> geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>>               peripheral.stretch = 1;
>>>           peripheral.addr = msgs[i].addr;
>>> +        if (i > 0 && (!(msgs[i].flags & I2C_M_RD)))
>>> +            peripheral.multi_msg = false;
>>> -        ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
>>> +        ret =  geni_i2c_gpi(gi2c, msgs, &config,
>>>                       &tx_addr, &tx_buf, I2C_WRITE, gi2c->tx_c);
>>>           if (ret)
>>>               goto err;
>>>           if (msgs[i].flags & I2C_M_RD) {
>>> -            ret =  geni_i2c_gpi(gi2c, &msgs[i], &config,
>>> +            ret =  geni_i2c_gpi(gi2c, msgs, &config,
>>>                           &rx_addr, &rx_buf, I2C_READ, gi2c->rx_c);
>>>               if (ret)
>>>                   goto err;
>>> @@ -643,18 +784,26 @@ static int geni_i2c_gpi_xfer(struct 
>>> geni_i2c_dev *gi2c, struct i2c_msg msgs[], i
>>>               dma_async_issue_pending(gi2c->rx_c);
>>>           }
>>> -        dma_async_issue_pending(gi2c->tx_c);
>>> -
>>> -        time_left = wait_for_completion_timeout(&gi2c->done, 
>>> XFER_TIMEOUT);
>>> -        if (!time_left)
>>> -            gi2c->err = -ETIMEDOUT;
>>> +        if (!gi2c->is_tx_multi_xfer) {
>>> +            dma_async_issue_pending(gi2c->tx_c);
>>> +            time_left = wait_for_completion_timeout(&gi2c->done, 
>>> XFER_TIMEOUT);
>>
>> By making this conditional on !is_tx_multi_xfer transfers, what makes
>> the loop wait for the transfer to complete before you below unmap the
>> buffers?
> Yes, for (!is_tx_multi_xfer) case, need to wait for every message and 
> then unmap it and for is_tx_multi_xfer transfers shouldn't unmap per 
> message wise instead unmap the chunk of messages together.
> 
>>
>>> +            if (!time_left) {
>>> +                dev_err(gi2c->se.dev, "%s:I2C timeout\n", __func__);
>>> +                gi2c->err = -ETIMEDOUT;
>>> +            }
>>> +        }
>>>           if (gi2c->err) {
>>>               ret = gi2c->err;
>>>               goto err;
>>>           }
>>> -        geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, 
>>> rx_addr);
>>> +        if (!gi2c->is_tx_multi_xfer) {
>>> +            geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, 
>>> rx_buf, rx_addr);
>>> +        } else if (gi2c->tx_irq_cnt != tx_multi_xfer->irq_cnt) {
>>> +            gi2c->tx_irq_cnt = tx_multi_xfer->irq_cnt;
>>> +            gpi_i2c_multi_desc_unmap(gi2c, msgs, &peripheral);
>>> +        }
>>>       }
>>>       return num;
>>> @@ -663,7 +812,11 @@ static int geni_i2c_gpi_xfer(struct geni_i2c_dev 
>>> *gi2c, struct i2c_msg msgs[], i
>>>       dev_err(gi2c->se.dev, "GPI transfer failed: %d\n", ret);
>>>       dmaengine_terminate_sync(gi2c->rx_c);
>>>       dmaengine_terminate_sync(gi2c->tx_c);
>>> -    geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, 
>>> rx_addr);
>>> +    if (gi2c->is_tx_multi_xfer)
>>> +        gpi_i2c_multi_desc_unmap(gi2c, msgs, &peripheral);
>>> +    else
>>> +        geni_i2c_gpi_unmap(gi2c, &msgs[i], tx_buf, tx_addr, rx_buf, 
>>> rx_addr);
>>> +
>>
>> As above, it would be nice if multi-xfer was just a special case with a
>> single buffer; rather than inflating the cyclomatic complexity.
> 
> For a single i2c message, data can be placed at contigious memory 
> locations, but for multiple i2c messages in a transfer, all the messages 
> offsets and data may not guarantee to placed at contigious memory 
> locations.
> So, looks single large buffer is not helpful here.
> 
>>
>> Regards,
>> Bjorn
>>
>>>       return ret;
>>>   }
>>> -- 
>>> 2.17.1
>>>
>>>