linux-kernel - Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1b55d9d4-f3ff-4cd9-8906-5f370da55732@quicinc.com>
Date: Thu, 19 Jun 2025 21:46:21 +0530
From: Jyothi Kumar Seerapu <quic_jseerapu@...cinc.com>
To: Dmitry Baryshkov <dmitry.baryshkov@....qualcomm.com>
CC: Vinod Koul <vkoul@...nel.org>,
        Mukesh Kumar Savaliya
	<quic_msavaliy@...cinc.com>,
        Viken Dadhaniya <quic_vdadhani@...cinc.com>,
        Andi Shyti <andi.shyti@...nel.org>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Christian König <christian.koenig@....com>,
        <linux-arm-msm@...r.kernel.org>, <dmaengine@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <linux-i2c@...r.kernel.org>,
        <linux-media@...r.kernel.org>, <dri-devel@...ts.freedesktop.org>,
        <linaro-mm-sig@...ts.linaro.org>, <quic_vtanuku@...cinc.com>
Subject: Re: [PATCH v6 2/2] i2c: i2c-qcom-geni: Add Block event interrupt
 support



On 6/18/2025 1:02 AM, Dmitry Baryshkov wrote:
> On Tue, 17 Jun 2025 at 17:11, Jyothi Kumar Seerapu
> <quic_jseerapu@...cinc.com> wrote:
>>
>>
>>
>> On 5/30/2025 10:12 PM, Dmitry Baryshkov wrote:
>>> On Fri, May 30, 2025 at 07:36:05PM +0530, Jyothi Kumar Seerapu wrote:
>>>>
>>>>
>>>> On 5/21/2025 6:15 PM, Dmitry Baryshkov wrote:
>>>>> On Wed, May 21, 2025 at 03:58:48PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>
>>>>>>
>>>>>> On 5/9/2025 9:31 PM, Dmitry Baryshkov wrote:
>>>>>>> On 09/05/2025 09:18, Jyothi Kumar Seerapu wrote:
>>>>>>>> Hi Dimitry, Thanks for providing the review comments.
>>>>>>>>
>>>>>>>> On 5/6/2025 5:16 PM, Dmitry Baryshkov wrote:
>>>>>>>>> On Tue, May 06, 2025 at 04:48:44PM +0530, Jyothi Kumar Seerapu wrote:
>>>>>>>>>> The I2C driver gets an interrupt upon transfer completion.
>>>>>>>>>> When handling multiple messages in a single transfer, this
>>>>>>>>>> results in N interrupts for N messages, leading to significant
>>>>>>>>>> software interrupt latency.
>>>>>>>>>>
>>>>>>>>>> To mitigate this latency, utilize Block Event Interrupt (BEI)
>>>>>>>>>> mechanism. Enabling BEI instructs the hardware to prevent interrupt
>>>>>>>>>> generation and BEI is disabled when an interrupt is necessary.
>>>>>>>>>>
>>>>>>>>>> Large I2C transfer can be divided into chunks of 8 messages internally.
>>>>>>>>>> Interrupts are not expected for the first 7 message completions, only
>>>>>>>>>> the last message triggers an interrupt, indicating the completion of
>>>>>>>>>> 8 messages. This BEI mechanism enhances overall transfer efficiency.
>>>>>>>>>
>>>>>>>>> Why do you need this complexity? Is it possible to set the
>>>>>>>>> DMA_PREP_INTERRUPT flag on the last message in the transfer?
>>>>>>>>
>>>>>>>> If i undertsand correctly, the suggestion is to get the single
>>>>>>>> intetrrupt for last i2c message only.
>>>>>>>>
>>>>>>>> But With this approach, we can't handle large number of i2c messages
>>>>>>>> in the transfer.
>>>>>>>>
>>>>>>>> In GPI driver, number of max TREs support is harcoded to 64 (#define
>>>>>>>> CHAN_TRES   64) and for I2C message, we need Config TRE, GO TRE and
>>>>>>>> DMA TREs. So, the avilable TREs are not sufficient to handle all the
>>>>>>>> N messages.
>>>>>>>
>>>>>>> It sounds like a DMA driver issue. In other words, the DMA driver can
>>>>>>> know that it must issue an interrupt before exausting 64 TREs in order
>>>>>>> to
>>>>>>>
>>>>>>>>
>>>>>>>> Here, the plan is to queue i2c messages (QCOM_I2C_GPI_MAX_NUM_MSGS
>>>>>>>> or 'num' incase for less messsages), process and unmap/free upon the
>>>>>>>> interrupt based on QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>>>>
>>>>>>> Why? This is some random value which has no connection with CHAN_TREs.
>>>>>>> Also, what if one of the platforms get a 'liter' GPI which supports less
>>>>>>> TREs in a single run? Or a super-premium platform which can use 256
>>>>>>> TREs? Please don't workaround issues from one driver in another one.
>>>>>>
>>>>>> We are trying to utilize the existing CHAN_TRES mentioned in the GPI driver.
>>>>>> With the following approach, the GPI hardware can process N number of I2C
>>>>>> messages, thereby improving throughput and transfer efficiency.
>>>>>>
>>>>>> The main design consideration for using the block event interrupt is as
>>>>>> follows:
>>>>>>
>>>>>> Allow the hardware to process the TREs (I2C messages), while the software
>>>>>> concurrently prepares the next set of TREs to be submitted to the hardware.
>>>>>> Once the TREs are processed, they can be freed, enabling the software to
>>>>>> queue new TREs. This approach enhances overall optimization.
>>>>>>
>>>>>> Please let me know if you have any questions, concerns, or suggestions.
>>>>>
>>>>> The question was why do you limit that to QCOM_I2C_GPI_NUM_MSGS_PER_IRQ.
>>>>> What is the reason for that limit, etc. If you think about it, The GENI
>>>>> / I2C doesn't impose any limit on the number of messages processed in
>>>>> one go (if I understand it correctly). Instead the limit comes from the
>>>>> GPI DMA driver. As such, please don't add extra 'handling' to the I2C
>>>>> driver. Make GPI DMA driver responsible for saying 'no more for now',
>>>>> then I2C driver can setup add an interrupt flag and proceed with
>>>>> submitting next messages, etc.
>>>>>
>>>>
>>>> For I2C messages, we need to prepare TREs for Config, Go and DMAs. However,
>>>> if a large number of I2C messages are submitted then may may run out of
>>>> memory for serving the TREs. The GPI channel supports a maximum of 64 TREs,
>>>> which is insufficient to serve 32 or even 16 I2C messages concurrently,
>>>> given the multiple TREs required per message.
>>>>
>>>> To address this limitation, a strategy has been implemented to manage how
>>>> many messages can be queued and how memory is recycled. The constant
>>>> QCOM_I2C_GPI_MAX_NUM_MSGS is set to 16, defining the upper limit of
>>>> messages that can be queued at once. Additionally,
>>>> QCOM_I2C_GPI_NUM_MSGS_PER_IRQ is set to 8, meaning that
>>>> half of the queued messages are expected to be freed or deallocated per
>>>> interrupt.
>>>> This approach ensures that the driver can efficiently manage TRE resources
>>>> and continue queuing new I2C messages without exhausting memory.
>>>>> I really don't see a reason for additional complicated handling in the
>>>>> geni driver that you've implemented. Maybe I misunderstand something. In
>>>>> such a case it usually means that you have to explain the design in the
>>>>> commit message / in-code comments.
>>>>>
>>>>
>>>>
>>>> The I2C Geni driver is designed to prepare and submit descriptors to the GPI
>>>> driver one message at a time.
>>>> As a result, the GPI driver does not have visibility into the current
>>>> message index or the total number of I2C messages in a transfer. This lack
>>>> of context makes it challenging to determine when to set the block event
>>>> interrupt, which is typically used to signal the completion of a batch of
>>>> messages.
>>>>
>>>> So, the responsibility for deciding when to set the BEI should lie with the
>>>> I2C driver.
>>>>
>>>> If this approach is acceptable, I will proceed with updating the relevant
>>>> details in the commit message.
>>>>
>>>> Please let me know if you have any concerns or suggestions.
>>>
>> Hi Dmitry, Sorry for the delayed response, and thank you for the
>> suggestions.
>>
>>> - Make gpi_prep_slave_sg() return NULL if flags don't have
>>>     DMA_PREP_INTERRUPT flag and there are no 3 empty TREs for the
>>>     interrupt-enabled transfer.
>> "there are no 3 empty TREs for the interrupt-enabled transfer."
>> Could you please help me understand this a bit better?
> 
> In the GPI driver you know how many TREs are available. In
> gpi_prep_slave_sg() you can check that and return an error if there
> are not enough TREs available.
> 
>>>
>>> - If I2C driver gets NULL from dmaengine_prep_slave_single(), retry
>>>     again, adding DMA_PREP_INTERRUPT. Make sure that the last one always
>>>     gets DMA_PREP_INTERRUPT.
>> Does this mean we need to proceed to the next I2C message and ensure
>> that the DMA_PREP_INTERRUPT flag is set for the last I2C message in each
>> chunk? And then, should we submit the chunk of messages to the GSI
>> hardware for processing?
> 
> No. You don't have to peek at the next I2C message. This all concerns
> the current I2C message. The only point where you have to worry is to
> explicitly set the flag for the last message.
> 
>>
>>>
>>> - In geni_i2c_gpi_xfer() split the loop to submit messages until you
>>>     can, then call wait_for_completion_timeout() and then
>>>     geni_i2c_gpi_unmap() for submitted messages, then continue with a new
>>>     portion of messages.
>> Since the GPI channel supports a maximum of 64 TREs, should we consider
>> submitting a smaller number of predefined messages — perhaps fewer than
>> 32, such as 16?
> 
> Why? Just submit messages until they fit, then flush the DMA async channel.
> 
>> This is because handling 32 messages would require one TRE for config
>> and 64 TREs for the Go and DMA preparation steps, which exceeds the
>> channel's TRE capacity of 64.
>>
>> We designed the approach to submit a portion of the messages — for
>> example, 16 at a time. Once 8 messages are processed and freed, the
>> hardware can continue processing the TREs, while the software
>> simultaneously prepares the next set of TREs. This parallelism helps in
>> efficiently utilizing the hardware and enhances overall system
>> optimization.
> 
> 
> And this overcomplicates the driver and introduces artificial
> limitations which need explanation. Please fix it in a simple way
> first. Then you can e.g. implement the watermark at the half of the
> GPI channel depth and request DMA_PREP_INTERRUPT to be set in the
> middle of the full sequence, allowing it to be used asynchronously in
> the background.
> 

Okay, will review it. Thanks.