linux-kernel - Re: [PATCH v1 3/7] spi: qspi-tegra: Add support for Tegra210 QSPI controller

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <be8291fc-8e69-b577-d8f4-20aeca0b45cc@nvidia.com>
Date:   Thu, 3 Dec 2020 16:22:54 -0800
From:   Sowjanya Komatineni <skomatineni@...dia.com>
To:     Mark Brown <broonie@...nel.org>
CC:     <thierry.reding@...il.com>, <jonathanh@...dia.com>,
        <robh+dt@...nel.org>, <linux-spi@...r.kernel.org>,
        <linux-tegra@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <devicetree@...r.kernel.org>
Subject: Re: [PATCH v1 3/7] spi: qspi-tegra: Add support for Tegra210 QSPI
 controller


On 12/2/20 11:17 AM, Sowjanya Komatineni wrote:
>
> On 12/2/20 9:27 AM, Mark Brown wrote:
>> On Tue, Dec 01, 2020 at 01:12:44PM -0800, Sowjanya Komatineni wrote:
>>> Tegra SoC has a Quad SPI controller starting from Tegra210.
>>>
>>> This patch adds support for Tegra210 QSPI controller.
>> This looks pretty clean but I've got a few questions below about how
>> this integrates with the frameworks as well as some more minor issues.
>>
>>> +config QSPI_TEGRA
>>> +    tristate "Nvidia Tegra QSPI Controller"
>> Everything else in this file is SPI_, even the qspi controllers.
> Will rename in v2
>>> +++ b/drivers/spi/qspi-tegra.c
>>> @@ -0,0 +1,1418 @@
>>> +// SPDX-License-Identifier: GPL-2.0-only
>>> +/*
>>> + * Copyright (C) 2020 NVIDIA CORPORATION.  All rights reserved.
>>> + */
>> Please make the entire comment a C++ one.  It also appears that the "All
>> rights reserved" here conflicts with the GPL-2.0-only SPDX statement...
> Will fix in v2
>>
>>> +static void
>>> +tegra_qspi_copy_client_txbuf_to_qspi_txbuf(struct tegra_qspi_data 
>>> *tqspi,
>>> +                       struct spi_transfer *t)
>>> +{
>>> +    /* Make the dma buffer to read by cpu */
>>> +    dma_sync_single_for_cpu(tqspi->dev, tqspi->tx_dma_phys,
>>> +                tqspi->dma_buf_size, DMA_TO_DEVICE);
>>> +
>>> +    if (tqspi->is_packed) {
>>> +        unsigned int len = tqspi->curr_dma_words *
>>> +                   tqspi->bytes_per_word;
>>> +
>>> +        memcpy(tqspi->tx_dma_buf, t->tx_buf + tqspi->cur_pos, len);
>>> +        tqspi->cur_tx_pos += tqspi->curr_dma_words *
>>> +                     tqspi->bytes_per_word;
>> It seems weird that this device needs us to do a memcpy() to do DMA,
>> most devices are able to DMA directly from the buffers provided by the
>> SPI API (and let the SPI core sync things).  What is going on here?
>
> For transfers of size more than max DMA transfer limit, data transfer 
> happens in multiple iterations with each iteration transferring up to 
> max DMA transfer limit.
>
> So using separate dma buffers and on every iteration copying them to 
> SPI core provided tx/rx buffers.
Also unpack mode needs to manually put the bytes together from read data 
to SPI core rx buffer.
>
> Transferring data logic in this driver is similar as Tegra SPI driver 
> except register changes and some QSPI specific register programming.
>
>>
>>> +    tegra_qspi_writel(tqspi, status, QSPI_FIFO_STATUS);
>>> +    while ((status & QSPI_FIFO_EMPTY) != QSPI_FIFO_EMPTY) {
>>> +        status = tegra_qspi_readl(tqspi, QSPI_FIFO_STATUS);
>>> +        if (time_after(jiffies, timeout)) {
>>> +            dev_err(tqspi->dev,
>>> +                "timeout waiting for fifo flush\n");
>>> +            return -EIO;
>>> +        }
>>> +
>>> +        udelay(1);
>>> +    }
>> It'd be good to put a cpu_relax() in the busy loop.
> Will update in v2.
>>
>>> +static u32 tegra_qspi_setup_transfer_one(struct spi_device *spi,
>>> +                     struct spi_transfer *t,
>>> +                     bool is_first_of_msg)
>>> +{
>>> +        /* toggle cs to active state */
>>> +        if (spi->mode & SPI_CS_HIGH)
>>> +            command1 |= QSPI_CS_SW_VAL;
>>> +        else
>>> +            command1 &= ~QSPI_CS_SW_VAL;
>>> +        tegra_qspi_writel(tqspi, command1, QSPI_COMMAND1);
>> This is worrying, the client device might be confused if /CS is doing
>> things outside of the standard handling.
>
> Do you mean to honor spi_transfer cs_change flag?
>
> Tegra QSPI is master and is used only with QSPI flash devices. Looking 
> at SPI NOR driver, I see QSPI Flash commands are executed with one 
> flash command per spi_message and I dont see cs_change flag usage 
> w.r.t QSPI flash. So, using SW based CS control for QSPI.
>
> Please correct me if I miss something to understand here.
>
> Also Tegra186 and later QSPI controller supports combined sequence 
> mode where command, address, data phases can be combined in a single GO.
>
> This saves some cycles in transfer and for this we need to use SW 
> based CS control only.
>
>
>>> +    of_property_read_u32(slave_np, "nvidia,tx-clk-tap-delay",
>>> +                 &cdata->tx_clk_tap_delay);
>>> +    of_property_read_u32(slave_np, "nvidia,rx-clk-tap-delay",
>>> +                 &cdata->rx_clk_tap_delay);
>> These properties are not mentioned in the binding document.
> Thanks Mark. Missed them. Will add in v2.
>>
>>> +static int tegra_qspi_setup(struct spi_device *spi)
>>> +{
>>> +    if (cdata && cdata->tx_clk_tap_delay)
>>> +        tx_tap = cdata->tx_clk_tap_delay;
>>> +    if (cdata && cdata->rx_clk_tap_delay)
>>> +        rx_tap = cdata->rx_clk_tap_delay;
>>> +    tqspi->def_command2_reg = QSPI_TX_TAP_DELAY(tx_tap) |
>>> +                  QSPI_RX_TAP_DELAY(rx_tap);
>>> +    tegra_qspi_writel(tqspi, tqspi->def_command2_reg, QSPI_COMMAND2);
>> The setup for one device shouldn't be able to affect the operation of
>> another, already running, device so either these need to be configured
>> as part of the controller probe or these configurations need to be
>> deferred until we're actually doing a transfer.
> We will only have 1 device on QSPI as we only support single chip select.
>>
>>> +    /*
>>> +     * Tegra QSPI hardware support dummy bytes transfer based on the
>>> +     * programmed dummy clock cyles in QSPI register.
>>> +     * So, get the total dummy bytes from the dummy bytes transfer in
>>> +     * spi_messages and convert to dummy clock cyles.
>>> +     */
>>> +    list_for_each_entry(xfer, &msg->transfers, transfer_list) {
>>> +        if (ntransfers == DUMMY_BYTES_XFER &&
>>> +            !(list_is_last(&xfer->transfer_list, &msg->transfers)))
>>> +            dummy_cycles = xfer->len * 8 / xfer->tx_nbits;
>>> +        ntransfers++;
>>> +    }
>> This seems weird, there's some hard coded assumption about particular
>> patterns that the client device is going to send.  What's going on here?
>> I don't really understand what this is trying to do.
>
> QSPI flash needs dummy cycles for data read operation which is 
> actually the initial read latency and no. of dummy cycles required are 
> vendor specific.
>
> SPI NOR driver gets required dummy cycles based on mode clock cycles 
> and wait state clock cycles.
>
> During read operations, spi_nor_spimem_read_data() converts dummy 
> cycles to number of dummy bytes.
>
> Tegra QSPI controller supports dummy clock cycles register and when 
> programmed QSPI controller sends dummy bytes rather than SW handling 
> extra cycles for transferring dummy bytes.
>
> Above equation converts this dummy bytes back to dummy clock cycles to 
> program into QSPI register and avoid manual SW transfer of dummy bytes.
>
>>
>>> +static irqreturn_t tegra_qspi_isr(int irq, void *context_data)
>>> +{
>>> +    struct tegra_qspi_data *tqspi = context_data;
>>> +
>>> +    tqspi->status_reg = tegra_qspi_readl(tqspi, QSPI_FIFO_STATUS);
>>> +    if (tqspi->cur_direction & DATA_DIR_TX)
>>> +        tqspi->tx_status = tqspi->status_reg &
>>> +                   (QSPI_TX_FIFO_UNF | QSPI_TX_FIFO_OVF);
>>> +
>>> +    if (tqspi->cur_direction & DATA_DIR_RX)
>>> +        tqspi->rx_status = tqspi->status_reg &
>>> +                   (QSPI_RX_FIFO_OVF | QSPI_RX_FIFO_UNF);
>>> +    tegra_qspi_mask_clear_irq(tqspi);
>>> +
>>> +    return IRQ_WAKE_THREAD;
>>> +}
>> It's a bit unclear to me the value we gain from having this handler - if
>> we don't specify a handler genirq will already mask the interrupt until
>> we get to the thread anyway and we could just read the status in the
>> threaded handler.  OTOH it doesn't do any harm, just struck me as a bit
>> odd.
>
> I started QSPI driver by taking SPI driver as data transfer and 
> interrupt handling are similar.
>
> So kept this handler for clearing status registers and masking 
> interrupts as I did not see anything wrong with this.
>
>>
>>> +    master = spi_alloc_master(&pdev->dev, sizeof(*tqspi));
>>> +    if (!master) {
>>> +        dev_err(&pdev->dev, "master allocation failed\n");
>>> +        return -ENOMEM;
>>> +    }
>> Please switch to using the devm_ version of the API to allocate
>> controller, it makes things much more robust.
> Will update in v2
>>
>>> +    if (of_property_read_u32(pdev->dev.of_node, "spi-max-frequency",
>>> +                 &master->max_speed_hz))
>>> +        master->max_speed_hz = QSPI_MAX_SPEED;
>> The core will do this for you.
>
> Will remove this in v2.
>
> Thanks
>
> Sowjanya
>