linux-kernel - Re: [PATCH 08/18] soc: qcom: ipa: the generic software interface

Message-ID: <72869c32-c2c5-54f9-10d9-8c0ed9f6300d@linaro.org>
Date:   Wed, 15 May 2019 07:13:09 -0500
From:   Alex Elder <elder@...aro.org>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     David Miller <davem@...emloft.net>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        syadagir@...eaurora.org, mjavid@...eaurora.org,
        evgreen@...omium.org, benchan@...gle.com, ejcaruso@...gle.com,
        abhishek.esse@...il.com,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 08/18] soc: qcom: ipa: the generic software interface

On 5/15/19 2:21 AM, Arnd Bergmann wrote:
> On Sun, May 12, 2019 at 3:25 AM Alex Elder <elder@...aro.org> wrote:
> 
>> +/** gsi_gpi_channel_scratch - GPI protocol scratch register
>> + *
>> + * @max_outstanding_tre:
>> + *     Defines the maximum number of TREs allowed in a single transaction
>> + *     on a channel (in Bytes).  This determines the amount of prefetch
>> + *     performed by the hardware.  We configure this to equal the size of
>> + *     the TLV FIFO for the channel.
>> + * @outstanding_threshold:
>> + *     Defines the threshold (in Bytes) determining when the sequencer
>> + *     should update the channel doorbell.  We configure this to equal
>> + *     the size of two TREs.
>> + */
>> +struct gsi_gpi_channel_scratch {
>> +       u64 rsvd1;
>> +       u16 rsvd2;
>> +       u16 max_outstanding_tre;
>> +       u16 rsvd3;
>> +       u16 outstanding_threshold;
>> +} __packed;
>> +
>> +/** gsi_channel_scratch - channel scratch configuration area
>> + *
>> + * The exact interpretation of this register is protocol-specific.
>> + * We only use GPI channels; see struct gsi_gpi_channel_scratch, above.
>> + */
>> +union gsi_channel_scratch {
>> +       struct gsi_gpi_channel_scratch gpi;
>> +       struct {
>> +               u32 word1;
>> +               u32 word2;
>> +               u32 word3;
>> +               u32 word4;
>> +       } data;
>> +} __packed;
> 
> What are the exact alignment requirements on these structures,
> do you ever need to have them on odd addresses? If not, please
> remove the __packed, or add __aligned() with the actual alignment,
> e.g. __aligned(4), to let the compiler create better code and
> avoid bytewise accesses.

Honestly, I don't know, but I would guess their alignment
requirements are actually consistent with the C standard's.
Many, many structures had the __packed attribute attached
in the original code.  I removed most but apparently not
all.  I will remove the __packed here, and will scan through
the rest of the code for other similar instances and will
remove those if appropriate as well.
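
For illustration only, here is a minimal userspace sketch of what the
attributes under discussion do to a structure's alignment. It is not part
of the patch. The demo_packed and demo_aligned() macros and the
scratch_* names are hypothetical stand-ins for the kernel's __packed and
__aligned(), and the sketch assumes GCC or Clang attribute syntax.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's __packed/__aligned(). */
#define demo_packed      __attribute__((packed))
#define demo_aligned(n)  __attribute__((aligned(n)))

/* Same field layout as the patch, packed: alignment drops to one byte,
 * so the compiler may have to assume the struct sits at any address.
 */
struct scratch_packed {
	uint64_t rsvd1;
	uint16_t rsvd2;
	uint16_t max_outstanding_tre;
	uint16_t rsvd3;
	uint16_t outstanding_threshold;
} demo_packed;

/* Packed, but with an explicit promise of 4-byte alignment. */
struct scratch_packed_aligned {
	uint64_t rsvd1;
	uint16_t rsvd2;
	uint16_t max_outstanding_tre;
	uint16_t rsvd3;
	uint16_t outstanding_threshold;
} demo_packed demo_aligned(4);

/* No attributes at all: the fields already fit without padding. */
struct scratch_natural {
	uint64_t rsvd1;
	uint16_t rsvd2;
	uint16_t max_outstanding_tre;
	uint16_t rsvd3;
	uint16_t outstanding_threshold;
};

/* The layout is 16 bytes in every variant, so packing saves nothing here. */
static_assert(sizeof(struct scratch_packed) == 16, "packed size");
static_assert(sizeof(struct scratch_packed_aligned) == 16, "aligned size");
static_assert(sizeof(struct scratch_natural) == 16, "natural size");

size_t scratch_packed_align(void)
{
	return alignof(struct scratch_packed);
}

size_t scratch_packed_aligned_align(void)
{
	return alignof(struct scratch_packed_aligned);
}

size_t scratch_natural_align(void)
{
	return alignof(struct scratch_natural);
}
```

Because every field already sits at its natural offset, removing the
attribute leaves the size unchanged and only restores the alignment the
compiler can rely on.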

>> +/* Init function for GSI.  GSI hardware does not need to be "ready" */
>> +int gsi_init(struct gsi *gsi, struct platform_device *pdev, u32 data_count,
>> +            const struct gsi_ipa_endpoint_data *data)
>> +{
>> +       struct resource *res;
>> +       resource_size_t size;
>> +       unsigned int irq;
>> +       int ret;
>> +
>> +       gsi->dev = &pdev->dev;
>> +       init_dummy_netdev(&gsi->dummy_dev);
> 
> Can you add a comment here to explain what the 'dummy' device is
> needed for?

Yes, good idea.

FYI, it's needed because NAPI requires a network device,
and the GSI code is not a "real" one (that, where needed,
is implemented in "ipa_netdev.c", two logical layers up).


>> +       /* Get GSI memory range and map it */
>> +       res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "gsi");
>> +       if (!res)
>> +               return -ENXIO;
>> +
>> +       size = resource_size(res);
>> +       if (res->start > U32_MAX || size > U32_MAX - res->start)
>> +               return -EINVAL;
>> +
>> +       gsi->virt = ioremap_nocache(res->start, size);
>> +       if (!gsi->virt)
>> +               return -ENOMEM;
> 
> The _nocache() postfix is not needed here, and I find it a bit
> confusing, just use plain ioremap, or maybe even
> devm_platform_ioremap_resource() to save the
> platform_get_resource_byname().

OK, good idea.  This was in the original code and I neglected
to chase this down.  Thank you for catching it.

>> +       ret = request_irq(irq, gsi_isr, 0, "gsi", gsi);
>> +       if (ret)
>> +               goto err_unmap_virt;
>> +       gsi->irq = irq;
>> +
>> +       ret = enable_irq_wake(gsi->irq);
>> +       if (ret)
>> +               dev_err(gsi->dev, "error %d enabling gsi wake irq\n", ret);
>> +       gsi->irq_wake_enabled = ret ? 0 : 1;
>> +
>> +       spin_lock_init(&gsi->spinlock);
>> +       mutex_init(&gsi->mutex);
> 
> This looks a bit dangerous if you can ever get to the point of
> having a pending interrupt before the structure is fully initialized.
> This can probably not happen in practice, but it's better to request
> the interrupts last to be on the safe side.

Understood.  I'll fix that.

>> +/* Wait for all transaction activity on a channel to complete */
>> +void gsi_channel_trans_quiesce(struct gsi *gsi, u32 channel_id)
>> +{
>> +       struct gsi_channel *channel = &gsi->channel[channel_id];
>> +       struct gsi_trans_info *trans_info;
>> +       struct gsi_trans *trans = NULL;
>> +       struct gsi_evt_ring *evt_ring;
>> +       struct list_head *list;
>> +       unsigned long flags;
>> +
>> +       trans_info = &channel->trans_info;
>> +       evt_ring = &channel->gsi->evt_ring[channel->evt_ring_id];
>> +
>> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
>> +
>> +       /* Find the last list to which a transaction was added */
>> +       if (!list_empty(&trans_info->alloc))
>> +               list = &trans_info->alloc;
>> +       else if (!list_empty(&trans_info->pending))
>> +               list = &trans_info->pending;
>> +       else if (!list_empty(&trans_info->complete))
>> +               list = &trans_info->complete;
>> +       else if (!list_empty(&trans_info->polled))
>> +               list = &trans_info->polled;
>> +       else
>> +               list = NULL;
>> +
>> +       if (list) {
>> +               struct gsi_trans *trans;
>> +
>> +               /* The last entry on this list is the last one allocated.
>> +                * Grab a reference so we can wait for it.
>> +                */
>> +               trans = list_last_entry(list, struct gsi_trans, links);
>> +               refcount_inc(&trans->refcount);
>> +       }
>> +
>> +       spin_lock_irqsave(&evt_ring->ring.spinlock, flags);
>> +
>> +       /* If there is one, wait for it to complete */
>> +       if (trans) {
>> +               wait_for_completion(&trans->completion);
> 
> Since you are waiting here, you clearly can't be called
> from interrupt context, or with interrupts disabled, so it's
> clearer to use spin_lock_irq() instead of spin_lock_irqsave().
> 
> I generally try to avoid the _irqsave versions altogether, unless
> it is really needed for a function that is called both from
> irq-disabled and irq-enabled context.

OK.  And I appreciate what you're saying here because I do prefer
code that communicates more about the context in ways like
you describe.

Thank you.
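
As an aside, the difference between the two locking variants can be
sketched with a toy userspace model. This is not kernel code: struct
toy_cpu and every toy_* function are hypothetical and only model the
interrupt-enable state, not the lock itself. The one fact it relies on
is that spin_unlock_irq() unconditionally re-enables local interrupts,
while spin_unlock_irqrestore() puts back whatever state was saved.

```c
#include <stdbool.h>

/* Toy model of one CPU's interrupt-enable flag (hypothetical). */
struct toy_cpu {
	bool irqs_enabled;
};

/* Models spin_lock_irqsave(): remember the prior state, then disable. */
static bool toy_lock_irqsave(struct toy_cpu *cpu)
{
	bool flags = cpu->irqs_enabled;

	cpu->irqs_enabled = false;
	return flags;
}

/* Models spin_unlock_irqrestore(): restore exactly what was saved. */
static void toy_unlock_irqrestore(struct toy_cpu *cpu, bool flags)
{
	cpu->irqs_enabled = flags;
}

/* Models spin_lock_irq(): disable without saving anything. */
static void toy_lock_irq(struct toy_cpu *cpu)
{
	cpu->irqs_enabled = false;
}

/* Models spin_unlock_irq(): always enable, whatever the state was. */
static void toy_unlock_irq(struct toy_cpu *cpu)
{
	cpu->irqs_enabled = true;
}

/* Interrupt state after one irqsave/irqrestore critical section. */
bool toy_after_irqsave_pair(bool initially_enabled)
{
	struct toy_cpu cpu = { .irqs_enabled = initially_enabled };
	bool flags = toy_lock_irqsave(&cpu);

	toy_unlock_irqrestore(&cpu, flags);
	return cpu.irqs_enabled;
}

/* Interrupt state after one irq/unlock_irq critical section. */
bool toy_after_irq_pair(bool initially_enabled)
{
	struct toy_cpu cpu = { .irqs_enabled = initially_enabled };

	toy_lock_irq(&cpu);
	toy_unlock_irq(&cpu);
	return cpu.irqs_enabled;
}
```

In this model the plain _irq pair is only correct when interrupts are
known to be enabled on entry, which is exactly what a function that
later sleeps in wait_for_completion() already implies.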

					-Alex

> 
>      Arnd
>