linux-kernel - Re: [PATCH V17 2/7] soc: qcom: dcc: Add driver support for Data Capture and Compare unit(DCC)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <461b7484-b244-c66f-7a86-fadc5c7cabca@ieee.org>
Date:   Mon, 31 Oct 2022 13:51:30 -0500
From:   Alex Elder <elder@...e.org>
To:     Souradeep Chowdhury <quic_schowdhu@...cinc.com>,
        Andy Gross <agross@...nel.org>,
        Bjorn Andersson <andersson@...nel.org>,
        Rob Herring <robh+dt@...nel.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Konrad Dybcio <konrad.dybcio@...ainline.org>
Cc:     linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linux-arm-msm@...r.kernel.org, devicetree@...r.kernel.org,
        Sai Prakash Ranjan <quic_saipraka@...cinc.com>,
        Sibi Sankar <quic_sibis@...cinc.com>,
        Rajendra Nayak <quic_rjendra@...cinc.com>, vkoul@...nel.org
Subject: Re: [PATCH V17 2/7] soc: qcom: dcc: Add driver support for Data
 Capture and Compare unit(DCC)

On 10/21/22 2:14 AM, Souradeep Chowdhury wrote:
> 
> 
> On 10/21/2022 5:37 AM, Alex Elder wrote:
>> On 10/14/22 1:00 AM, Souradeep Chowdhury wrote:
>>> The DCC is a DMA Engine designed to capture and store data
>>> during system crash or software triggers. The DCC operates
>>> based on user inputs via the debugfs interface. The user gives
>>> addresses as inputs and these addresses are stored in the
>>> dcc sram. In case of a system crash or a manual software
>>> trigger by the user through the debugfs interface,
>>> the dcc captures and stores the values at these addresses.
>>> This patch contains the driver which has all the methods
>>> pertaining to the debugfs interface, auxiliary functions to
>>> support all the four fundamental operations of dcc namely
>>> read, write, read/modify/write and loop. The probe method
>>> here instantiates all the resources necessary for dcc to
>>> operate mainly the dedicated dcc sram where it stores the
>>> values. The DCC driver can be used for debugging purposes
>>> without going for a reboot since it can perform software
>>> triggers as well based on user inputs.
>>>
>>> Also added the documentation for debugfs entries and explained
>>> the functionalities of each debugfs file that has been created
>>> for dcc.
>>>
>>> The following is the justification of using debugfs interface
>>> over the other alternatives like sysfs/ioctls
>>>
>>> i) As can be seen from the debugfs attribute descriptions,
>>> some of the debugfs attribute files here contains multiple
>>> arguments which needs to be accepted from the user. This goes
>>> against the design style of sysfs.
>>>
>>> ii) The user input patterns have been made simple and convenient
>>> in this case with the use of debugfs interface as user doesn't
>>> need to shuffle between different files to execute one instruction
>>> as was the case on using other alternatives.
>>>
>>> Signed-off-by: Souradeep Chowdhury <quic_schowdhu@...cinc.com>
>>
>> I haven't followed any review feedback you have received
>> since verion 8 (which I reviewed), so if I say something
>> that conflicts with other feedback I apologize.  I know
>> Bjorn had some comments too, so you're already going to
>> send another version.
>>
>> Unfortunately I have some more input, including some things
>> that are basically bugs (because buffers could be overrun).
>> I will plan to review again once you've had a chance to
>> address my comments.
>>
>>                      -Alex
> 
> Thanks for the review. Will be sending out the next version implementing 
> Bjorn's and your comments.

Sorry for my delayed response.  Your message didn't show up
in my "normal" mail box so I'm catching up now.

. . .

>>> diff --git a/drivers/soc/qcom/Makefile b/drivers/soc/qcom/Makefile
>>> index d66604a..b1fe812 100644
>>> --- a/drivers/soc/qcom/Makefile
>>> +++ b/drivers/soc/qcom/Makefile
>>> @@ -4,6 +4,7 @@ obj-$(CONFIG_QCOM_AOSS_QMP) +=    qcom_aoss.o
>>>   obj-$(CONFIG_QCOM_GENI_SE) +=    qcom-geni-se.o
>>>   obj-$(CONFIG_QCOM_COMMAND_DB) += cmd-db.o
>>>   obj-$(CONFIG_QCOM_CPR)        += cpr.o
>>> +obj-$(CONFIG_QCOM_DCC) += dcc.o
>>>   obj-$(CONFIG_QCOM_GSBI)    +=    qcom_gsbi.o
>>>   obj-$(CONFIG_QCOM_MDT_LOADER)    += mdt_loader.o
>>>   obj-$(CONFIG_QCOM_OCMEM)    += ocmem.o
>>> diff --git a/drivers/soc/qcom/dcc.c b/drivers/soc/qcom/dcc.c
>>> new file mode 100644
>>> index 0000000..efad225
>>> --- /dev/null
>>> +++ b/drivers/soc/qcom/dcc.c

. . .

>> Then you use DCC_ADDR_RANGE_MASK to truncate an address
>> provided down to a multiple of 16 bytes.  Why is that?
>> Is there a hardware limitation that makes 16 byte alignment
>> necessary?  (A little more below, where they're used.)
> 
> Yes,this is necessary as per dcc_sram hardware configuraton.

OK.  I assumed that, but it's worth mentioning that
somewhere (perhaps you already did, and I just missed it).

. . .

>> I have some questions about the way memory regions
>> are defined here.
>>
>> - You round down the address using DCC_ADDR_RANGE_MASK.
>>    Is that because the address has an alignment requirement?
>> - DCC_ADDR_RANGE_MASK is 0xfffffff0, meaning it's 16-byte
>>    aligned.  Is that the required alignment?  (It is more
>>    strict than the 32-bit word size.)
>> - Is there any requirement on the size (in bytes)?  I.e.,
>>    does it need to be 16-byte aligned?  (You multiply the
>>    count by 4, which I presume is sizeof(u32), the word size.)
>> - If the base address is affected by rounding down like
>>    this, you aren't updating the length, which it seems
>>    could omit a word at the end of the desired range.
>> - You are checking to be sure the word count doesn't exceed
>>    the RAM size.  But you're using DCC_ADDR_OFF_RANGE=8,
>>    even though you said that a "word" is 32 bits.
> 
> The check for the DCC_ADDR_OFF_RANGE=8 is to give an arbitrary
> restriction in word length for the dcc configuration but ideally it
> should be 4 as dcc sram word length is 4, will be changing this 
> accordingly.

I think that will be clearer.  Using the word length avoids
any need to explain why 8 was being used.

> Also the base address alignment requirement is consistent as per the
> DCC hardware specification. The address range has to be 16 byte
> aligned.

So you're saying the size in bytes also has this requirement?
If so, then it's good you'll enforce it.

>>
>>> +    if (!len || len > drvdata->ram_size / DCC_ADDR_OFF_RANGE) {
>>> +        dev_err(drvdata->dev, "DCC: Invalid length\n");
>>> +        ret = -EINVAL;
>>> +        goto out_unlock;
>>> +    }
>>> +
>>> +    base = addr & DCC_ADDR_RANGE_MASK;
>> Maybe:
>>      base = round_down(addr, DCC_WORD_SIZE);
>>
>> Then you don't even need DCC_ADDR_RANGE_MASK.
>>
>> And then:
>>      len += base - addr;
>> And if necessary:
>>      len = round_up(addr, DCC_WORD_SIZE);
>> And finally:
>>      if (len > drvdata->ram_size / DCC_WORD_SIZE)
>>          return -EINVAL;
> 
> Ack

. . .

>>> +    if (ret)
>>> +        return -EFAULT;
>>> +    if (count > sizeof(buf) || count == 0)
>>> +        return -EINVAL;
>>> +
>>> +    curr_list = dcc_filp_curr_list(filp);
>>> +    if (curr_list < 0)
>>> +        return curr_list;
>>> +
>>> +    if (buf[count - 1] == '\n')
>>> +        buf[count - 1] = '\0';
>>> +    else
>>> +        return -EINVAL;
>> Why is it important for the input buffer to end in newline?
> 
> We are using the newline to convert the input buffer into a string
> 
> for strsep operations.

But strsep() returns the entire string if it finds the '\0'
before finding any of the delimiters.  So the effect should
be the same.  It's possible I'm misunderstanding but I think
there's no need for this check at all.

>>> +    /* EOF check */
>>> +    if (*ppos >= drvdata->ram_size)
>>> +        return 0;
>>> +
>>> +    if ((*ppos + len) > drvdata->ram_size)
>>> +        len = (drvdata->ram_size - *ppos);
>>> +
>>> +    buf = kzalloc(len, GFP_KERNEL);
>>
>> Now that you are using memremap() rather than ioremap()
>> for the ram_base memory, I don't think you have any need
>> to allocate a buffer here anymore.
> 
> Ack. As per Bjorn's comments this should be ioremaped.

OK, sorry, I didn't notice that.

> Can you please clarify whether this should be mapped to
> 
> mem or ioremap?

The reason I suggested memremap() was that the region you
are mapping is being treated as a block of RAM.  Bjorn
might know something about this that I don't know...

Here's an early LWN article which (at the end) explains
why/when one might want to use memremap().
   https://lwn.net/Articles/653585/
Where I have used it, I pass MEMREMAP_WC as the flag.

>>
>>> +    if (!buf)
>>> +        return -ENOMEM;
>>> +
>>> +    memcpy(buf, drvdata->ram_base + *ppos, len);
>>
>> That is, you can simply copy_to_user() into the (user)
>> data pointer, from drvdata->ram_base + *ppos.  Maybe
>> something like:
>>
>>      void *src;
>>      /* ... */
>>
>>      src = drvdata->ram_base + *ppos;
>>      if (copy_to_user(data, src, len))
>>          return -EFAULT;
>>
> 
> Ack
> 
>>> +    if (copy_to_user(data, buf, len)) {
>>> +        kfree(buf);
>>> +        return -EFAULT;
>>> +    }
>>> +
>>> +    *ppos += len;
>>> +
>>> +    kfree(buf);
>>> +
>>> +    return len;
>>> +}
>>> +
>>> +static const struct file_operations dcc_sram_fops = {
>>> +    .owner        = THIS_MODULE,
>>> +    .read        = dcc_sram_read,
>>> +    .llseek        = no_llseek,
>>> +};
>>> +
>>> +static int dcc_sram_dev_init(struct dcc_drvdata *drvdata)
>>> +{
>>> +    drvdata->sram_dev.minor = MISC_DYNAMIC_MINOR;
>>> +    drvdata->sram_dev.name = DCC_SRAM_NODE;
>>> +    drvdata->sram_dev.fops = &dcc_sram_fops;
>>> +
>>> +    return misc_register(&drvdata->sram_dev);
>>> +}
>>> +
>>> +static void dcc_sram_dev_exit(struct dcc_drvdata *drvdata)
>>> +{
>>> +    misc_deregister(&drvdata->sram_dev);
>>> +}
>>> +
>>> +static int dcc_probe(struct platform_device *pdev)
>>> +{
>>> +    u32 val;
>>> +    int ret = 0, i;
>>> +    struct device *dev = &pdev->dev;
>>> +    struct dcc_drvdata *dcc;
>>
>> Why do you use "dcc" here and "drvdata" elsewhere?
> 
> This was renamed in probe as per prior review comment.

I don't know who suggested that (maybe me?), but I guess I
prefer using the same (base) name for variables of a given
type.  So if you call it "dcc" here, then maybe call it
"dcc" everywhere.

I haven't looked closely at your patch just now, but it's
possible the "struct dcc_drvdata" type could simply be
"struct dcc".  That is, a "dcc" structure represents a
single "dcc" instance, and you happen to store a copy of
that "dcc" pointer as the device's drvdata.

Something for you to consider, but this isn't as important
a suggestion as a few other comments I've made.

>>> +    struct resource *res;
>>> +
>>> +    dcc = devm_kzalloc(dev, sizeof(*dcc), GFP_KERNEL);
>>> +    if (!dcc)
>>> +        return -ENOMEM;
>>> +
>>> +    dcc->dev = &pdev->dev;
>>> +    platform_set_drvdata(pdev, dcc);
>>> +
>>> +    dcc->base = devm_platform_ioremap_resource(pdev, 0);
>>> +    if (IS_ERR(dcc->base))
>>> +        return PTR_ERR(dcc->base);
>>> +
>>> +    res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
>>> +    if (!res)
>>> +        return -ENODEV;
>>> +
>>> +    dcc->ram_base = memremap(res->start, resource_size(res), 
>>> MEMREMAP_WB);
>>> +    if (!dcc->ram_base)
>>> +        return -ENODEV;

. . .

>>> +    /* Either set the fixed loop offset or calculate it
>>> +     * from ram_size. Max consecutive addresses the
>>> +     * dcc can loop is equivalent to the ram size
>>> +     */
>>> +    if (val & DCC_LOOP_OFFSET_MASK)
>>> +        dcc->loopoff = DCC_FIX_LOOP_OFFSET;
>>> +    else
>>> +        dcc->loopoff = get_bitmask_order((dcc->ram_size +
>>> +                dcc->ram_offset) / 4 - 1);
>>
>> Here's what I said about the above last time:
>>
>>    This get_bitmask_order() call to determine the offset of a
>>    register seems overly clever.  I think it warrants a little
>>    explanation why it's determined by the size of SRAM.
>>
>> I think part of what confuses me is why you use the sum
>> of ram_size and ram_offset.  I suppose 4 is DCC_WORD_SIZE
>> but I just don't know.  The comment I was suggesting was
>> something about what loopoff actually represents, and why
>> it's calculated this way.
> 
> As mentioned in the comment above, the loopoff stands for the max
> 
> consecutive addresses that can be given to the loop instruction. We
> 
> are restricting it as per the total words that can be accomodated in
> 
> the dcc_sram.

So you're taking the ram_size + ram_offset, which is the
the address just beyond the end of RAM.  (Right?)

Then you divide it by 4 (because 4 is the size of a "word"?).
To the result would be the end of RAM expressed as "words".

Then you subtract 1, which means "last word within RAM".

I think there are two things I find confusing:
- Why do you use ram_size + ram_offset?  The comment you
   added even says "Max consecutive addresses the dcc can
   loop is equivalent to the ram size", and that sounds
   like the loop_offset calculation should be working
   *only* with ram_size.
- You call get_bitmask_order() on this value, and I just
   don't see how that is related to a loop offset.

(Again, I'm not looking closely at the code right now, so
maybe I'm just forgetting something about the way this memory
is laid out.)

Thanks.

					-Alex