linux-kernel - Re: [PATCH] iio: dht11: Read bit stream from IRQ on falling edges only

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d96a7c1f-4a12-18c9-377d-df69b17168d2@feldner-bv.de>
Date:   Sat, 11 Feb 2023 11:41:32 +0100
From:   pelzi@...ing-snail.de
To:     harald@...ib.org
Cc:     Jonathan Cameron <jic23@...nel.org>,
        Lars-Peter Clausen <lars@...afoo.de>,
        linux-iio@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] iio: dht11: Read bit stream from IRQ on falling edges
 only

Am 07.02.23 um 11:33 schrieb harald@...ib.org:

> Thanks, these are indeed interresting results. If you want to move this
> forward, the next steps would be:
> 1) Sharing your test script - yes it's trivial, but still ...

That appears fairly easy, you find the script here:

https://github.com/pelzvieh/banana_resources/tree/main/test_dht11

> 2) A theoretical analysis about possible regressions depending on timer
> resolution as mentioned in an earlier message.

This sounds as if you were doing such an analysis for the original
version. Can you share this work so I can attempt to repeat it
for the modified algorithm?

> 3) Ideally figuring out, why your version performs better then what we
> currently have. I have some suspicions, but better understanding might
> lead to a better approach. E.g. maybe recording the other edges isn't
> the problem so long as we ignore them during decoding?
>
> As I see it, the main thing we are losing with your current proposal is
> some diagnostic features. If we keep them as much as possible and have
> regressions understood and covered, I see no reason to reject your idea.

That's why I changed the script to separately count EIO and ETIMEDOUT.
The latter indicates missed edges, the former failure to interpret
the data read.

What I see is that the patched driver's errors mostly result from missed
IRQ (note in contrast to last results, I cut the number of reads):

#    real[s]    user[s]    sys[s]    success    EIO    timeout err per succ
1     20.57    0.25    0.03    10    0    0    0
2     24.74    0.25    0.07    10    0    4    0,4
3     21.55    0.20    0.07    10    0    0    0
4     25.81    0.25    0.08    10    0    5    0,5
5     21.56    0.23    0.05    10    0    0    0
6     21.58    0.22    0.05    10    1    0    0,1
7     25.86    0.24    0.08    10    1    5    0,6
8     22.69    0.27    0.05    10    1    1    0,2
9     23.67    0.26    0.04    10    0    2    0,2
10     20.55    0.23    0.04    10    0    0    0

Whereas the original driver has more errors resulting from
mis-interpreted data:

#    real[s]    user[s]    sys[s]    success    EIO    timeout err per succ
1     24.88    0.26    0.07    10    5    4    0,9
2     25.91    0.26    0.07    10    4    5    0,9
3     31.27    0.31    0.10    10    6    10    1,6
4     29.17    0.32    0.11    10    7    8    1,5
5     22.73    0.24    0.08    10    4    2    0,6
6     46.46    0.35    0.25    10    19    24    4,3
7     23.79    0.23    0.09    10    3    3    0,6
8     30.17    0.27    0.11    10    6    9    1,5
9     23.77    0.26    0.06    10    3    2    0,5
10     20.58    0.24    0.06    10    1    0    0,1

I tried a variant that reads falling and rising edges and
uses the redundany of information to eliminate some errors.
This did not work out at all. It seems a relevant source of
trouble is delayed call to the IRQ handler. The problem is
that only then you try to find out if this IRQ is due to
rising or falling edge by reading the current GPIO level. When
you are to late, this might already have changed and you read
a level, but for the edge of _this_ level you'll receive another
IRQ a few us later.

So the reason that this patch here is showing
lower error rates seems to be the lower probability of such
things happening by halving the IRQs to be handled, _plus_
the information from the hardware, that this IRQ was due
to a falling edge.

Yours,

Andreas.