Date:   Fri, 29 Sep 2023 11:49:52 +0300
From:   Ivaylo Dimitrov <ivo.g.dimitrov.75@...il.com>
To:     Sean Young <sean@...s.org>
Cc:     linux-media@...r.kernel.org, Tony Lindgren <tony@...mide.com>,
        Russell King <linux@...linux.org.uk>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Thierry Reding <thierry.reding@...il.com>,
        Uwe Kleine-König <u.kleine-koenig@...gutronix.de>,
        Timo Kokkonen <timo.t.kokkonen@....fi>,
        Pali Rohár <pali.rohar@...il.com>,
        "Sicelo A . Mhlongo" <absicsz@...il.com>,
        linux-omap@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, linux-pwm@...r.kernel.org
Subject: Re: [PATCH v5 2/2] media: rc: remove ir-rx51 in favour of generic
 pwm-ir-tx

Hi,

On 26.09.23 at 23:18, Sean Young wrote:
> On Tue, Sep 26, 2023 at 03:43:18PM +0300, Ivaylo Dimitrov wrote:
>> On 26.09.23 at 10:16, Sean Young wrote:
>>> On Mon, Sep 25, 2023 at 07:06:44PM +0300, Ivaylo Dimitrov wrote:
>>>> On 1.09.23 at 17:18, Sean Young wrote:
>>>>> The ir-rx51 is a pwm-based TX driver specific to the N900. This can be
>>>>> handled entirely by the generic pwm-ir-tx driver, and in fact the
>>>>> pwm-ir-tx driver has been compatible with ir-rx51 from the start.
>>>>>
>>>>
>>>> Unfortunately, pwm-ir-tx does not work on n900. My investigation shows that
>>>> for some reason usleep_range() sleeps for at least 300-400 us longer than
>>>> the interval it is asked to sleep. I played with cyclictest from the rt-tests
>>>> package and it gives similar results - increasing the priority helps, but I
>>>> was not able to make it sleep for less than 300 us on average. I tried
>>>> cpu_latency_qos_add_request() in pwm-ir-tx, but it made no difference.
>>>>
>>>> I get similar results on motorola droid4 (OMAP4), albeit there average sleep
>>>> is in 200-300 us range, which makes me believe that either OMAPs have issues
>>>> with hrtimers or the config we use has some issue which leads to scheduler
>>>> latency. Or, something else...
>>>
>>> The pwm-ir-tx driver does suffer from this problem, but I was under the
>>> impression that the ir-rx51 has the same problem.
>>>
>>
>> Could you elaborate on the "pwm-ir-tx driver does suffer from this problem"?
>> Where do you see that?
> 
> So on a raspberry pi (model 3b), if I use the pwm-ir-tx driver, I get random
> delays of up to 100us. It's a bit random and certainly depends on the load.
> 
> I'm measuring using a logic analyzer.
> 
> There have been reports by others on different machines with random delays
> and/or transmit failures (as in the receiver occasionally fails to decode
> the IR). I usually suggest they use the gpio-ir-tx driver, which does work
> as far as I know (the signal looks perfect with a logic analyzer).
> 
> So far I've taken the view that the driver works ok for most situations,
> since IR is usually fine with up to 100us missing here or there.
> 
> The gpio-ir-tx driver works much better because it does the entire send
> under spinlock - obviously that has its own problems, because an IR transmit
> can be 10s or even 100s of milliseconds.
> 
> I've never known of a solution to the pwm-ir-tx driver. If using hrtimers
> directly improves the situation even a bit, then that would be great.
> 

The issue with hrtimers is that we cannot use them directly, as 
pwm_apply_state() may sleep, but the hrtimer callback is called in atomic 
context.

>> ir-rx51 does not suffer from the same problem (albeit it has its own one,
>> see below)
>>
>>>> In either case help is appreciated to dig further trying to find the reason
>>>> for such a big delay.
>>>
>>> pwm-ir-tx uses usleep_range() and ir-rx51 uses hrtimers. I thought that
>>> usleep_range() uses hrtimers; however if you're not seeing the same delay
>>> on ir-rx51 then maybe it's time to switch pwm-ir-tx to hrtimers.
>>>
>>
>> usleep_range() is backed by hrtimers already, however the difference comes
>> from how hrtimer is used in ir-rx51: it uses timer callback function that
>> gets called in softirq context, while usleep_range() puts the task in
>> TASK_UNINTERRUPTIBLE state and then calls schedule_hrtimeout_range(). For
>> some reason it takes at least 200-400 us (on average) even on OMAP4 to
>> switch back to TASK_RUNNING state.
>>
>> The issue with ir-rx51 and the way it uses hrtimers is that it calls
>> pwm_apply_state() from hrtimer function, which is not ok, per the comment
>> here
>> https://elixir.bootlin.com/linux/v6.6-rc3/source/drivers/pwm/core.c#L502
>>
>> I can make pwm-ir-tx switch to hrtimers, that's not an issue, but I am
>> afraid that there is some general scheduler or timers (or something else)
>> issue that manifests itself with usleep_range() misbehaving.
> 
> If we can switch pwm-ir-tx to hrtimers, that would be great.
> 

I wrote a proof of concept, but unfortunately it more or less failed. The 
idea is: set up an hrtimer, start it in pwm_ir_tx(), and call 
wait_for_completion() in a loop while the timer function calls 
complete(). While it improves things a bit, I wouldn't say it makes the 
driver work properly on the n900 - my TV registers only about one in 
5-10 pulse packages.
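
For reference, the wake-up latency discussed above can be observed from 
user space with a few lines of C, in the spirit of cyclictest. This is 
only a sketch under assumptions, not what the driver does: 
sleep_overshoot_ns() is a hypothetical helper, and it measures 
clock_nanosleep() in a user task rather than wait_for_completion() in 
the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: sleep until an absolute CLOCK_MONOTONIC deadline
 * sleep_ns from now and return how many nanoseconds late the wake-up
 * was. Returns -1 if a clock call fails.
 */
int64_t sleep_overshoot_ns(long sleep_ns)
{
    struct timespec deadline, now;
    int rc;

    if (clock_gettime(CLOCK_MONOTONIC, &deadline) != 0)
        return -1;

    /* Build the absolute deadline, keeping tv_nsec normalised. */
    deadline.tv_sec += sleep_ns / NSEC_PER_SEC;
    deadline.tv_nsec += sleep_ns % NSEC_PER_SEC;
    if (deadline.tv_nsec >= NSEC_PER_SEC) {
        deadline.tv_nsec -= NSEC_PER_SEC;
        deadline.tv_sec++;
    }

    /* clock_nanosleep() returns the error number directly. */
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                             &deadline, NULL);
    } while (rc == EINTR);
    if (rc != 0)
        return -1;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;

    return (int64_t)(now.tv_sec - deadline.tv_sec) * NSEC_PER_SEC +
           (now.tv_nsec - deadline.tv_nsec);
}
```

Calling sleep_overshoot_ns(500000) in a loop and averaging the result 
gives a rough figure to compare against the 300-400 us mentioned above.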

We have a couple of issues:

- the scheduler seems to use a 32 kHz timer, which means that we can 
never have a precise pulse width; the error is up to ~30 us no matter 
what we do, IIUC.

- wait_for_completion() suffers from the same latency issue that 
usleep_range() has - it returns 300-400 us after complete() has been 
called in the timer function.

- turning pwm off needs ~300us, because of either omap_dm_timer_stop() 
calling clk_get_rate() or __omap_dm_timer_stop() waiting for fclk period 
* 3.5 (see 
https://elixir.bootlin.com/linux/v6.6-rc3/source/drivers/clocksource/timer-ti-dm.c#L269)

- in order to achieve a sane latency distribution, I have to call 
set_user_nice(current, MIN_NICE) in pwm_ir_tx()
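
To put a number on the ~30 us figure from the first item: the sketch 
below rounds a requested pulse up to whole timer ticks and reports the 
resulting error. It assumes the "32kHz" timer runs at exactly 32768 Hz 
and that durations are rounded up to the next tick; both are my 
assumptions, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TIMER_HZ     32768ULL      /* assumed rate of the "32kHz" timer */
#define NSEC_PER_SEC 1000000000ULL

/* Number of whole timer ticks needed to cover ns nanoseconds (round up). */
uint64_t ticks_for_ns(uint64_t ns)
{
    /* Guard against overflow in ns * TIMER_HZ. */
    assert(ns <= UINT64_MAX / TIMER_HZ - NSEC_PER_SEC);
    return (ns * TIMER_HZ + NSEC_PER_SEC - 1) / NSEC_PER_SEC;
}

/* Extra time, in ns, added by rounding ns up to whole ticks (truncated). */
uint64_t quantization_error_ns(uint64_t ns)
{
    uint64_t actual_ns = ticks_for_ns(ns) * NSEC_PER_SEC / TIMER_HZ;

    return actual_ns - ns;
}
```

One tick is 1/32768 s, roughly 30.5 us, so the error is always below 
that. A 562.5 us pulse (the NEC burst length, used here only as an 
example) needs 19 ticks, i.e. about 579.8 us, which is about 17.3 us 
too long.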


> The ir-rx51 removal patches have already been queued to media_staging;
> we may have to remove them from there if we can't solve this problem.
> 

ir-rx51 has the conceptual problem of calling a function that might sleep 
from atomic context. However, we can fix omap_dm_timer_stop() so that it 
does not call clk_get_rate(), and that would make it work. So yeah, if we 
can't fix pwm-ir-tx, then dropping the removal patches, along with fixing 
dmtimer and a couple of code issues in ir-rx51, seems to be the only 
option for a working IRTX on the n900. Maybe we can rename it to 
pwm-ir-tx-hrtimer, as there is nothing n900-specific in it.

>>> I don't have a n900 to test on, unfortunately.
>>>
>>
>> I have and once I have an idea what's going on will port pwm-ir-tx to
>> hrtimers, if needed. Don't want to do it now as I am afraid the completion I
>> will have to use will have the same latency problems as usleep_range()
> 
> That would be fantastic. Please do keep us up to date with how you are
> getting on. Like I said, it would be nice to have this resolved before the next
> merge window.
> 

The only thing I haven't tried yet is to start another thread and set 
it to use the FIFO scheduler. I will report back once I have tried that.
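
As a user-space illustration of what "use the FIFO scheduler" means, the 
sketch below moves the calling thread to SCHED_FIFO. It is not the 
driver change itself: try_set_fifo() is a hypothetical helper, and in 
the kernel the dedicated thread would more likely use something like 
sched_set_fifo(). Without the right privileges the call is expected to 
fail with EPERM.

```c
#include <errno.h>
#include <sched.h>

/*
 * Hypothetical helper: switch the calling thread to SCHED_FIFO at the
 * given priority, clamped to the range the system reports as valid.
 * Returns 0 on success or the errno value on failure.
 */
int try_set_fifo(int prio)
{
    struct sched_param param = { 0 };
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo < 0 || hi < 0)
        return errno;

    if (prio < lo)
        prio = lo;
    if (prio > hi)
        prio = hi;
    param.sched_priority = prio;

    /* A pid of 0 means "the calling thread". */
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0)
        return errno;

    return 0;
}
```

A transmit thread would call try_set_fifo() once at start-up, before 
entering its timing loop.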

Regards,
Ivo

> Thanks,
> Sean
> 
