linux-kernel - Re: [PATCH] mailbox: forward the hrtimer if not queued and under a lock


Message-ID: <e3abb8c0-a42c-4eea-993e-3c8fcce0ae64@axis.com>
Date:   Mon, 23 May 2022 13:56:15 +0200
From:   Bjorn Ardo <bjorn.ardo@...s.com>
To:     Jassi Brar <jassisinghbrar@...il.com>
CC:     kernel <kernel@...s.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mailbox: forward the hrtimer if not queued and under a
 lock

Hi again,


On 4/20/22 10:28, Bjorn Ardo wrote:
>
>
> Our current solution uses 4 different mailbox channels 
> asynchronously. The code is part of a larger system, but I can put 
> down some time and try to extract the relevant parts if you still 
> think this is a client issue? But with my current understanding of the 
> code, the race between msg_submit() and txdone_hrtimer() is quite 
> clear, and with my proposed patch that removes this race we have been 
> able to run for a very long time without any problems (that is, several 
> days). Without the fix we get the race after 5-10 min.
>
>
>

I do not know if you have had any time to review my comments yet, but we 
have created some examples to trigger the error.


With the attached test module mailbox-loadtest.c I can trigger the error 
by attaching it to the two sides of a mailbox with the following 
devicetree code:

         mboxtest1 {
                 compatible = "mailbox-loadtest";
                 mbox-names = "ping", "pong";
                 mboxes = <&mbox_loop_pri 0 &mbox_loop_pri 1>;
         };

         mboxtest2 {
                 compatible = "mailbox-loadtest";
                 mbox-names = "pong", "ping";
                 mboxes = <&mbox_loop_scd 0 &mbox_loop_scd 1>;
         };


After that I create load on the mailbox by running the following loops 
(each on its respective system):

while echo 1 > /sys/kernel/debug/mboxtest1/ping ; do
usleep 1
done

while echo 1 > /sys/kernel/debug/mboxtest2/ping ; do
usleep 50000
done

After a few minutes (normally 2-5) I get errors.


With the patch I sent earlier, the errors go away.


We have also created mailbox-loopback.c, a loopback mailbox that can be 
used on the same system (to make testing easier on systems that do not 
have a hardware mailbox); it is also attached. It can be probed with the 
following devicetree code:

         mbox_loop_pri: mailbox_loop_pri {
                 compatible = "mailbox-loopback";
                 #mbox-cells = <1>;
                 side = <0>;
         };
         mbox_loop_scd: mailbox_loop_scd {
                 compatible = "mailbox-loopback";
                 #mbox-cells = <1>;
                 side = <1>;
         };

And with this loopback mailbox we have also been able to reproduce the 
errors without the patch applied.
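
Since the loopback mailbox puts both sides on one system, the two load 
loops above can be started from a single script. The following is only a 
minimal sketch: the ping_loop helper and the "run" argument are 
hypothetical names introduced here, and it assumes a sleep that accepts 
fractional seconds (as GNU coreutils does) in place of usleep. As in the 
loops above, each loop stops when a write to the debugfs file fails.

```shell
#!/bin/sh
# Hypothetical helper (not part of the attached modules).
# Writes "1" to a test module's ping file until a write fails or the
# optional iteration limit is reached.
#   $1 = debugfs file to write to
#   $2 = delay between writes, in seconds (fractional values assumed OK)
#   $3 = maximum number of writes, 0 or unset = unlimited
ping_loop() {
    file=$1 delay=$2 limit=${3:-0} count=0
    while echo 1 > "$file"; do
        count=$((count + 1))
        # Stop early only when a limit was requested.
        [ "$limit" -gt 0 ] && [ "$count" -ge "$limit" ] && break
        sleep "$delay"
    done
}

# Start both loops only when explicitly asked: sh mbox-load.sh run
if [ "${1:-}" = "run" ]; then
    # Same delays as the usleep 1 / usleep 50000 loops above.
    ping_loop /sys/kernel/debug/mboxtest1/ping 0.000001 &
    ping_loop /sys/kernel/debug/mboxtest2/ping 0.05 &
    wait
fi
```

Running both loops in the background and waiting on them keeps the fast 
and the slow side active at the same time, which matches the load 
pattern described above.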


Best Regards,

Björn


View attachment "mailbox-loadtest.c" of type "text/x-csrc" (5221 bytes)

View attachment "mailbox-loopback.c" of type "text/x-csrc" (4275 bytes)