lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1jimbak0hh.fsf@starbuckisacylon.baylibre.com>
Date:   Fri, 16 Oct 2020 11:02:02 +0200
From:   Jerome Brunet <jbrunet@...libre.com>
To:     Sudeep Holla <sudeep.holla@....com>,
        Ionela Voinescu <ionela.voinescu@....com>
Cc:     Jassi Brar <jassisinghbrar@...il.com>,
        Kevin Hilman <khilman@...libre.com>,
        linux-amlogic@...ts.infradead.org, Da Xue <da@...re.computer>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mailbox: cancel timer before starting it


On Fri 16 Oct 2020 at 10:44, Sudeep Holla <sudeep.holla@....com> wrote:

> On Thu, Oct 15, 2020 at 03:29:35PM +0100, Ionela Voinescu wrote:
>> Hi Jerome,
>> 
>> On Thursday 15 Oct 2020 at 15:58:30 (+0200), Jerome Brunet wrote:
>> > 
>> > On Thu 15 Oct 2020 at 15:46, Ionela Voinescu <ionela.voinescu@....com> wrote:
>> > 
>> > > Hi guys,
>> > >
>> > > On Wednesday 23 Sep 2020 at 14:39:16 (+0200), Jerome Brunet wrote:
>> > >> If the txdone is done by polling, it is possible for msg_submit() to start
>> > >> the timer while txdone_hrtimer() callback is running. If the timer needs
>> > >> recheduling, it could already be enqueued by the time hrtimer_forward_now()
>> > >> is called, leading hrtimer to loudly complain.
>> > >> 
>> > >> WARNING: CPU: 3 PID: 74 at kernel/time/hrtimer.c:932 hrtimer_forward+0xc4/0x110
>> > >> CPU: 3 PID: 74 Comm: kworker/u8:1 Not tainted 5.9.0-rc2-00236-gd3520067d01c-dirty #5
>> > >> Hardware name: Libre Computer AML-S805X-AC (DT)
>> > >> Workqueue: events_freezable_power_ thermal_zone_device_check
>> > >> pstate: 20000085 (nzCv daIf -PAN -UAO BTYPE=--)
>> > >> pc : hrtimer_forward+0xc4/0x110
>> > >> lr : txdone_hrtimer+0xf8/0x118
>> > >> [...]
>> > >> 
>> > >> Canceling the timer before starting it ensure that the timer callback is
>> > >> not running when the timer is started, solving this race condition.
>> > >> 
>> > >> Fixes: 0cc67945ea59 ("mailbox: switch to hrtimer for tx_complete polling")
>> > >> Reported-by: Da Xue <da@...re.computer>
>> > >> Signed-off-by: Jerome Brunet <jbrunet@...libre.com>
>> > >> ---
>> > >>  drivers/mailbox/mailbox.c | 8 ++++++--
>> > >>  1 file changed, 6 insertions(+), 2 deletions(-)
>> > >> 
>> > >> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
>> > >> index 0b821a5b2db8..34f9ab01caef 100644
>> > >> --- a/drivers/mailbox/mailbox.c
>> > >> +++ b/drivers/mailbox/mailbox.c
>> > >> @@ -82,9 +82,13 @@ static void msg_submit(struct mbox_chan *chan)
>> > >>  exit:
>> > >>  	spin_unlock_irqrestore(&chan->lock, flags);
>> > >>  
>> > >> -	if (!err && (chan->txdone_method & TXDONE_BY_POLL))
>> > >> -		/* kick start the timer immediately to avoid delays */
>> > >> +	if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
>> > >> +		/* Disable the timer if already active ... */
>> > >> +		hrtimer_cancel(&chan->mbox->poll_hrt);
>> > >> +
>> > >> +		/* ... and kick start it immediately to avoid delays */
>> > >>  		hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
>> > >> +	}
>> > >>  }
>> > >>  
>> > >>  static void tx_tick(struct mbox_chan *chan, int r)
>> > >
>> > > I've tracked a regression back to this commit. Details to reproduce:
>> > 
>> > Hi Ionela,
>> > 
>> > I don't have access to your platform and I don't get what is going on
>> > from the log below.
>> > 
>> > Could you please give us a bit more details about what is going on ?
>> > 
>> 
>> I'm not familiar with the mailbox subsystem, so the best I can do right
>> now is to add Sudeep to Cc, in case this conflicts in some way with the
>> ARM MHU patches [1].
>>
>
> Not it can't be doorbell driver as we use SCPI(old firmware) with upstream
> MHU driver as is limiting the number of channels to be used.
>
>> In the meantime I'll get some traces and get more familiar with the
>> code.
>>
>
> I will try that too.

BTW, this issue was originally reported on amlogic platforms which also
use arm,mhu mailbox driver.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ