Message-ID: <PH0PR03MB6366CFFFF5846F7018FFA03699E39@PH0PR03MB6366.namprd03.prod.outlook.com>
Date:   Wed, 21 Jul 2021 06:47:01 +0000
From:   "Sa, Nuno" <Nuno.Sa@...log.com>
To:     Mark Brown <broonie@...nel.org>,
        "Tachici, Alexandru" <Alexandru.Tachici@...log.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-spi@...r.kernel.org" <linux-spi@...r.kernel.org>
CC:     "nsaenz@...nel.org" <nsaenz@...nel.org>,
        "f.fainelli@...il.com" <f.fainelli@...il.com>,
        "rjui@...adcom.com" <rjui@...adcom.com>,
        "swarren@...dotorg.org" <swarren@...dotorg.org>,
        "bcm-kernel-feedback-list@...adcom.com" 
        <bcm-kernel-feedback-list@...adcom.com>,
        "bootc@...tc.net" <bootc@...tc.net>
Subject: RE: [PATCH 0/1] spi: spi-bcm2835: Fix deadlock

Hi all,

> From: Mark Brown <broonie@...nel.org>
> Sent: Tuesday, July 20, 2021 8:48 PM
> To: Tachici, Alexandru <Alexandru.Tachici@...log.com>; linux-
> kernel@...r.kernel.org; linux-spi@...r.kernel.org
> Cc: Mark Brown <broonie@...nel.org>; nsaenz@...nel.org;
> f.fainelli@...il.com; rjui@...adcom.com; swarren@...dotorg.org;
> bcm-kernel-feedback-list@...adcom.com; bootc@...tc.net; Sa,
> Nuno <Nuno.Sa@...log.com>
> Subject: Re: [PATCH 0/1] spi: spi-bcm2835: Fix deadlock
> 
> On Sat, 17 Jul 2021 00:02:44 +0300, alexandru.tachici@...log.com
> wrote:
> > The bcm2835_spi_transfer_one function can create a deadlock
> > if it is called while another thread already has the
> > CCF lock.
> >
> > This behavior was observed at boot and when trying to
> > print the clk_summary debugfs. I had registered
> > at the time multiple clocks of AD9545 through the CCF.
> > Tested this using an RPi 4 connected to AD9545 through SPI.
> >
> > [...]
> 
> Applied to
> 
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
> 
> Thanks!
> 
> [1/1] spi: spi-bcm2835: Fix deadlock
>       commit: c45c1e82bba130db4f19d9dbc1deefcf4ea994ed
> 
> All being well this means that it will be integrated into the linux-next
> tree (usually sometime in the next 24 hours) and sent to Linus during
> the next merge window (or sooner if it is a bug fix), however if
> problems are discovered then the patch may be dropped or reverted.
> 
> You may get further e-mails resulting from automated or manual testing
> and review of the tree, please engage with people reporting problems and
> send followup patches addressing any issues that are reported if needed.
> 
> If any updates are required or you are submitting further changes they
> should be sent as incremental updates against current git, existing
> patches will not be replaced.
> 
> Please add any relevant lists and maintainers to the CCs when replying
> to this mail.
> 
> Thanks,
> Mark

I'm really curious about this one and how we should proceed. Maybe this is not
new (just new to me) and the way to go is simply to "fix" the SPI controller whenever we hit the
issue? I'm asking because there's a more fundamental problem when these pieces
come together (CCF + SPI). It can potentially happen on any system that has an
SPI-based clock provider and whose SPI controller accesses the CCF in its
transfer function... From a quick look I can already see that [1], [2], [3] and
[4] could hit the same deadlock...


Honestly, I'm not sure what the fix is here, since when we look at the pieces individually
(CCF, SPI, SPI controller) there's nothing really wrong. The problem only shows up when they
are combined... My naive thinking is that having something like 'spi_sync_nodefer()' would
be a way to prevent this (or just changing 'spi_sync()' so that it can never defer the
message to the SPI thread).

Looking just at '__spi_pump_messages()', I can see that this is probably not trivial though...
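
To make the interaction concrete, here is a minimal user-space sketch of it, not kernel code. It assumes the CCF prepare lock behaves like a mutex that is reentrant for its owner, and it models that with a recursive pthread mutex. All names (prepare_lock, transfer_one, clk_op_spi_sync) are hypothetical stand-ins, and pthread_mutex_trylock() is used instead of a blocking lock so that the sketch reports the would-be deadlock rather than hanging:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CCF prepare lock (reentrant for its owner). */
static pthread_mutex_t prepare_lock;
static pthread_once_t prepare_lock_once = PTHREAD_ONCE_INIT;

static void prepare_lock_init(void)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	pthread_mutex_init(&prepare_lock, &attr);
	pthread_mutexattr_destroy(&attr);
}

/*
 * Models a controller transfer function that needs the clock lock.
 * Records whether a blocking lock would have had to wait.
 */
static void *transfer_one(void *arg)
{
	bool *would_block = arg;

	if (pthread_mutex_trylock(&prepare_lock) == EBUSY) {
		*would_block = true;	/* lock is owned by another thread */
	} else {
		*would_block = false;	/* got it (possibly re-entered) */
		pthread_mutex_unlock(&prepare_lock);
	}
	return NULL;
}

/*
 * Models a clock provider op that runs with the lock held and issues a
 * synchronous SPI transfer, either deferred to a separate "pump" thread
 * or executed directly in the caller's context.
 */
static bool clk_op_spi_sync(bool defer_to_pump_thread)
{
	bool would_block = false;
	pthread_t pump;

	pthread_once(&prepare_lock_once, prepare_lock_init);
	pthread_mutex_lock(&prepare_lock);

	if (defer_to_pump_thread) {
		int err = pthread_create(&pump, NULL, transfer_one, &would_block);

		assert(err == 0);
		/* The caller waits for completion while still holding the lock. */
		pthread_join(pump, NULL);
	} else {
		transfer_one(&would_block);
	}

	pthread_mutex_unlock(&prepare_lock);
	return would_block;
}
```

In this model clk_op_spi_sync(true) reports that the transfer would block, which with a real blocking lock means the caller and the pump thread wait on each other forever. clk_op_spi_sync(false) does not block because the lock is simply re-entered by its owner, which is the effect something like 'spi_sync_nodefer()' would be aiming for.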

[1]: https://elixir.bootlin.com/linux/v5.14-rc2/source/drivers/spi/spi-tegra20-slink.c#L686
[2]: https://elixir.bootlin.com/linux/latest/source/drivers/spi/spi-sun6i.c#L353
[3]: https://elixir.bootlin.com/linux/latest/source/drivers/spi/spi-sun4i.c#L271
[4]: https://elixir.bootlin.com/linux/latest/source/drivers/spi/spi-qcom-qspi.c#L237

- Nuno Sá
