Message-ID: <Z6CKGu7URC1iGOVO@smile.fi.intel.com>
Date: Mon, 3 Feb 2025 11:19:22 +0200
From: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
Cc: Marek Szyprowski <m.szyprowski@...sung.com>,
	Mark Brown <broonie@...nel.org>, linux-kernel@...r.kernel.org,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	"Rafael J. Wysocki" <rafael@...nel.org>,
	Danilo Krummrich <dakr@...nel.org>,
	DRI mailing list <dri-devel@...ts.freedesktop.org>
Subject: Re: [PATCH v3 1/1] regmap: Synchronize cache for the page selector

On Sat, Feb 01, 2025 at 07:18:56PM +0200, Dmitry Baryshkov wrote:
> On Wed, Jan 29, 2025 at 05:07:52PM +0200, Andy Shevchenko wrote:
> > On Tue, Jan 28, 2025 at 06:43:26PM +0200, Andy Shevchenko wrote:
> > > On Tue, Jan 28, 2025 at 05:08:08PM +0100, Marek Szyprowski wrote:
> > > > On 21.01.2025 14:29, Andy Shevchenko wrote:
> > > > > On Tue, Jan 21, 2025 at 08:33:09AM +0100, Marek Szyprowski wrote:
> > > > >> On 17.01.2025 18:28, Andy Shevchenko wrote:
> > > > >>> On Fri, Jan 17, 2025 at 05:05:42PM +0100, Marek Szyprowski wrote:
> > > > >>>
> > > > >>> Does it fail in the same way?
> > > > >> Yes, the hw revision is reported as zero in this case: LT9611 revision:
> > > > >> 0x00.00.00
> > > > > Hmm... This is very interesting! It means that the page selector is a bit
> > > > > magical there. Dmitry, can you chime in and perhaps shed some light on this?
> > > > >
> > > > >>>> Does it mean that there is really a bug in the driver?
> > > > >>> Without looking at the datasheet it's hard to say. At least what I found so far
> > > > >>> is one page of the I²C traffic dump on Windows as an example how to use their
> > > > >>> evaluation board and software, but it doesn't unveil the bigger picture. At
> > > > >>> least what I think is going on here is that the programming is not so easy as
> > > > >>> just paging. Something is more complicated there.
> > > > >>>
> > > > > >>> But at least (and as Mark said) most of the regmap-based drivers got
> > > > > >>> the ranges wrong (so there is at least one bug in the driver).
> > > > >> I can do more experiments if this helps. Do you need a dump of all
> > > > >> regmap accesses or i2c traffic from this driver?
> > > > > It would be helpful! Traces from the failed and non-failed cases
> > > > > till the firmware revision and chip ID reading would be enough to
> > > > > start with.
> > > > 
> > > > I'm sorry for the delay, I was a bit busy with other stuff.
> > > 
> > > No problem and thanks for sharing.
> > > 
> > > > Here are logs (all values are in hex):
> > > > 
> > > > next-20250128 (probe broken):
> > > > root@...get:~# dmesg | grep regmap
> > > > [   14.817604] regmap_write reg 80ee <- 1
> > > > [   14.823036] regmap_read reg 8100 -> 0
> > > > [   14.827631] regmap_read reg 8101 -> 0
> > > > [   14.832130] regmap_read reg 8102 -> 0
> > > 
> > > 
> > > 
> > > > next-20250128 + 1fd60ed1700c reverted (probe okay):
> > > > root@...get:~# dmesg | grep regmap
> > > > [   13.565920] regmap_write reg 80ee <- 1
> > > > [   13.567509] regmap_read reg 8100 -> 17
> > > > [   13.568219] regmap_read reg 8101 -> 4
> > > > [   13.568909] regmap_read reg 8102 -> 93
> > > 
> > > Something is missing here. Like we have an identical start and an immediate
> > > failure. If you did it via printk() approach, it's probably wrong as my patch
> > > uses internal regmap function. Most likely you need to enable trace events
> > > for regmap and collect those for let's say 2 seconds:
> > > 
> > > 	echo 1 > ...trace events...
> > > 	modprobe ...
> > > 	sleep 2
> > > 	echo 0 > ...trace events...
> > > 
> > > and dump the buffer to a file. It might contain more than needed, though,
> > > as some other devices might also use regmap at the same time. I don't remember
> > > if the trace events for regmap have a device instance name field which can be
> > > used as a filter.
> > > 
> > > Alternatively you may also try to add a printk() into regmap core, but I don't
> > > think it's more practical than trace events.
> > 
> > Meanwhile, can you test this patch (on top of non-working case)?
> > 
> > diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
> > index 2314744201b4..f799a7a80231 100644
> > --- a/drivers/base/regmap/regmap.c
> > +++ b/drivers/base/regmap/regmap.c
> > @@ -1553,8 +1553,19 @@ static int _regmap_select_page(struct regmap *map, unsigned int *reg,
> >  		 * virtual copy as well.
> >  		 */
> >  		if (page_chg &&
> > -		    in_range(range->selector_reg, range->window_start, range->window_len))
> > +		    in_range(range->selector_reg, range->window_start, range->window_len)) {
> > +			bool bypass, cache_only;
> > +
> > +			bypass = map->cache_bypass;
> > +			cache_only = map->cache_only;
> > +			map->cache_bypass = false;
> > +			map->cache_only = true;
> > +
> >  			_regmap_update_bits(map, sel_register, mask, val, NULL, false);
> > +
> > +			map->cache_bypass = bypass;
> > +			map->cache_only = cache_only;
> > +		}
> >  	}
> >  
> >  	*reg = range->window_start + win_offset;
> > 
> > If I understood the case, the affected driver doesn't use the cache and we
> > actually write to the selector register twice, which most likely messes things up.
> 
> Unfortunately I can not comment regarding the LT9611UXC itself, the
> datasheet that I have here is pretty cryptic, incomplete and partially
> written in Mandarin.
> 
> This patch though fixes an issue for me too, So:
> 
> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org> # Qualcomm RB1

Thank you, guys, for reporting and testing. It seems, however, that this
simple-looking problem requires a lot of changes to solve without regressions
(this fix on top of the fix introduces a regression for those who have the
cache enabled), so for now I propose to revert (or drop) the original patch
if possible. Mark, what is your preference?

> > But this is only a theory (since we don't have the traces yet).
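
To illustrate the theory quoted above, here is a minimal userspace sketch. It
is not the regmap code. struct fake_map, fake_write_sel(), select_page() and
the window layout (selector at 0xff, window length 0x100) are all hypothetical
and chosen only for illustration. It models a map without a cache
(cache_bypass set) and counts how many selector writes reach the bus, with and
without the save/restore of the cache flags that the test patch adds. It does
not model the regression for maps that have the cache enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout, for illustration only. */
#define WINDOW_LEN 0x100u

/* Hypothetical stand-in for the few regmap fields the patch touches. */
struct fake_map {
	bool cache_bypass;
	bool cache_only;
	unsigned int cached_sel;	/* virtual (cached) copy of the selector */
	unsigned int hw_sel;		/* value held by the simulated hardware */
	unsigned int bus_writes;	/* selector writes that reached the bus */
};

/* Simplified write: cache is updated unless bypassed, bus unless cache-only. */
static void fake_write_sel(struct fake_map *map, unsigned int val)
{
	if (!map->cache_bypass)
		map->cached_sel = val;
	if (!map->cache_only) {
		map->hw_sel = val;
		map->bus_writes++;
	}
}

/* Split a virtual register number into page and window offset. */
static unsigned int reg_to_page(unsigned int reg)
{
	return reg / WINDOW_LEN;
}

static unsigned int reg_to_offset(unsigned int reg)
{
	return reg % WINDOW_LEN;
}

/* Update only the cached copy, as the test patch does around the sync. */
static void sync_sel_cache_only(struct fake_map *map, unsigned int page)
{
	bool bypass = map->cache_bypass;
	bool cache_only = map->cache_only;

	map->cache_bypass = false;
	map->cache_only = true;
	fake_write_sel(map, page);
	map->cache_bypass = bypass;
	map->cache_only = cache_only;
}

/* Select the page for reg and return the bus write count so far. */
static unsigned int select_page(struct fake_map *map, unsigned int reg,
				bool guarded)
{
	unsigned int page = reg_to_page(reg);

	fake_write_sel(map, page);		/* the real selector write */
	if (guarded)
		sync_sel_cache_only(map, page);	/* no bus traffic */
	else
		fake_write_sel(map, page);	/* second write hits the bus */

	return map->bus_writes;
}
```

With cache_bypass set, select_page(&map, 0x8100, false) counts two bus writes
to the selector, while select_page(&map, 0x8100, true) counts one and restores
both flags afterwards.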

-- 
With Best Regards,
Andy Shevchenko


