Message-ID: <CAMRc=MeivPz2nOjgFwYscZQpbuXnt=z5JAVMB4uzahQJgKjdKg@mail.gmail.com>
Date:   Tue, 11 Apr 2023 15:11:36 +0200
From:   Bartosz Golaszewski <brgl@...ev.pl>
To:     Linus Walleij <linus.walleij@...aro.org>
Cc:     Naresh Kamboju <naresh.kamboju@...aro.org>,
        Bartosz Golaszewski <bartosz.golaszewski@...aro.org>,
        "open list:GPIO SUBSYSTEM" <linux-gpio@...r.kernel.org>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Anders Roxell <anders.roxell@...aro.org>,
        Linux-Next Mailing List <linux-next@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>,
        lkft-triage@...ts.linaro.org,
        "open list:KERNEL SELFTEST FRAMEWORK" 
        <linux-kselftest@...r.kernel.org>, linux-mm <linux-mm@...ck.org>,
        Arnd Bergmann <arnd@...db.de>, Shuah Khan <shuah@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Pengfei Xu <pengfei.xu@...el.com>, yi1.lai@...el.com
Subject: Re: selftests: gpio: crash on arm64

On Tue, Apr 11, 2023 at 10:57 AM Linus Walleij <linus.walleij@...aro.org> wrote:
>
> On Mon, Apr 10, 2023 at 11:16 AM Naresh Kamboju
> <naresh.kamboju@...aro.org> wrote:
> (...)
> > Anders performed bisection on this problem.
> > The bisection has been pointing to this commit:
> >   first bad commit: [24c94060fc9b4e0f19e6e018869db46db21d6bc7]
> >     gpiolib: ensure that fwnode is properly set
>
> I don't think this is the real issue.
>
> (...)
> > # 2.  Module load error tests
> > # 2.1 gpio overflow
> (...)
> > [   88.900984] Freed in software_node_release+0xdc/0x108 age=34 cpu=1 pid=683
> > [   88.907899]  __kmem_cache_free+0x2a4/0x2e0
> > [   88.912024]  kfree+0xc0/0x1a0
> > [   88.915015]  software_node_release+0xdc/0x108
> > [   88.919402]  kobject_put+0xb0/0x220
> > [   88.922919]  software_node_notify_remove+0x98/0xe8
> > [   88.927741]  device_del+0x184/0x380
> > [   88.931259]  platform_device_del.part.0+0x24/0xa8
> > [   88.935995]  platform_device_unregister+0x30/0x50
>
> I think the refcount is wrong on the fwnode.
>
> The chip is allocated with devm_gpiochip_add_data() which will not call
> gpiochip_remove() until all references are removed by calling
> devm_gpio_chip_release().
>
> Add a pr_info() to devm_gpio_chip_release() in drivers/gpio/gpiolib-devres.c
> and see if the callback is even called. I think this could be the
> problem: if that isn't cleaned up, there will be dangling references.
>
> diff --git a/drivers/gpio/gpiolib-devres.c b/drivers/gpio/gpiolib-devres.c
> index fe9ce6b19f15..30a0622210d7 100644
> --- a/drivers/gpio/gpiolib-devres.c
> +++ b/drivers/gpio/gpiolib-devres.c
> @@ -394,6 +394,7 @@ static void devm_gpio_chip_release(void *data)
>  {
>         struct gpio_chip *gc = data;
>
> +       pr_info("GPIOCHIP %s WAS REMOVED BY DEVRES\n", gc->label);
>         gpiochip_remove(gc);
>  }
>
> If this isn't working we need to figure out what is holding a reference to
> the gpiochip.
>
> I don't know how the references to the gpiochip fwnode are supposed to
> drop to zero though? I didn't work with mockup much ...
>
> What I could think of is that maybe the mockup driver needs a .shutdown()
> callback to forcibly call gpiochip_remove(), and in that case it should
> be wrapped in a non-existent devm_gpiochip_remove() since devres
> is used to register it.
>
> Bartosz will know better though! I am pretty sure he has this working
> flawlessly so the tests must be doing something weird which is leaving
> references around.
>
> Yours,
> Linus Walleij

Interestingly, I'm not seeing this with either the gpio-sim selftests or
any of the libgpiod tests, which suggests it's the gpio-mockup module
that's doing something wrong (or something very right, in which case it
uncovers some otherwise hidden bug). Anyway, I'll try to spend some
time on it and figure it out, although I'd like to be done with
gpio-mockup altogether already.

Bart
