Message-ID: <YlbJrqma/kk3Lxk6@kroah.com>
Date: Wed, 13 Apr 2022 15:01:34 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Mukesh Ojha <quic_mojha@...cinc.com>
Cc: linux-kernel@...r.kernel.org, tglx@...utronix.de, sboyd@...nel.org,
johannes@...solutions.net, rafael@...nel.org
Subject: Re: Possible race in dev_coredumpm()-del_timer() path
On Wed, Apr 13, 2022 at 04:51:18PM +0530, Mukesh Ojha wrote:
>
>
> On 4/13/2022 4:28 PM, Greg KH wrote:
> > On Wed, Apr 13, 2022 at 03:46:39PM +0530, Mukesh Ojha wrote:
> > > On Wed, Apr 13, 2022 at 07:34:24AM +0200, Greg KH wrote:
> > > > On Wed, Apr 13, 2022 at 10:59:22AM +0530, Mukesh Ojha wrote:
> > > > > Hi All,
> > > > >
> > > > > We are hitting a race that leaves try_to_grab_pending() stuck.
> > > >
> > > > What kernel version are you using?
> > >
> > > 5.10
> >
> > 5.10.0 was released a very long time ago. Please use a more modern
> > kernel release :)
> >
> > > Sorry for the formatting mess.
> > >
> > > > > In the following scenario, while (p1) dev_coredumpm() is running, the
> > > > > devcd device is added to the framework and a uevent notification is
> > > > > sent to userspace, which results in a call to (p2) devcd_data_write().
> > > > > That call eventually tries to delete the queued timer, but in the racy
> > > > > scenario the timer is not queued yet.
> > > > > So debug objects reports a warning, and in the meantime the timer is
> > > > > initialized and queued from the p1 path.
> > > > > From the p2 path the timer is then overwritten again with
> > > > > timer->entry.pprev=NULL, and try_to_grab_pending() gets stuck.
> > >   p1                                      p2(X)
> > >
> > >   dev_coredump() uevent sent to userspace
> > >   device_add()  ========================> userspace process X reads the uevents
> > >                                           writes to devcd fd which
> > >                                           results into writes to
> > >
> > >                                           devcd_data_write()
> > >                                             mod_delayed_work()
> > >                                               try_to_grab_pending()
> > >                                                 del_timer()
> > >                                                   debug_assert_init()
> > >   INIT_DELAYED_WORK
> > >   schedule_delayed_work
> > >                                                     debug_object_fixup()
> >
> > Why do you have object debugging enabled? That's going to take a LONG
> > time, and will find bugs in your code. Perhaps like this one?
>
> There is no issue if we disable debug objects.
>
> Here, some client module tries to collect a dump via dev_coredumpm(),
> which creates a devcdX device and expects userspace to read the data.
> This might be exposing a synchronization issue between dev_coredumpm()
> and devcd_data_write(). Perhaps a mutex is needed?

Any reason you did not answer any of the questions I asked?
{sigh}
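
For reference, the mutex idea floated in the quoted text can be pictured
with a small userspace analogy. This is only a sketch under assumptions:
it uses pthreads rather than kernel primitives, it is not a proposed
patch, and struct fake_devcd, p1_coredump(), p2_data_write() and
run_sketch() are hypothetical names that do not exist in the kernel. The
p1 thread takes the lock before the device becomes visible and drops it
only after the timer is initialized and queued, so the p2 thread cannot
touch a half-set-up timer.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a devcd entry; NOT the kernel structure. */
struct fake_devcd {
    pthread_mutex_t lock;    /* serializes p1 setup against p2 */
    atomic_bool published;   /* models device_add() + uevent being visible */
    bool timer_initialized;  /* models INIT_DELAYED_WORK() having run */
    bool timer_queued;       /* models schedule_delayed_work() having run */
    bool saw_uninitialized;  /* set if p2 ever sees an uninitialized timer */
};

/* p1: analogy for the dev_coredumpm() path. */
static void *p1_coredump(void *arg)
{
    struct fake_devcd *d = arg;

    pthread_mutex_lock(&d->lock);
    atomic_store(&d->published, true); /* device visible, uevent "sent" */
    sched_yield();                     /* widen the window p2 could race in */
    d->timer_initialized = true;
    d->timer_queued = true;
    pthread_mutex_unlock(&d->lock);
    return NULL;
}

/* p2: analogy for the devcd_data_write() path triggered from userspace. */
static void *p2_data_write(void *arg)
{
    struct fake_devcd *d = arg;

    while (!atomic_load(&d->published))
        sched_yield();                 /* wait for the "uevent" */

    pthread_mutex_lock(&d->lock);      /* blocks until p1 finished setup */
    if (!d->timer_initialized)
        d->saw_uninitialized = true;   /* the racy case from the diagram */
    d->timer_queued = false;           /* models touching the queued timer */
    pthread_mutex_unlock(&d->lock);
    return NULL;
}

/* Returns 0 if p2 only ever saw an initialized timer, 1 if not, -1 on error. */
int run_sketch(void)
{
    struct fake_devcd d = { 0 };
    pthread_t t1, t2;

    if (pthread_mutex_init(&d.lock, NULL))
        return -1;
    atomic_init(&d.published, false);

    if (pthread_create(&t1, NULL, p1_coredump, &d))
        return -1;
    if (pthread_create(&t2, NULL, p2_data_write, &d)) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_mutex_destroy(&d.lock);
    return d.saw_uninitialized ? 1 : 0;
}
```

Calling run_sketch() from a main() and checking for 0 exercises the
ordering. Without the lock in p1_coredump(), p2 could observe
timer_initialized == false, which corresponds to the window shown in the
p1/p2 diagram.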