Message-ID: <CAPE3x15Qxy5+C3_1v6a6YBoz03=NVoJMz-yfc6qih_=_js8=ug@mail.gmail.com>
Date: Fri, 18 Jul 2025 16:31:00 -0700
From: Salomon Dushimirimana <salomondush@...gle.com>
To: Sathya Prakash Veerichetty <sathya.prakash@...adcom.com>
Cc: bvanassche@....org, James.Bottomley@...senpartnership.com, 
	kashyap.desai@...adcom.com, linux-kernel@...r.kernel.org, 
	linux-scsi@...r.kernel.org, martin.petersen@...cle.com, 
	mpi3mr-linuxdrv.pdl@...adcom.com, sreekanth.reddy@...adcom.com, 
	sumit.saxena@...adcom.com
Subject: Re: [PATCH v2] scsi: mpi3mr: Emit uevent on controller diagnostic fault

When the controller encounters a fatal error, we want our userspace
tools to be notified so they can react and pull the corresponding
logs/snapdump from the IOC. Several other drivers do something similar,
for example drivers/scsi/qla2xxx and drivers/scsi/qedf/qedf_dbg.c.

The mpi3mr_issue_reset() function currently supports only two reset
types, i.e. MPI3_SYSIF_HOST_DIAG_RESET_ACTION_SOFT_RESET and
MPI3_SYSIF_HOST_DIAG_RESET_ACTION_DIAG_FAULT. From the code, it seems
that only a diag fault reset generates a snapdump and soft resets do
not, which is why we only emit the fatal uevent on diag fault.
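
To illustrate the consumer side, here is a rough sketch of how a
userspace tool might recognize this event. It assumes the payload has
already been read from a NETLINK_KOBJECT_UEVENT socket as a buffer of
NUL-terminated "KEY=value" strings, and that DRIVER carries the string
"mpi3mr". The helper names uevent_get() and is_mpi3mr_fatal() are
hypothetical and only for illustration, they are not part of the patch
or of any existing tool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up KEY in a uevent payload: NUL-terminated "KEY=value" strings
 * packed back to back in buf[0..len). Returns a pointer to the value,
 * or NULL if the key is absent. (Hypothetical helper.)
 */
static const char *uevent_get(const char *buf, size_t len, const char *key)
{
	size_t klen = strlen(key);
	size_t off = 0;

	assert(buf != NULL);
	while (off < len) {
		const char *field = buf + off;
		const char *end = memchr(field, '\0', len - off);
		size_t flen;

		if (!end)
			break;	/* unterminated trailing field: ignore it */
		flen = (size_t)(end - field);

		/* Match "KEY=" exactly at the start of the field. */
		if (flen > klen && field[klen] == '=' &&
		    strncmp(field, key, klen) == 0)
			return field + klen + 1;
		off += flen + 1;
	}
	return NULL;
}

/*
 * Return 1 if the payload looks like the fatal-error uevent described
 * in the patch (DRIVER=mpi3mr, EVENT_TYPE=FATAL_ERROR), else 0.
 * (Hypothetical helper; the DRIVER value is an assumption.)
 */
static int is_mpi3mr_fatal(const char *buf, size_t len)
{
	const char *drv = uevent_get(buf, len, "DRIVER");
	const char *type = uevent_get(buf, len, "EVENT_TYPE");

	return drv && type && strcmp(drv, "mpi3mr") == 0 &&
	       strcmp(type, "FATAL_ERROR") == 0;
}
```

A tool would call is_mpi3mr_fatal() on each received message and, on a
match, use HBA_NUM, RESET_TYPE and RESET_REASON to decide which host to
collect the snapdump from.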

Thanks,
Salomon Dushimirimana


On Fri, Jul 18, 2025 at 8:43 AM Sathya Prakash Veerichetty
<sathya.prakash@...adcom.com> wrote:
>
> On Thu, Jul 17, 2025 at 1:40 PM Salomon Dushimirimana
> <salomondush@...gle.com> wrote:
> >
> > Introduces a uevent mechanism to notify userspace when the controller
> > undergoes a reset due to a diagnostic fault. A new function,
> > mpi3mr_fault_uevent_emit(), is added and called from the reset path. This
> > function filters for a diagnostic fault type
> > (MPI3_SYSIF_HOST_DIAG_RESET_ACTION_DIAG_FAULT) and generates a uevent
> > containing details about the event:
> >
> > - DRIVER: mpi3mr in this case
> > - HBA_NUM: scsi host id
> > - EVENT_TYPE: indicates fatal error
> > - RESET_TYPE: type of reset that has occurred
> > - RESET_REASON: specific reason for the reset
> >
> > This will allow userspace tools to subscribe to these events and take
> > appropriate action.
> Why do userspace tools need to know about these events, and which
> userspace tools are we talking about here?  Also, on what basis was it
> decided that only a diag fault reset is considered FATAL?  I would
> prefer to understand the actual requirement before ACKing this patch.
> If we need this kind of userspace notification, it would be better to
> make it generic and send the notification for all firmware fault
> codes.
>
> >
> > Signed-off-by: Salomon Dushimirimana <salomondush@...gle.com>
> > ---
> > Changes in v2:
> > - Addressed feedback from Bart regarding use of __free(kfree) and more
> >
> >  drivers/scsi/mpi3mr/mpi3mr_fw.c | 37 +++++++++++++++++++++++++++++++++
> >  1 file changed, 37 insertions(+)
> >
> > diff --git a/drivers/scsi/mpi3mr/mpi3mr_fw.c b/drivers/scsi/mpi3mr/mpi3mr_fw.c
> > index 1d7901a8f0e40..a050c4535ad82 100644
> > --- a/drivers/scsi/mpi3mr/mpi3mr_fw.c
> > +++ b/drivers/scsi/mpi3mr/mpi3mr_fw.c
> > @@ -1623,6 +1623,42 @@ static inline void mpi3mr_set_diagsave(struct mpi3mr_ioc *mrioc)
> >         writel(ioc_config, &mrioc->sysif_regs->ioc_configuration);
> >  }
> >
> > +/**
> > + * mpi3mr_fault_uevent_emit - Emit uevent for a controller diagnostic fault
> > + * @mrioc: Pointer to the mpi3mr_ioc structure for the controller instance
> > + * @reset_type: The type of reset that has occurred
> > + * @reset_reason: The specific reason code for the reset
> > + *
> > + * This function is invoked when the controller undergoes a reset. It specifically
> > + * filters for MPI3_SYSIF_HOST_DIAG_RESET_ACTION_DIAG_FAULT and ignores other
> > + * reset types, such as soft resets.
> > + */
> > +static void mpi3mr_fault_uevent_emit(struct mpi3mr_ioc *mrioc, u16 reset_type,
> > +       u16 reset_reason)
> > +{
> > +       struct kobj_uevent_env *env __free(kfree) = NULL;
> > +
> > +       if (reset_type != MPI3_SYSIF_HOST_DIAG_RESET_ACTION_DIAG_FAULT)
> > +               return;
> > +
> > +       env = kzalloc(sizeof(*env), GFP_KERNEL);
> > +       if (!env)
> > +               return;
> > +
> > +       if (add_uevent_var(env, "DRIVER=%s", mrioc->driver_name))
> > +               return;
> > +       if (add_uevent_var(env, "HBA_NUM=%u", mrioc->id))
> > +               return;
> > +       if (add_uevent_var(env, "EVENT_TYPE=FATAL_ERROR"))
> > +               return;
> > +       if (add_uevent_var(env, "RESET_TYPE=%s", mpi3mr_reset_type_name(reset_type)))
> > +               return;
> > +       if (add_uevent_var(env, "RESET_REASON=%s", mpi3mr_reset_rc_name(reset_reason)))
> > +               return;
> > +
> > +       kobject_uevent_env(&mrioc->shost->shost_gendev.kobj, KOBJ_CHANGE, env->envp);
> > +}
> > +
> >  /**
> >   * mpi3mr_issue_reset - Issue reset to the controller
> >   * @mrioc: Adapter reference
> > @@ -1741,6 +1777,7 @@ static int mpi3mr_issue_reset(struct mpi3mr_ioc *mrioc, u16 reset_type,
> >             ioc_config);
> >         if (retval)
> >                 mrioc->unrecoverable = 1;
> > +       mpi3mr_fault_uevent_emit(mrioc, reset_type, reset_reason);
> >         return retval;
> >  }
> >
> > --
> > 2.50.0.727.gbf7dc18ff4-goog
> >
