lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZoLWH9-JPFQB4YSu@pluto>
Date: Mon, 1 Jul 2024 17:15:27 +0100
From: Cristian Marussi <cristian.marussi@....com>
To: Luke Parkin <luke.parkin@....com>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
	arm-scmi@...r.kernel.org, sudeep.holla@....com,
	cristian.marussi@....com
Subject: Re: [PATCH 2/3] Track basic SCMI statistics

On Mon, Jul 01, 2024 at 03:28:50PM +0100, Luke Parkin wrote:
> Add scmi_debug_stats struct with atomic_t types to prevent racing.
> 

Hi

"Add SCMI debug statisticts based on atomic types to prevent races."

if you want to specify why you did it this way in the commit message,
but this seems really more something for a comment in the doxygen that
on the commit log.

> Add tracking of 5 initial statistics
> - sent_ok & sent_fail
> - response_ok & dlyd_response_ok
> - xfers_response_timeout

Morover I would not specify here in the log what you added, you can see
it from the code. "Add some initial stats counter" if you want.

> 
> Signed-off-by: Luke Parkin <luke.parkin@....com>
> ---
>  drivers/firmware/arm_scmi/driver.c | 46 +++++++++++++++++++++++++++++-
>  1 file changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c
> index 6b6957f4743f..f69dff699d48 100644
> --- a/drivers/firmware/arm_scmi/driver.c
> +++ b/drivers/firmware/arm_scmi/driver.c
> @@ -125,6 +125,22 @@ struct scmi_debug_info {
>  	bool is_atomic;
>  };
>  
> +/**
> + * struct scmi_debug_stats - Debug statistics
> + * @sent_ok: Count of successful sends
> + * @sent_fail: Count of failed sends
> + * @response_ok: Count of successful responses
> + * @dlyd_response_ok: Count of successful delayed responses
> + * @xfers_response_timeout: Count of xfer response timeouts
> + */
> +struct scmi_debug_stats {
> +	atomic_t sent_ok;
> +	atomic_t sent_fail;
> +	atomic_t response_ok;
> +	atomic_t dlyd_response_ok;
> +	atomic_t xfers_response_timeout;
> +};
> +
>  /**
>   * struct scmi_info - Structure representing a SCMI instance
>   *
> @@ -141,6 +157,7 @@ struct scmi_debug_info {
>   * @protocols: IDR for protocols' instance descriptors initialized for
>   *	       this SCMI instance: populated on protocol's first attempted
>   *	       usage.
> + * @stats: Contains several atomic_t's for tracking various statistics
>   * @protocols_mtx: A mutex to protect protocols instances initialization.
>   * @protocols_imp: List of protocols implemented, currently maximum of
>   *		   scmi_revision_info.num_protocols elements allocated by the
> @@ -174,6 +191,7 @@ struct scmi_info {
>  	struct idr tx_idr;
>  	struct idr rx_idr;
>  	struct idr protocols;
> +	struct scmi_debug_stats stats;

Pleaase move this field right after scmi_debug_info down below, so that
we keep all debug stuff together.

Also this is an embedded structure, not a bare pointer to dynamically
allocated stuff (and this is fine...) so you should probably ifdef
ARM_SCMI_DEBUG_STATISTICS at compile time the presence of this field itself
so as not to waste memory for something you never use...but this is
opinable because it will also, on the other side, polllute a bit the code
with unpleasant ifdeffery... so maybe Sudeep wont like it...the other option
would be to make this a pointer and conditionaly devm_kzalloc a struct to this
pointer (like scmi_debug_info)

>  	/* Ensure mutual exclusive access to protocols instance array */
>  	struct mutex protocols_mtx;
>  	u8 *protocols_imp;
> @@ -1143,7 +1161,12 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo,
>  						SCMI_RAW_REPLY_QUEUE,
>  						cinfo->id);
>  	}
> -
> +	if (IS_ENABLED(CONFIG_ARM_SCMI_DEBUG_STATISTICS)) {
> +		if (xfer->hdr.type == MSG_TYPE_DELAYED_RESP)
> +			atomic_inc(&info->stats.dlyd_response_ok);
> +		else
> +			atomic_inc(&info->stats.response_ok);
> +	}

Ok, so IMO, this is the main core thing to rework in this series: the
"counting" operation/block should be defined as a macro so that it can
be fully compiled out when STATS=n, because these are counters
incremented on the hot path for each message, not just once in a while,
so the above if(IS_ENABELD()) now will be there and evaluated even when
STATS=n.

Something like:

	#ifdef CONFIG_ARM_SCMI_DEBUG_STATISTICS
	#define SCMI_LOG_STATS(counter)			\
		<your magic here> 			\
	#else
	#define SCMI_LOG_STATS(counter)
	#endif

.... I have not thought it through eh...so it could be radically
different...the point is ... the counting machinery should just
disappear at compile time when STATS=n

Moreover please define this macros magic here in this patch BUT split
out in a distinct patch all the places where you make use of it, so that
this patch contains only definitions of mechanisms and struct and
another patch will contain all the places where stats are monitored.

>  	scmi_xfer_command_release(info, xfer);
>  }
>  
> @@ -1279,6 +1302,12 @@ static int scmi_wait_for_reply(struct device *dev, const struct scmi_desc *desc,
>  		}
>  	}
>  
> +	if (IS_ENABLED(CONFIG_ARM_SCMI_DEBUG_STATISTICS) && ret == -ETIMEDOUT) {
> +		struct scmi_info *info =
> +					handle_to_scmi_info(cinfo->handle);
> +		atomic_inc(&info->stats.xfers_response_timeout);
> +	}
> +

Ditto.

>  	return ret;
>  }
>  
> @@ -1414,6 +1443,13 @@ static int do_xfer(const struct scmi_protocol_handle *ph,
>  	trace_scmi_xfer_end(xfer->transfer_id, xfer->hdr.id,
>  			    xfer->hdr.protocol_id, xfer->hdr.seq, ret);
>  
> +	if (IS_ENABLED(CONFIG_ARM_SCMI_DEBUG_STATISTICS)) {
> +		if (ret == 0)
> +			atomic_inc(&info->stats.sent_ok);
> +		else
> +			atomic_inc(&info->stats.sent_fail);
> +	}
> +

Ditto.

>  	return ret;
>  }
>  
> @@ -2994,6 +3030,14 @@ static int scmi_probe(struct platform_device *pdev)
>  	handle->devm_protocol_get = scmi_devm_protocol_get;
>  	handle->devm_protocol_put = scmi_devm_protocol_put;
>  
> +	if (IS_ENABLED(CONFIG_ARM_SCMI_DEBUG_STATISTICS)) {
> +		atomic_set(&info->stats.response_ok, 0);
> +		atomic_set(&info->stats.sent_fail, 0);
> +		atomic_set(&info->stats.sent_ok, 0);
> +		atomic_set(&info->stats.dlyd_response_ok, 0);
> +		atomic_set(&info->stats.xfers_response_timeout, 0);
> +	}
> +

I think that you can just drop this zero initializations because the
stats are part of the info structure which is created devm_kzalloc'ed...
so zerod at creation time AND indeed atomic_t is just a structure containing
an int counter which will be already zeroed too...I dont think that any
other special init is done, it is already zero, and nothing more is done
by your atomic_set(v, 0)...but feel free to verify I am not missing
something, please.

Thanks,
Cristian


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ