lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4F959FDE.2070304@redhat.com>
Date:	Mon, 23 Apr 2012 18:30:54 +0000
From:	Mauro Carvalho Chehab <mchehab@...hat.com>
To:	Borislav Petkov <bp@...64.org>
CC:	Linux Edac Mailing List <linux-edac@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Doug Thompson <norsk5@...oo.com>
Subject: Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers

Em 23-04-2012 17:49, Borislav Petkov escreveu:
> Subject: "edac.h: Prepare to handle with generic layers"
> 
> what does that even mean?
> 
> Do you per chance mean
> 
> 	"Add generic layers for describing a memory location"
> 
> or something similar?
> 
> On Mon, Apr 16, 2012 at 05:12:12PM -0300, Mauro Carvalho Chehab wrote:
>> The edac core were written with the idea that memory controllers
>> are able to directly access csrows, and that the channels are
>> used inside a csrows select.
>>
>> This is not true for FB-DIMM and RAMBUS memory controllers.
>>
>> Also, some recent advanced memory controllers don't present a per-csrows
>> view. Instead, they view memories as DIMM's, instead of ranks, accessed
> 
> 				       DIMMs
> 
>> via csrow/channel.
>>
>> So, changes are needed in order to allow the EDAC core to
>> work with all types of architectures.
>>
>> As a preparation for handling non-csrows based memory controllers,
> 
>  In preparation...
> 
>> adds some memory structs and a macro:
> 
>   add some...
> 
>> enum hw_event_mc_err_type: describes the type of error
>> 			   (corrected, uncorrected, fatal)
>>
>> To be used by the new edac_mc_handle_error function;
>>
>> enum edac_mc_layer: describes the type of a given Memory
> 
> 						    memory
> 
>> architecture layer (branch, channel, slot, csrow).
>>
>> struct edac_mc_layer: describes the properties of a memory
>> 		      layer (type, size, and if the layer
>> 		      will be used on a virtual csrow.
>>
>> GET_POS() - as the number of layers can vary from 1 to 3,
>> this macro converts from an address with up to 3 layers into
>> a linear address.
>>
>> Cc: Doug Thompson <norsk5@...oo.com>
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@...hat.com>
>> ---
>>  include/linux/edac.h |   83 +++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 files changed, 82 insertions(+), 1 deletions(-)
>>
>> diff --git a/include/linux/edac.h b/include/linux/edac.h
>> index 8b78bd0..0fdf6ba 100644
>> --- a/include/linux/edac.h
>> +++ b/include/linux/edac.h
>> @@ -67,6 +67,25 @@ enum dev_type {
>>  #define DEV_FLAG_X64		BIT(DEV_X64)
>>  
>>  /**
>> + * enum hw_event_mc_err_type - type of the detected error
>> + *
>> + * @HW_EVENT_ERR_CORRECTED:	Corrected Error - Indicates that an ECC
>> + *				corrected error was detected
>> + * @HW_EVENT_ERR_UNCORRECTED:	Uncorrected Error - Indicates an error that
>> + *				can't be corrected by ECC, but it is not
>> + *				factal (maybe it is on an unused memory area,
> 
> 				fatal
> 

Fixed all the above.

>> + *				or the memory controller could recover from
>> + *				it for example, by re-trying the operation).
>> + * @HW_EVENT_ERR_FATAL:		Fatal Error - Uncorrected error that could not
>> + *				be recovered.
>> + */
>> +enum hw_event_mc_err_type {
>> +	HW_EVENT_ERR_CORRECTED,
>> +	HW_EVENT_ERR_UNCORRECTED,
>> +	HW_EVENT_ERR_FATAL,
> 
> Need a terminating elem here:
> 	HW_EVENT_ERR_NUM,

Why? There's no place where the number of types is needed. It should be noticed
no other EDAC enum's have an element for the count.

IMHO, we should't add any code there that won't be used. If latter needed, such
change can be added anytime.

> 
>> +};
>> +
>> +/**
>>   * enum mem_type - memory types. For a more detailed reference, please see
>>   *			http://en.wikipedia.org/wiki/DRAM
>>   *
>> @@ -308,7 +327,69 @@ enum scrub_type {
>>   * PS - I enjoyed writing all that about as much as you enjoyed reading it.
>>   */
>>  
>> -/* FIXME: add a per-dimm ce error count */
>> +/**
>> + * enum edac_mc_layer - memory controller hierarchy layer
>> + *
>> + * @EDAC_MC_LAYER_BRANCH:	memory layer is named "branch"
>> + * @EDAC_MC_LAYER_CHANNEL:	memory layer is named "channel"
>> + * @EDAC_MC_LAYER_SLOT:		memory layer is named "slot"
>> + * @EDAC_MC_LAYER_CHIP_SELECT:	memory layer is named "chip select"
>> + *
>> + * This enum is used by the drivers to tell edac_mc_sysfs what name should
>> + * be used when describing a memory stick location.
>> + */
>> +enum edac_mc_layer_type {
>> +	EDAC_MC_LAYER_BRANCH,
>> +	EDAC_MC_LAYER_CHANNEL,
>> +	EDAC_MC_LAYER_SLOT,
>> +	EDAC_MC_LAYER_CHIP_SELECT,
> 
> ditto.

ditto.

> 
>> +};
>> +
>> +/**
>> + * struct edac_mc_layer - describes the memory controller hierarchy
>> + * @layer:		layer type
>> + * @size:maximum size of the layer
>> + * @is_csrow:		This layer is part of the "csrow" when old API
>> + *			compatibility mode is enabled. Otherwise, it is
>> + *			a channel
>> + */
>> +struct edac_mc_layer {
>> +	enum edac_mc_layer_type	type;
>> +	unsigned		size;
>> +	bool			is_csrow;
>> +};
> 
> Huh, why do you need is_csrow? Can't do
> 
> 	type = EDAC_MC_LAYER_CHIP_SELECT;
> 
> ?

No, that's different. For a csrow-based memory controller, is_csrow is equal to
type == EDAC_MC_LAYER_CHIP_SELECT, but, for the other memory controllers, this
is used to mark with layers will be used for the "fake csrow" exported by the
EDAC core by the legacy API.

This field will be dropped together with the legacy API on some future Kernel,
but, for now, it is needed, in order to avoid breaking the userspace API.

> 
>> +
>> +/*
>> + * Maximum number of layers used by the memory controller to uniquelly
> 
> 								uniquely

Fixed.

> 
>> + * identify a single memory stick.
>> + * NOTE: incrementing it would require changes at edac_mc_handle_error()
>> + * and at the routines at edac_mc_sysfs that create layers
> 
> Maybe add their names here with a regex or so: edac_mc_blabla_*
> ?

With regards to the changes at edac_mc_sysfs,  it will likely affect all per-dimm
routines, plus the counters reset logic. The problem of pointing to a set of
routines that need changes is that this list can/will change with time.

So, the intention behind this note is not to give an exhaustive list of what should
be changed, if EDAC_MAX_LAYERS is incremented. Instead, it is meant to give a
clue that incrementing the number of layers is not as easy as just changing
it: it would require to change the number of layers also at the code.

> 
>> + */
>> +#define EDAC_MAX_LAYERS		3
>> +
>> +/*
>> + * A loop could be used here to make it more generic, but, as we only have
>> + * 3 layers, this is a little faster. By design, layers can never be 0 or
>> + * more than 3. If that ever happens, a NULL is returned, causing an OOPS
>> + * during the memory allocation routine, with would point to the developer
>> + * that he's doing something wrong.
>> + */
>> +#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\
> 
> This is returning size per layers so it cannot be GET_POS(), AFAICT.
> EDAC_GET_SIZE or similar maybe?

This is not returning the size, per layers. It is returning a pointer to the
structure that holds the dimm.

> 
>> +	typeof(var) __p;						\
>> +	if ((nlayers) == 1)						\
>> +		__p = &var[lay0];					\
>> +	else if ((nlayers) == 2)					\
>> +		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
>> +	else if ((nlayers) == 3)					\
>> +		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
>> +			    ((layers[1]).size * (lay0))))];		\
>> +	else								\
>> +		__p = NULL;						\
>> +	__p;								\
>> +})
> 

Regards,
Mauro
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ