Message-ID: <20150624215649.GA16000@agluck-desk.sc.intel.com>
Date:	Wed, 24 Jun 2015 14:56:49 -0700
From:	"Luck, Tony" <tony.luck@...el.com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	bp@...e.de, linux-edac@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: changing format/size of data in TRACE_EVENT(extlog_mem_event)

In <ras/ras_event.h> we define a trace event for memory errors.
The last field is:

                __field_struct(struct cper_mem_err_compact, data)

where the structure is defined in <linux/cper.h> as:

struct cper_mem_err_compact {
        __u64   validation_bits;
        __u16   node;
        __u16   card;
        __u16   module;
        __u16   bank;
        __u16   device;
        __u16   row;
        __u16   column;
        __u16   bit_pos;
        __u64   requestor_id;
        __u64   responder_id;
        __u64   target_id;
        __u16   rank;
        __u16   mem_array_handle;
        __u16   mem_dev_handle;
};

This structure was defined based on the useful bits in the
UEFI 2.4 spec appendix N, section 2.5 "Memory Error Section".
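
To make the layout concrete, here is a minimal userspace sketch that mirrors the structure above with <stdint.h> types and checks a few field offsets. The name mem_err_compact_mirror is hypothetical, and the mirror uses natural alignment; whether the kernel header packs the structure is not stated here, so only offsets that do not depend on padding are checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the structure quoted above (hypothetical name). */
struct mem_err_compact_mirror {
	uint64_t validation_bits;
	uint16_t node;
	uint16_t card;
	uint16_t module;
	uint16_t bank;
	uint16_t device;
	uint16_t row;
	uint16_t column;
	uint16_t bit_pos;
	uint64_t requestor_id;
	uint64_t responder_id;
	uint64_t target_id;
	uint16_t rank;
	uint16_t mem_array_handle;
	uint16_t mem_dev_handle;
};

/* Eight 16-bit fields follow the 64-bit validation_bits, so the
 * 64-bit IDs start at byte 24 and rank lands at byte 48. */
static_assert(offsetof(struct mem_err_compact_mirror, device) == 16,
	      "device offset");
static_assert(offsetof(struct mem_err_compact_mirror, requestor_id) == 24,
	      "requestor_id offset");
static_assert(offsetof(struct mem_err_compact_mirror, rank) == 48,
	      "rank offset");
```

Any consumer that decodes the raw bytes of this field relies on offsets like these, which is why widening a member in the middle is visible to applications.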

But the UEFI Forum has released a new version of the spec, 2.5:

  http://www.uefi.org/sites/default/files/resources/UEFI%202_5.pdf

and things have been updated to cope with ever increasing memory sizes
thanks to Moore's law. The old structure got a couple of tweaks as a
quick band-aid to handle current problems (__u16 isn't big enough for
the "row" entry for some 64GB DIMMs, so they squeezed bits 16:17 into a
reserved field).  But looking to the future they added a whole new GUID
record "Memory Error Section 2" that increases the width of the device,
row, column, rank and bit_pos fields from u16 to u32 and adds a couple
of completely new fields.
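
As a rough illustration of the band-aid, the sketch below rebuilds an 18-bit row number from the 16-bit "row" entry plus two extra bits. It assumes the extra bits sit in the low two bits of an extension byte; the names full_row, ROW_EXT_MASK and ROW_EXT_SHIFT are hypothetical, and the exact bit placement should be checked against the spec.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: row bits 16:17 are the low two bits of the extension byte. */
#define ROW_EXT_MASK	0x3u
#define ROW_EXT_SHIFT	16

/* Combine the 16-bit row with the two extension bits into one value. */
static uint32_t full_row(uint16_t row, uint8_t extended)
{
	return (uint32_t)row |
	       (((uint32_t)extended & ROW_EXT_MASK) << ROW_EXT_SHIFT);
}
```

The result no longer fits in a __u16, which is the problem for the current trace event field.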

So the question is - how can we update the trace event to include these
new wider fields with the minimum pain to applications that look at it?
I don't know if there are any other consumers besides rasdaemon at the
moment ... but we don't want ugly transitions where you have to guess
which version of the application you need to run to work with a given
kernel version.
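
One way to picture the consumer side is an application that converts whatever it reads into a single wide in-memory form. This is only a sketch under assumptions, not a proposal for the actual event format: both structures are hypothetical, trimmed to the fields that change width, and how the application would tell the two record formats apart is left open.

```c
#include <assert.h>
#include <stdint.h>

/* Trimmed, hypothetical mirror of today's record: only the fields
 * that "Memory Error Section 2" widens, plus validation_bits. */
struct mem_err_narrow {
	uint64_t validation_bits;
	uint16_t device, row, column, bit_pos, rank;
};

/* Hypothetical widened form with the same fields as 32-bit values. */
struct mem_err_wide {
	uint64_t validation_bits;
	uint32_t device, row, column, bit_pos, rank;
};

/* Zero-extend an old-style record so the rest of the application
 * only ever handles the wide layout. */
static struct mem_err_wide widen(const struct mem_err_narrow *n)
{
	struct mem_err_wide w = {
		.validation_bits = n->validation_bits,
		.device = n->device,
		.row = n->row,
		.column = n->column,
		.bit_pos = n->bit_pos,
		.rank = n->rank,
	};
	return w;
}
```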

-Tony