# dmesg
# ndctl inject-error namespace0.0 -n 2 -B 8210
[29076.551909] {3}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[29076.561136] {3}[Hardware Error]: event severity: recoverable
[29076.567453] {3}[Hardware Error]:  Error 0, type: recoverable
[29076.573769] {3}[Hardware Error]:   section_type: memory error
[29076.580182] {3}[Hardware Error]:   error_status: Storage error in DRAM memory (0x0000000000000400)
[29076.590281] {3}[Hardware Error]:   physical_address: 0x00000040a0602400   <== 1st poison @ 0x400
[29076.597664] {3}[Hardware Error]:   physical_address_mask: 0xffffffffffffff00
[29076.605532] {3}[Hardware Error]:   node:0 card:0 module:1
[29076.611655] {3}[Hardware Error]:   error_type: 14, scrub uncorrected error
[29076.619447] Memory failure: 0x40a0602: recovery action for dax page: Recovered
[29076.627519] mce: [Hardware Error]: Machine check events logged
[29076.634033] nfit ACPI0012:00: addr in SPA 1 (0x4080000000, 0x1f80000000)
[29076.648805] nd_bus ndbus0: XXX nvdimm_bus_add_badrange: (0x40a0602000, 0x1000)   <== 1st call to nfit_handle_mce
[29077.877682] EDAC MC0: 1 UE memory read error on CPU_SrcID#0_MC#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x40a0602 offset:0x400 grain:32 - err_code:0x0000:0x009f SystemAddress:0x40a0602400 DevicePhysicalAddress:0x30602400 ProcessorSocketId:0x0 MemoryControllerId:0x0 ChannelAddress:0x430602400 ChannelId:0x0 PhysicalRankId:0x0 DimmSlotId:0x1 ChipSelect:0x4)
[29078.596454] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[29078.605682] {4}[Hardware Error]: event severity: recoverable
[29078.611997] {4}[Hardware Error]:  Error 0, type: recoverable
[29078.618313] {4}[Hardware Error]:   section_type: memory error
[29078.624727] {4}[Hardware Error]:   error_status: Storage error in DRAM memory (0x0000000000000400)
[29078.634817] {4}[Hardware Error]:   physical_address: 0x00000040a0602600   <== 2nd poison @ 0x600
[29078.642200] {4}[Hardware Error]:   physical_address_mask: 0xffffffffffffff00
[29078.650069] {4}[Hardware Error]:   node:0 card:0 module:1
[29078.656184] {4}[Hardware Error]:   error_type: 14, scrub uncorrected error
[29078.663944] Memory failure: 0x40a0602: recovery action for dax page: Recovered
[29078.672011] mce: [Hardware Error]: Machine check events logged
[29079.595327] nfit ACPI0012:00: XXX addr in SPA 1 (0x4080000000, 0x1f80000000)
[29079.603204] nfit ACPI0012:00: XXX new code nvdimm_bus_add_badrange
[29079.610106] nd_bus ndbus0: XXX nvdimm_bus_add_badrange: (0x40a0602000, 0x1000)   <== 2nd call to nfit_handle_mce
[29079.949531] EDAC MC0: 1 UE memory read error on CPU_SrcID#0_MC#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x40a0602 offset:0x600 grain:32 - err_code:0x0000:0x009f SystemAddress:0x40a0602600 DevicePhysicalAddress:0x30602600 ProcessorSocketId:0x0 MemoryControllerId:0x0 ChannelAddress:0x430602600 ChannelId:0x0 PhysicalRankId:0x0 DimmSlotId:0x1 ChipSelect:0x4)
[29102.630372] nd_bus ndbus0: XXX nvdimm_bus_add_badrange: (0x40a0602400, 0x100)   <== short ARS found 2 poisons
[29102.638341] nd_bus ndbus0: XXX nvdimm_bus_add_badrange: (0x40a0602600, 0x100)

{
  "dev":"namespace0.0",
  "mode":"fsdax",
  "map":"dev",
  "size":33820770304,
  "uuid":"a1b0f07f-747f-40a8-bcd4-de1560a1ef75",
  "sector_size":512,
  "align":2097152,
  "blockdev":"pmem0",
  "badblock_count":8,
  "badblocks":[
    {
      "offset":8208,
      "length":8,
      "dimms":[
        "nmem0"
      ]
    }
  ]
}
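
The numbers in the log can be cross-checked with a little arithmetic. The sketch below is only an illustration, not kernel or ndctl code: the helper names `page_offsets` and `page_aligned_badblocks` are hypothetical, the 512-byte sector comes from the "sector_size" field above, and the 4 KiB page size is assumed from the 0x1000 length in the nvdimm_bus_add_badrange lines. It suggests why two 512-byte poisons injected at block 8210 show up at page offsets 0x400 and 0x600, and why a page-granular bad range turns them into a single badblocks entry of offset 8208, length 8.

```python
SECTOR = 512   # bytes, from "sector_size":512 in the namespace listing
PAGE = 4096    # bytes, assumed from the 0x1000 length logged by nvdimm_bus_add_badrange


def page_offsets(first_block, count, sector=SECTOR, page=PAGE):
    """Byte offset within its page for each injected block (hypothetical helper)."""
    return [(block * sector) % page for block in range(first_block, first_block + count)]


def page_aligned_badblocks(first_block, count, sector=SECTOR, page=PAGE):
    """Round the injected span out to page boundaries.

    Returns (offset, length) in sectors, mirroring the badblocks entry
    that results when the whole page is recorded as bad.
    """
    start = first_block * sector
    end = (first_block + count) * sector
    aligned_start = (start // page) * page
    aligned_end = -(-end // page) * page  # ceiling division, then back to bytes
    return aligned_start // sector, (aligned_end - aligned_start) // sector


if __name__ == "__main__":
    # Same parameters as: ndctl inject-error namespace0.0 -n 2 -B 8210
    print([hex(off) for off in page_offsets(8210, 2)])
    print(page_aligned_badblocks(8210, 2))
```

With the parameters from the transcript, the first call gives the two in-page offsets annotated in the log, and the second gives the page-aligned sector range reported under "badblocks".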