Message-Id: <200911031942.52018.d.rye@roadtech.co.uk>
Date:	Tue, 3 Nov 2009 19:42:51 +0000
From:	"J. David Rye of Roadtech" <d.rye@...dtech.co.uk>
To:	linux-kernel@...r.kernel.org
Subject: Serial interfaces and the Multiple device RAID driver

Hi

This is a bit of a long shot, but I am hoping someone will be able to 
point me in an appropriate direction. I am having problems with serial 
heartbeats and multi-disk RAID1 arrays.

The issue shows up as corrupt messages on the serial heartbeats, and as 
overrun messages in /var/log/messages:

kernel: ttyS0: 2 input overrun(s)
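
To compare machines on the same footing, the overrun lines can be totalled 
per port. The following is only a minimal sketch: the helper name 
count_overruns and the sample log lines (timestamps, hostname) are made up 
for illustration, and the pattern assumes the message format shown above.

```python
import re
from collections import Counter

# Matches kernel lines such as "kernel: ttyS0: 2 input overrun(s)".
OVERRUN_RE = re.compile(r"kernel: (ttyS\d+): (\d+) input overrun\(s\)")


def count_overruns(lines):
    """Return the total number of reported overruns per serial port."""
    totals = Counter()
    for line in lines:
        match = OVERRUN_RE.search(line)
        if match:
            port, count = match.groups()
            totals[port] += int(count)
    return totals


# Hypothetical sample lines; only the message text follows the real format.
SAMPLE_LOG = [
    "Nov  3 19:40:01 host1 kernel: ttyS0: 2 input overrun(s)",
    "Nov  3 19:41:17 host1 kernel: ttyS0: 1 input overrun(s)",
    "Nov  3 19:41:20 host1 sshd[123]: unrelated line",
]

if __name__ == "__main__":
    print(dict(count_overruns(SAMPLE_LOG)))
```

In practice the lines would come from iterating over an open 
/var/log/messages rather than from a hard-coded list.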

I have 4 very similar P4 computers based around Supermicro P4SCT+ 
motherboards with 3.2GHz P4 processors. Each machine has two SATA 
controllers: 2 ports on the Intel 6300ESB controller and 4 on a 
Marvell MV88SX5041.

The machines are arranged as two High Availability pairs.
The machines are currently running Fedora 10, kernel 
2.6.27.37-170.2.104.fc10.i686.

If I run the serial link into a low-spec 1GHz VIA box with a single disk, 
messages can be logged without any errors, so it is not the serial cables 
or baseband modems.

I have tried dropping the baud rate on machines 3 and 4 to 9600 rather than 
19200; this does not seem to make any difference.

Corruption shows up as both missing and corrupt characters.
Slow response to serial port interrupts will result in missing characters, 
though I have not in the past noted corrupt characters as a result.
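
As a rough worked example of how slow "slow response" has to be, the time 
budget can be estimated from the baud rate. This is a sketch under 
assumptions not stated above: a 16550A-style UART with a 16-byte receive 
FIFO, a receive trigger level of 8, and 8N1 framing (10 bits per 
character). The function name is hypothetical.

```python
def overrun_deadline_ms(baud, fifo_depth=16, trigger_level=8, bits_per_char=10):
    """Approximate time (ms) from the receive interrupt to the first overrun.

    Assumed model: the interrupt is raised once trigger_level characters are
    in the FIFO. The remaining FIFO slots then fill at line rate, and an
    overrun is flagged once one further character has been fully received
    in the shift register while the FIFO is still full.
    """
    char_time_ms = bits_per_char / baud * 1000.0
    chars_until_overrun = (fifo_depth - trigger_level) + 1
    return chars_until_overrun * char_time_ms


if __name__ == "__main__":
    for baud in (19200, 9600):
        print(baud, "baud:", round(overrun_deadline_ms(baud), 3), "ms")
```

Under these assumptions the interrupt would have to go unserviced for 
about 4.7 ms at 19200 baud, or about 9.4 ms at 9600 baud, before a 
character is lost.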

Dropping the baud rate does not appear to make a difference.

Any helpful suggestions would be appreciated.


Machine 1: only 3 or 4 overruns logged per day.

	sda Marvell controller single disk
	sdb Intel controller MD RAID
	sdc Intel controller MD RAID

	md0=sdb1, sdc1
	md1=sdb2, sdc2
	md2=sdb3, sdc3

cat /proc/interrupts
           CPU0       CPU1
  0:        191          0   IO-APIC-edge      timer
  1:       6929          0   IO-APIC-edge      i8042
  3:          2          0   IO-APIC-edge
  4:          2          0   IO-APIC-edge
  6:          2          0   IO-APIC-edge      floppy
  7:          0          0   IO-APIC-edge      parport0
  8:          1          0   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 12:        678          0   IO-APIC-edge      i8042
 14:        585          0   IO-APIC-edge      ata_piix
 15:   15827158          0   IO-APIC-edge      ata_piix
 16:          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 18:  342936579          0   IO-APIC-fasteoi   eth0
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 21: 2206183303          0   IO-APIC-fasteoi   serial
 23:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
 24:   33838328          0   IO-APIC-fasteoi   eth4
 25: 1716154148          0   IO-APIC-fasteoi   eth1
 26: 2260299116          0   IO-APIC-fasteoi   eth2
 27:   25961661          0   IO-APIC-fasteoi   sata_mv, eth3
NMI:          0          0   Non-maskable interrupts
LOC:  345515518 1254623910   Local timer interrupts
RES:    1707850    4921652   Rescheduling interrupts
CAL:      80841      43449   function call interrupts
TLB:     306776     264832   TLB shootdowns
TRM:          0          0   Thermal event interrupts
SPU:          0          0   Spurious interrupts
ERR:          0
MIS:          0
 

Machine 2: no overruns logged in last week.

	sda Intel controller MD RAID
	sdb Intel controller MD RAID.

	md0=sda1, sdb1
	md1=sda2, sdb2
	md2=sda3, sdb3


cat /proc/interrupts
           CPU0       CPU1
  0:        132          0   IO-APIC-edge      timer
  1:        132          0   IO-APIC-edge      i8042
  3:          2          0   IO-APIC-edge
  4:          2          0   IO-APIC-edge
  6:          2          0   IO-APIC-edge      floppy
  7:          0          0   IO-APIC-edge      parport0
  8:          1          0   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 12:        138          0   IO-APIC-edge      i8042
 14:    6680353          0   IO-APIC-edge      ata_piix
 15:          0          0   IO-APIC-edge      ata_piix
 16:          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 18:    7901522          0   IO-APIC-fasteoi   eth0
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 21: 1597304083          0   IO-APIC-fasteoi   serial
 23:        367          0   IO-APIC-fasteoi   ehci_hcd:usb1
 25:  496342556          0   IO-APIC-fasteoi   eth1
 26:  493466471          0   IO-APIC-fasteoi   eth2
 27:    2456396          0   IO-APIC-fasteoi   sata_mv, eth3
NMI:          0          0   Non-maskable interrupts
LOC:  241045995   58231750   Local timer interrupts
RES:      89012     134076   Rescheduling interrupts
CAL:       4404       5700   function call interrupts
TLB:      11725      15424   TLB shootdowns
TRM:          0          0   Thermal event interrupts
SPU:          0          0   Spurious interrupts
ERR:          0
MIS:          0


Machine 3: This is the most interesting. With drive C (sdc) as part of the 
RAID array there are lots of errors; with the array degraded there are just 
1 or 2 per day, like machine 1.

	sda Intel controller MD RAID
	sdb Intel controller MD RAID
	sdc Marvell controller MD RAID

	md0=sda1, sdb1, sdc1
	md1=sda2, sdb2, sdc2
	md2=sda3, sdb3, sdc3

cat /proc/interrupts
           CPU0
  0:        138   IO-APIC-edge      timer
  1:        281   IO-APIC-edge      i8042
  6:          2   IO-APIC-edge      floppy
  8:          1   IO-APIC-edge      rtc0
  9:          0   IO-APIC-fasteoi   acpi
 12:        121   IO-APIC-edge      i8042
 14:          0   IO-APIC-edge      ata_piix
 15:          0   IO-APIC-edge      ata_piix
 18:   51082134   IO-APIC-fasteoi   ata_piix, eth0
 21:   38470807   IO-APIC-fasteoi   serial
 25:   50309334   IO-APIC-fasteoi   eth1
 27:     127456   IO-APIC-fasteoi   sata_mv
NMI:          0   Non-maskable interrupts
LOC:   13833559   Local timer interrupts
RES:          0   Rescheduling interrupts
CAL:          0   function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
SPU:          0   Spurious interrupts
ERR:          0
MIS:          0

Machine 4: In normal use only 3 or 4 overruns are logged per day. However, 
if the workload is transferred from Machine 3, this rises to lots.


	sda Marvell controller MD RAID
	sdb Marvell controller MD RAID
	sdc Marvell controller MD RAID

	md0=sda1, sdb1, sdc1
	md1=sda2, sdb2, sdc2
	md2=sda3, sdb3, sdc3

cat /proc/interrupts
           CPU0       CPU1
  0:        130          0   IO-APIC-edge      timer
  1:          9       8528   IO-APIC-edge      i8042
  3:          2          0   IO-APIC-edge
  4:          2          0   IO-APIC-edge
  6:          2          0   IO-APIC-edge      floppy
  7:          0          0   IO-APIC-edge      parport0
  8:          1          0   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 12:        142       3254   IO-APIC-edge      i8042
 14:          0          0   IO-APIC-edge      ata_piix
 15:          0          0   IO-APIC-edge      ata_piix
 16:          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 18:        348  285725468   IO-APIC-fasteoi   ata_piix, eth0
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 21:  114647339      45683   IO-APIC-fasteoi   serial
 23:    1591213          0   IO-APIC-fasteoi   ehci_hcd:usb1
 25:  281669867          0   IO-APIC-fasteoi   eth1
 27:    5954853          0   IO-APIC-fasteoi   sata_mv
NMI:          0          0   Non-maskable interrupts
LOC:   84506497   72450763   Local timer interrupts
RES:     247607     206442   Rescheduling interrupts
CAL:       4544       2991   function call interrupts
TLB:       7501      26683   TLB shootdowns
TRM:          0          0   Thermal event interrupts
SPU:          0          0   Spurious interrupts
ERR:          0
MIS:          0
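
The listings above are easier to compare once parsed. The sketch below 
sums each IRQ's count across CPUs and picks out lines where more than one 
driver is listed on the same IRQ. The function names are hypothetical, the 
sample is a few lines copied from the Machine 1 listing, and the parser 
only assumes the column layout shown here.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq_label: (total_count, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        # Rows such as ERR/MIS carry fewer count columns than there are CPUs.
        for field in fields[:n_cpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), description)
    return table


def shared_irqs(table):
    """Return the entries whose description lists several comma-separated drivers."""
    return {irq: entry for irq, entry in table.items() if "," in entry[1]}


# A few lines taken from the Machine 1 listing.
SAMPLE = """\
           CPU0       CPU1
 15:   15827158          0   IO-APIC-edge      ata_piix
 21: 2206183303          0   IO-APIC-fasteoi   serial
 27:   25961661          0   IO-APIC-fasteoi   sata_mv, eth3
LOC:  345515518 1254623910   Local timer interrupts
ERR:          0
"""

if __name__ == "__main__":
    table = parse_interrupts(SAMPLE)
    print(table["21"])
    print(shared_irqs(table))
```

Run against the full listings, this would show, for example, sata_mv 
sharing IRQ 27 with eth3 on machines 1 and 2, and ata_piix sharing IRQ 18 
with eth0 on machines 3 and 4.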


