Compute-Cluster

= compute0.hateotu.de =
 * Fujitsu RX300 S4

compute0.hateotu.de 10.204.3.220 Proxmox VE 6 SSH @ 22, HTTPS @ 8006


 * 48GB memory, 12* 4GB DDR2 FB
 * 2* E5420 (4+0 cores per socket, @2.5GHz)
 * SAS1
   * LSI SAS1068E-based controller, flashed to HBA mode
   * 2* 73GB 15k 2.5"
 * FC
   * FC-HBA Emulex Zephyr-X 2*4G
     * 10:00:00:00:c9:77:e8:6c -> sw1-1 p1
     * 10:00:00:00:c9:77:e8:6d -> free
   * FC-HBA QLogic QLE2460 1* 4G
     * 21:00:00:1b:32:09:b9:b6 -> sw2-1 p3


 * storage
   * OS on ZFS, mirror of 2* 73GB
   * imported LUNs from storage0 via redundant FC
   * multipathd

out-of-band management
irmc-compute0.hateotu.de 10.204.3.225 with KVM license

= compute1.hateotu.de =
 * Fujitsu RX300 S4

compute1.hateotu.de 10.204.3.221 Proxmox VE 6 SSH @ 22, HTTPS @ 8006


 * 48GB memory, 12* 4GB DDR2 FB
 * 2* E5420 (4+0 cores per socket, @2.5GHz)
 * SAS1
   * LSI SAS1068E-based controller, flashed to HBA mode
   * 2* 73GB 15k 2.5"
 * FC
   * FC-HBA Emulex Zephyr-X 2*4G
     * 10:00:00:00:c9:77:e3:90 -> sw1-1 p2
     * 10:00:00:00:c9:77:e3:91 -> free
   * FC-HBA QLogic QLE2460 1* 4G
     * 21:00:00:1b:32:09:68:b2 -> sw2-1 p1
 * storage
   * OS on ZFS, mirror of 2* 73GB
   * imported LUNs from storage0 via redundant FC
   * multipathd

out-of-band management
irmc-compute1.hateotu.de 10.204.3.226 with KVM license

= compute2.hateotu.de =
 * Fujitsu RX300 S4

compute2.hateotu.de 10.204.3.222 Proxmox VE 6 SSH @ 22, HTTPS @ 8006


 * 48GB memory, 12* 4GB DDR2 FB
 * 2* E5420 (4+0 cores per socket, @2.5GHz)
 * SAS1
   * LSI SAS1068E-based controller, flashed to HBA mode
   * 2* 73GB 15k 2.5"
 * FC
   * FC-HBA Emulex Zephyr-X 2*4G
     * 10:00:00:00:c9:7c:95:0a -> sw1-1 p3
     * 10:00:00:00:c9:7c:95:09 -> free
   * FC-HBA QLogic QLE2460 1* 4G
     * 21:00:00:1b:32:89:c4:62 -> sw2-1 p2


 * storage
   * OS on ZFS, mirror of 2* 73GB
   * imported LUNs from storage0 via redundant FC
   * multipathd

out-of-band management
irmc-compute2.hateotu.de 10.204.3.227 with KVM license

= storage0.hateotu.de =
 * Fujitsu FibreCAT SX80

limited to ~2.1TB per disk

862820-0807D5276D 500C0FF0D52C7A3C 500C0FF0DA6C263C 500C0FF0DA69103C storage0.hateotu.de 10.204.3.223 00:c0:ff:d5:27:6d HTTPS @ 443


 * master shelf + 2* disk shelf, 12 FC-HDDs (via interposer) 3.5" each
   * 18* 146GB
   * 18* 450GB


 * RAIDs
   * vdisk146_0: 8*146GB RAID6 -> ~880GB
   * vdisk146_1: 8*146GB RAID6 -> ~880GB
   * + global spares: 2* 146GB
   * vdisk450_0: 8*450GB RAID6 -> ~2700GB
   * vdisk450_1: 8*450GB RAID6 -> ~2700GB
   * + global spares: 2* 600GB // several 600GB disks were installed to replace broken 450GB disks, some also added as hot spares --carwe
   * SUM 7160 GB = 7.16 TB ≈ 6.51 TiB
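
The capacity figures in the RAID list can be re-derived with a small script. This is only a sketch: it assumes nominal disk sizes in decimal GB and the usual RAID6 rule that two disks' worth of space goes to parity. The names `raid6_usable_gb`, `total_usable_gb` and `gb_to_tib` are made up for this page. With exact numbers the total comes out at 7152 GB, slightly below the rounded sum of 7160 GB, because each 146GB set yields 876 GB rather than ~880 GB.

```python
# Sketch: usable capacity of the RAID6 vdisks listed on this page.
# Assumption: RAID6 usable capacity = (number of disks - 2) * disk size.

def raid6_usable_gb(disks: int, disk_gb: int) -> int:
    """Usable capacity in GB of one RAID6 set (two disks' worth of parity)."""
    if disks < 4:
        raise ValueError("RAID6 needs at least 4 disks")
    return (disks - 2) * disk_gb

# vdisk name -> (number of disks, nominal disk size in GB), from the list above
VDISKS = {
    "vdisk146_0": (8, 146),
    "vdisk146_1": (8, 146),
    "vdisk450_0": (8, 450),
    "vdisk450_1": (8, 450),
}

def total_usable_gb(vdisks=VDISKS) -> int:
    """Sum of the usable capacity of all vdisks, in GB."""
    return sum(raid6_usable_gb(n, size) for n, size in vdisks.values())

def gb_to_tib(gb: float) -> float:
    """Convert decimal gigabytes (10^9 bytes) to tebibytes (2^40 bytes)."""
    return gb * 1000**3 / 1024**4

if __name__ == "__main__":
    for name, (n, size) in VDISKS.items():
        print(f"{name}: {raid6_usable_gb(n, size)} GB")
    total = total_usable_gb()
    print(f"total: {total} GB = {gb_to_tib(total):.2f} TiB")
```

Global spares are left out on purpose, since they add no usable capacity until they replace a failed disk.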


 * LUNs exported via FC & merged via multipathd, then configured for shared LVM

Info 	2020-12-22 18:08:51 	58 A1893432 	Disk detected error (Channel:0 ID:19 SN:3QQ1T0H800009004Y70Q Encl:1 Slot:3) Key,Code,Qual=(01h,18h,08h) cdb:Rd 0005bd80 0080 Info:0005bde8h CmdSpc:0h FRU:0h SnsKeySpc:800096h Recovered Error

Info 	2020-01-19 18:46:55 	58	A1887662 Disk detected error (Channel:0 ID:37 SN:3QQ1WFJ500009004YAXC Encl:2 Slot:5) Key,Code,Qual=(01h,18h,01h) cdb:Rd 00046480 0080 Info:0004648eh CmdSpc:0h FRU:0h SnsKeySpc:800039h Recovered Error recovered data with error corr. & retries applied

Info 	2021-10-13 07:40:37 58	A1897520 	Disk detected error (Channel:0 ID:37 SN:3QQ1WFJ500009004YAXC Encl:2 Slot:5) Key,Code,Qual=(01h,17h,02h) cdb:Rd 0013fa80 0080 Info:0013faa4h CmdSpc:0h FRU:0h SnsKeySpc:800010h Recovered Error recovered data with positive head offset

encl 2 slot 2 (SN 3QQ12XVN00009004UMLZ) has also complained before

Info 	2021-01-06 00:15:30 58	A1893622 	Disk detected error (Channel:0 ID:19 SN:3QQ1T0H800009004Y70Q Encl:1 Slot:3) Key,Code,Qual=(03h,11h,00h) cdb:Rd 0005bd80 0080 Info:0005bde8h CmdSpc:0h FRU:81h SnsKeySpc:800096h Medium Error unrecovered read error

Warning 	2021-01-06 00:15:35 8	A1893626 	Vdisk vdisk450_1 drive down (Channel:0 ID:19 SN:3QQ1T0H800009004Y70Q Encl:1 Slot:3)

Info 	2021-01-06 00:15:36 9	A1893629 	Spare kicked in (Channel:0 ID:22, SN:3QQ1T3QC00009004Y5MH Encl:1 Slot:6) for critical Vdisk (Vdisk: vdisk450_1, SN: 00c0ffd5276d0048749a235e00000000)

Info 	2021-01-06 00:15:36 37	A1893630 	Vdisk reconstruct started (Vdisk: vdisk450_1, SN: 00c0ffd5276d0048749a235e00000000) drive: Channel:0 ID:22 SN:3QQ1T3QC00009004Y5MH Encl:1 Slot:6

Info 	2021-01-09 00:18:40 58	A1893649 	Disk detected error (Channel:0 ID:16 SN:3QQ1T1DG00009004YC2Y Encl:1 Slot:0) Key,Code,Qual=(01h,18h,01h) cdb:Rd 0130cf80 0080 Info:0130cfaah CmdSpc:0h FRU:1h SnsKeySpc:800037h Recovered Error recovered data with error corr. & retries applied

Warning 	2021-01-10 17:58:40	8	A1893656	Vdisk vdisk450_0 drive down (Channel:0 ID:32 SN:3QQ1T1SB00009003QT8Z Encl:2 Slot:0)

Critical	2021-01-10 17:58:40	314	A1893657	FRU type: drive, problem: encl 2 deviceID 32. Vendor: SEAGAT Product ID: ST3450856SS, S/N: 3QQ1T1SB00009003QT8Z rev: 0006. Related event ID: 1893656, type: 8

Warning 	2021-01-10 17:58:40	1	A1893658	Vdisk critical: vdisk450_0, SN: 00c0ffd5276d00480c9a235e00000000

Critical	2021-01-10 17:58:41	207	A1893659	Vdisk scrub job failed. Command failed (error code: 1) (number of errors found: 0) (vdisk: vdisk450_0, SN: 00c0ffd5276d00480c9a235e00000000)

Warning 	2021-01-10 18:02:52	18	A1893665	Vdisk reconstruct failed. Command failed (error code 1). (Vdisk: vdisk450_0, SN: 00c0ffd5276d00480c9a235e00000000)

Warning 	2021-01-10 18:02:53	78	A1893666	Spare drive unusable (too small) for Vdisk: vdisk450_0, SN: 00c0ffd5276d00480c9a235e00000000

[...]

Info 	2021-01-10 18:02:52 59	A1893664 	Disk channel error (Channel:0 ID:34 SN:3QQ12XVN00009004UMLZ Encl:2 Slot:2): I/O Timeout cdb:10 additional

Warning 	2021-01-10 18:02:52 18	A1893665 	Vdisk reconstruct failed. Command failed (error code 1). (Vdisk: vdisk450_0, SN: 00c0ffd5276d00480c9a235e00000000)

Warning 	2021-04-05 14:15:56		58		A1895145		Disk detected error (Channel:0 ID:16 SN:3QQ1T1DG00009004YC2Y Encl:1 Slot:0) Key,Code,Qual=(04h,15h,01h) cdb:Rd 000000e2 0004 Info:000000e2h CmdSpc:0h FRU:83h SnsKeySpc:802049h Hardware mechanical positioning error

Info 	2021-07-07 03:10:43 58	A1896752 	Disk detected error (Channel:0 ID:7 SN:3LN4L5LV00009834Q4PS Encl:0 Slot:7) Key,Code,Qual=(01h,17h,01h) cdb:Rd 00591d00 0080 Info:00591d68h CmdSpc:0h FRU:0h SnsKeySpc:800031h Recovered Error recovered data with retries

Info 	2021-07-04 16:13:11 59	A1896701 	Disk channel error (Channel:0 ID:38 SN:3QQ037RW00009004TVCN Encl:2 Slot:6): I/O Timeout cdb:Rd 135c5b00 0080

Info 	2021-07-08 01:10:57 58	A1896809 	Disk detected error (Channel:0 ID:40 SN:3QQ1DZ8700009004VXWF Encl:2 Slot:8) Key,Code,Qual=(01h,17h,01h) cdb:Rd 00017780 0080 Info:000177a1h CmdSpc:0h FRU:0h SnsKeySpc:800002h Recovered Error recovered data with retries

Info 	2021-09-15 13:14:07 58	A1897323 	Disk detected error (Channel:0 ID:39 SN:3QQ1T1CC00009004Y5DH Encl:2 Slot:7) Key,Code,Qual=(01h,17h,01h) cdb:Rd 0002d200 0080 Info:0002d275h CmdSpc:0h FRU:0h SnsKeySpc:800003h Recovered Error recovered data with retries

Disk drive (Channel:0 ID:21 SN: Encl:1 Slot:5) reported a SMART event sense key:Recovered Error(01h) ASC:5Dh ASCQ:00h failure prediction threshold exceeded Info:00000000

Disk drive (Channel:0 ID:39 SN: Encl:2 Slot:7) reported a SMART event sense key:Recovered Error(01h) ASC:5Dh ASCQ:00h failure prediction threshold exceeded Info:00000000

Warning 	2022-03-09 12:24:53		58		A1898846		Disk detected error (Channel:0 ID:39 SN:3QQ1T1CC00009004Y5DH Encl:2 Slot:7) Key,Code,Qual=(04h,32h,00h) cdb:Rd 0002a880 0080 Info:0002a89ch CmdSpc:0h FRU:9dh SnsKeySpc:800096h Hardware no defect spare location available

Info 	2022-03-27 13:29:51 	58	A1899000 	Disk detected error (Channel:0 ID:34 SN:EA09PB80A1TS Encl:2 Slot:2) Key,Code,Qual=(01h,17h,EFh) cdb:Rd 2ccf8100 0080 Info:2ccf817fh CmdSpc:0h FRU:0h SnsKeySpc:800000h Recovered Error

Info 	2022-04-03 18:01:19 	58	A1899051 	Disk detected error (Channel:0 ID:34 SN:EA09PB80A1TS Encl:2 Slot:2) Key,Code,Qual=(01h,17h,02h) cdb:Rd 2caee780 0080 Info:2caee7adh CmdSpc:0h FRU:0h SnsKeySpc:800004h Recovered Error recovered data with positive head offset

Info 	2022-04-03 18:01:17 	58	A1899050 	Disk detected error (Channel:0 ID:34 SN:EA09PB80A1TS Encl:2 Slot:2) Key,Code,Qual=(01h,18h,00h) cdb:Rd 2caee180 0080 Info:2caee1a0h CmdSpc:0h FRU:0h SnsKeySpc:800001h Recovered Error recovered data with error correction applied

Info 	2022-04-04 20:24:25 	58	A1899060 	Disk detected error (Channel:0 ID:34 SN:EA09PB80A1TS Encl:2 Slot:2) Key,Code,Qual=(01h,17h,03h) cdb:Rd 136d3000 0080 Info:136d3014h CmdSpc:0h FRU:0h SnsKeySpc:800004h Recovered Error recovered data with negative head offset

Info 	2022-04-04 20:24:22 	58	A1899059 	Disk detected error (Channel:0 ID:34 SN:EA09PB80A1TS Encl:2 Slot:2) Key,Code,Qual=(01h,17h,01h) cdb:Rd 136d0d00 0080 Info:136d0d3eh CmdSpc:0h FRU:0h SnsKeySpc:800002h Recovered Error recovered data with retries

Info 	2022-04-04 23:00:13 	58	A1899063 	Disk detected error (Channel:0 ID:34 SN:EA09PB80A1TS Encl:2 Slot:2) Key,Code,Qual=(01h,18h,A0h) cdb:Rd 2d432b00 0080 Info:2d432b27h CmdSpc:0h FRU:0h SnsKeySpc:80000ch Recovered Error

Info 	2022-04-04 22:51:22 	58	A1899062 	Disk detected error (Channel:0 ID:34 SN:EA09PB80A1TS Encl:2 Slot:2) Key,Code,Qual=(01h,17h,03h) cdb:Rd 2c4d8e00 0080 Info:2c4d8e06h CmdSpc:0h FRU:0h SnsKeySpc:800008h Recovered Error recovered data with negative head offset

Info 	2022-04-19 13:05:17 	58	A1899166 	Disk detected error (Channel:0 ID:7 SN:3LN4L5LV00009834Q4PS Encl:0 Slot:7) Key,Code,Qual=(01h,17h,01h) cdb:Rd 0bf9d180 0080 Info:0bf9d18eh CmdSpc:0h FRU:0h SnsKeySpc:800032h Recovered Error recovered data with retries

Info 	2022-04-24 23:15:51 	58	A1899212 	Disk detected error (Channel:0 ID:7 SN:3LN4L5LV00009834Q4PS Encl:0 Slot:7) Key,Code,Qual=(01h,17h,01h) cdb:Rd 0bf9d180 0080 Info:0bf9d18eh CmdSpc:0h FRU:0h SnsKeySpc:800032h Recovered Error recovered data with retries

Info 	2022-05-08 05:19:24 	58	A1899309 	Disk detected error (Channel:0 ID:40 SN:3QQ1DZ8700009004VXWF Encl:2 Slot:8) Key,Code,Qual=(01h,18h,01h) cdb:Rd 00029b80 0080 Info:00029bf2h CmdSpc:0h FRU:1h SnsKeySpc:80000ah Recovered Error recovered data with error corr. & retries applied

Info 	2022-05-08 05:19:12 	58	A1899308 	Disk detected error (Channel:0 ID:40 SN:3QQ1DZ8700009004VXWF Encl:2 Slot:8) Key,Code,Qual=(01h,17h,02h) cdb:Rd 00018180 0080 Info:000181a6h CmdSpc:0h FRU:0h SnsKeySpc:800010h Recovered Error recovered data with positive head offset

Warning 	2022-06-11 19:55:53 	55	A1899596 	Disk drive (Channel:0 ID:34 SN:EA09PB80A1TS Encl:2 Slot:2) reported a SMART event sense key:Recovered Error(01h) ASC:5Dh ASCQ:00h failure prediction threshold exceeded Info:00000000

Critical 	2022-06-11 18:50:52 	314	A1899595 	FRU type: drive, problem: encl 2 deviceID 34. Vendor: IBM-ES Product ID: MBF2600RC, S/N: EA09PB80A1TS rev: SB2F. Related event ID: 1899594, type: 55
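
To see which disks show up most often in the event log above, the pasted lines can be parsed and tallied per enclosure/slot. The following is a rough sketch, not a tool that exists on the cluster: the field layout (severity, timestamp, event code, event ID, message) is inferred only from the lines pasted on this page, and the function names `parse_events` and `errors_per_disk` are invented here. The three sense key names are the standard SCSI meanings of 01h, 03h and 04h.

```python
import re
from collections import Counter

# Event header as it appears in the pasted log: severity, timestamp,
# numeric event code, event ID (A + digits), all whitespace-separated.
EVENT_RE = re.compile(
    r"(?P<severity>Info|Warning|Critical)\s+"
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<code>\d+)\s+"
    r"(?P<event_id>A\d+)\s+"
)
# Disk location inside the message, e.g. "SN:3QQ... Encl:1 Slot:3".
DISK_RE = re.compile(r"SN:(?P<sn>\w+)\s+Encl:(?P<encl>\d+)\s+Slot:(?P<slot>\d+)")
# SCSI sense data, e.g. "Key,Code,Qual=(01h,18h,08h)".
SENSE_RE = re.compile(
    r"Key,Code,Qual=\((?P<key>[0-9A-Fa-f]{2})h,"
    r"(?P<asc>[0-9A-Fa-f]{2})h,(?P<ascq>[0-9A-Fa-f]{2})h\)"
)
# Only the sense keys that occur in the log on this page.
SENSE_KEYS = {0x01: "Recovered Error", 0x03: "Medium Error", 0x04: "Hardware Error"}


def parse_events(text):
    """Split pasted log text into event dicts, even if several share a line."""
    matches = list(EVENT_RE.finditer(text))
    events = []
    for i, m in enumerate(matches):
        # The message runs until the next event header (or the end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        message = text[m.end():end].strip()
        event = m.groupdict()
        event["message"] = message
        disk = DISK_RE.search(message)
        event["disk"] = (
            (int(disk["encl"]), int(disk["slot"]), disk["sn"]) if disk else None
        )
        sense = SENSE_RE.search(message)
        if sense:
            key = int(sense["key"], 16)
            event["sense_key"] = SENSE_KEYS.get(key, f"sense key {key:02X}h")
        else:
            event["sense_key"] = None
        events.append(event)
    return events


def errors_per_disk(events):
    """Count events per (enclosure, slot, serial number)."""
    return Counter(e["disk"] for e in events if e["disk"] is not None)
```

Usage would be along the lines of `errors_per_disk(parse_events(open("events.txt").read())).most_common(5)`, where `events.txt` is a hypothetical file holding the pasted log. Lines without a severity and timestamp header, such as the two SMART events with an empty SN, are skipped by this sketch.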

= fcsw01.hateotu.de =
fcsw01.hateotu.de 10.204.3.230 00:27:f8:81:ee:a6
 * Brocade 300, 8*8G FC licensed

= fcsw02.hateotu.de =
fcsw02.hateotu.de 10.204.3.229 00:27:f8:81:ff:ae
 * Brocade 300, 8*8G FC licensed

Installed the two missing iRMC KVM licenses, which were sponsored by Fujitsu. --Carwe (talk) 18:47, 20 December 2020 (CET)