Compute-Cluster: Difference between revisions

From HateotU

Revision as of 02:16, 9 January 2021


compute0.hateotu.de

10.204.3.220

Proxmox VE 6
SSH @ 22, HTTPS @ 8006

  • Fujitsu RX300 S4
  • 48GB memory, 12* 4GB DDR2 FB
  • 2* E5420 (4+0 cores per socket, @2.5GHz)
  • SAS1
    • LSI SAS1068E-based controller, flashed to HBA mode
    • 2* 73GB 15k 2.5"
  • FC
    • FC-HBA Emulex Zephyr-X 2*4G
      • 10:00:00:00:c9:77:e8:6c -> sw1-1 p1
      • 10:00:00:00:c9:77:e8:6d -> free
    • FC-HBA QLogic QLE2460 1* 4G
      • 21:00:00:1b:32:09:b9:b6 -> sw2-1 p3
  • storage
    • OS on ZFS, mirror of 2* 73GB
    • imported LUNs from storage0 via redundant FC
      • multipathd



out-of-band management

irmc-compute0.hateotu.de
10.204.3.225

with KVM license


compute1.hateotu.de

10.204.3.221

Proxmox VE 6
SSH @ 22, HTTPS @ 8006

  • Fujitsu RX300 S4
  • 48GB memory, 12* 4GB DDR2 FB
  • 2* E5420 (4+0 cores per socket, @2.5GHz)
  • SAS1
    • LSI SAS1068E-based controller, flashed to HBA mode
    • 2* 73GB 15k 2.5"
  • FC
    • FC-HBA Emulex Zephyr-X 2*4G
      • 10:00:00:00:c9:77:e3:90 -> sw1-1 p2
      • 10:00:00:00:c9:77:e3:91 -> free
    • FC-HBA QLogic QLE2460 1* 4G
      • 21:00:00:1b:32:09:68:b2 -> sw2-1 p1
  • storage
    • OS on ZFS, mirror of 2* 73GB
    • imported LUNs from storage0 via redundant FC
      • multipathd


out-of-band management

irmc-compute1.hateotu.de
10.204.3.226

with KVM license



compute2.hateotu.de

10.204.3.222

Proxmox VE 6
SSH @ 22, HTTPS @ 8006

  • Fujitsu RX300 S4
  • 48GB memory, 12* 4GB DDR2 FB
  • 2* E5420 (4+0 cores per socket, @2.5GHz)
  • SAS1
    • LSI SAS1068E-based controller, flashed to HBA mode
    • 2* 73GB 15k 2.5"
  • FC
    • FC-HBA Emulex Zephyr-X 2*4G
      • 10:00:00:00:c9:7c:95:0a -> sw1-1 p3
      • 10:00:00:00:c9:7c:95:09 -> free
    • FC-HBA QLogic QLE2460 1* 4G
      • 21:00:00:1b:32:89:c4:62 -> sw2-1 p2


  • storage
    • OS on ZFS, mirror of 2* 73GB
    • imported LUNs from storage0 via redundant FC
      • multipathd


out-of-band management

irmc-compute2.hateotu.de
10.204.3.227

with KVM license
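
The FC cabling listed for compute0 to compute2 can be sanity-checked with a short script. This is only a sketch: it copies the connected port addresses and switch ports from the lists above, leaves out the ports marked "free", and checks that every node reaches two different switches and that no switch port is assigned twice. The names `FC_LINKS` and `check_redundancy` are made up for this example.

```python
from collections import Counter

# host -> list of (HBA port address, switch, switch port), copied from the
# lists above; ports marked "free" are left out.
FC_LINKS = {
    "compute0": [("10:00:00:00:c9:77:e8:6c", "sw1-1", 1),
                 ("21:00:00:1b:32:09:b9:b6", "sw2-1", 3)],
    "compute1": [("10:00:00:00:c9:77:e3:90", "sw1-1", 2),
                 ("21:00:00:1b:32:09:68:b2", "sw2-1", 1)],
    "compute2": [("10:00:00:00:c9:7c:95:0a", "sw1-1", 3),
                 ("21:00:00:1b:32:89:c4:62", "sw2-1", 2)],
}


def check_redundancy(links):
    """Return a list of human-readable problems (empty list = all good)."""
    problems = []
    # Every host should be cabled to at least two different switches.
    for host, ports in links.items():
        switches = {switch for _, switch, _ in ports}
        if len(switches) < 2:
            problems.append(f"{host}: only connected to {sorted(switches)}")
    # No switch port may be claimed by more than one HBA port.
    used = Counter((switch, port)
                   for ports in links.values()
                   for _, switch, port in ports)
    for (switch, port), count in used.items():
        if count > 1:
            problems.append(f"{switch} p{port}: assigned {count} times")
    return problems


if __name__ == "__main__":
    issues = check_redundancy(FC_LINKS)
    print("OK" if not issues else "\n".join(issues))
```

With the assignments as documented, the check reports no problems.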



storage0.hateotu.de

862820-0807D5276D
500C0FF0D52C7A3C
500C0FF0DA6C263C
500C0FF0DA69103C

10.204.3.223
00:c0:ff:d5:27:6d
HTTPS @ 443

  • Fujitsu FibreCAT SX80
  • master shelf + 2* disk shelves, each with 12 3.5" FC-HDDs (via interposer)
    • 18* 146GB
    • 18* 450GB
  • RAIDs
    • vdisk146_0: 8*146GB RAID6 -> ~880GB
    • vdisk146_1: 8*146GB RAID6 -> ~880GB
    • + global spares: 2* 146GB
    • vdisk450_0: 8*450GB RAID6 -> ~2700GB
    • vdisk450_1: 8*450GB RAID6 -> ~2700GB
    • + global spares: 2* 450GB
    • SUM: ~7160 GB = 7.16 TB ≈ 6.51 TiB
  • LUNs exported via FC & merged via multipathd, then configured for shared LVM
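
The capacity figures above can be reproduced with a few lines of Python. This is a rough sketch using nominal (decimal) disk sizes and the usual RAID6 rule that two disks' worth of space goes to parity. It ignores any controller or metadata overhead, so the result (7152 GB) differs slightly from the rounded ~880 GB figures in the list. The function and constant names are chosen for this example only.

```python
GB = 10**9    # decimal gigabyte, as used for nominal disk sizes
TIB = 2**40   # binary tebibyte


def raid6_usable_gb(disks: int, disk_gb: int) -> int:
    """Nominal RAID6 capacity: two disks' worth of space is used for parity."""
    if disks < 4:
        raise ValueError("RAID6 needs at least 4 disks")
    return (disks - 2) * disk_gb


# (name, member disks, nominal disk size in GB), as listed above
VDISKS = [
    ("vdisk146_0", 8, 146),
    ("vdisk146_1", 8, 146),
    ("vdisk450_0", 8, 450),
    ("vdisk450_1", 8, 450),
]


def total_usable_gb(vdisks=VDISKS) -> int:
    """Sum of the nominal usable capacity of all vdisks, in GB."""
    return sum(raid6_usable_gb(n, size) for _, n, size in vdisks)


def gb_to_tib(gb: float) -> float:
    """Convert decimal gigabytes to binary tebibytes."""
    return gb * GB / TIB


if __name__ == "__main__":
    for name, n, size in VDISKS:
        print(f"{name}: {raid6_usable_gb(n, size)} GB")
    total = total_usable_gb()
    print(f"total: {total} GB = {total / 1000:.2f} TB = {gb_to_tib(total):.2f} TiB")
```

The two global spares per disk size are not counted, since they hold no data until a rebuild.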


Info 2020-12-22 18:08:51 58 A1893432 Disk detected error (Channel:0 ID:19 SN:3QQ1T0H800009004Y70Q Encl:1 Slot:3) Key,Code,Qual=(01h,18h,08h) cdb:Rd 0005bd80 0080 Info:0005bde8h CmdSpc:0h FRU:0h SnsKeySpc:800096h Recovered Error

Info 2020-01-19 18:46:55 58 A1887662 Disk detected error (Channel:0 ID:37 SN:3QQ1WFJ500009004YAXC Encl:2 Slot:5) Key,Code,Qual=(01h,18h,01h) cdb:Rd 00046480 0080 Info:0004648eh CmdSpc:0h FRU:0h SnsKeySpc:800039h Recovered Error recovered data with error corr. & retries applied


   Note: the disk in enclosure 2, slot 2 (SN 3QQ12XVN00009004UMLZ) has also reported errors before.


Info 2021-01-06 00:15:30 58 A1893622 Disk detected error (Channel:0 ID:19 SN:3QQ1T0H800009004Y70Q Encl:1 Slot:3) Key,Code,Qual=(03h,11h,00h) cdb:Rd 0005bd80 0080 Info:0005bde8h CmdSpc:0h FRU:81h SnsKeySpc:800096h Medium Error unrecovered read error
Warning 2021-01-06 00:15:35 8 A1893626 Vdisk vdisk450_1 drive down (Channel:0 ID:19 SN:3QQ1T0H800009004Y70Q Encl:1 Slot:3)
Info 2021-01-06 00:15:36 9 A1893629 Spare kicked in (Channel:0 ID:22, SN:3QQ1T3QC00009004Y5MH Encl:1 Slot:6) for critical Vdisk (Vdisk: vdisk450_1, SN: 00c0ffd5276d0048749a235e00000000)
Info 2021-01-06 00:15:36 37 A1893630 Vdisk reconstruct started (Vdisk: vdisk450_1, SN: 00c0ffd5276d0048749a235e00000000) drive: Channel:0 ID:22 SN:3QQ1T3QC00009004Y5MH Encl:1 Slot:6

Info 2021-01-09 00:18:40 58 A1893649 Disk detected error (Channel:0 ID:16 SN:3QQ1T1DG00009004YC2Y Encl:1 Slot:0) Key,Code,Qual=(01h,18h,01h) cdb:Rd 0130cf80 0080 Info:0130cfaah CmdSpc:0h FRU:1h SnsKeySpc:800037h Recovered Error recovered data with error corr. & retries applied
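
To keep track of which drives are misbehaving, the event log entries above can be parsed with a small script. The following is a minimal sketch, assuming each event has been joined onto a single line with the fields severity, timestamp, event code, event ID and message in that order. The regular expressions are derived only from the entries shown here, not from any FibreCAT documentation, and the sense-key table covers just the two keys that occur in this log. All names (`parse_event`, `Event`, ...) are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# SCSI sense keys that occur in the log above (not a complete table).
SENSE_KEYS = {0x01: "Recovered Error", 0x03: "Medium Error"}

EVENT_RE = re.compile(
    r"^(?P<severity>Info|Warning|Critical) "
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<code>\d+) (?P<event_id>A\d+) (?P<message>.*)$"
)
DISK_RE = re.compile(r"SN:(?P<sn>\w+) Encl:(?P<encl>\d+) Slot:(?P<slot>\d+)")
SENSE_RE = re.compile(r"Key,Code,Qual=\(([0-9a-fA-F]{2})h,")


class Event(NamedTuple):
    severity: str
    time: datetime
    code: int
    event_id: str
    message: str
    serial: Optional[str]
    enclosure: Optional[int]
    slot: Optional[int]
    sense_key: Optional[str]


def parse_event(line: str) -> Optional[Event]:
    """Parse one log line; return None for free-text notes or blank lines."""
    m = EVENT_RE.match(" ".join(line.split()))  # collapse tabs and space runs
    if m is None:
        return None
    msg = m["message"]
    disk = DISK_RE.search(msg)
    sense = SENSE_RE.search(msg)
    key = int(sense.group(1), 16) if sense else None
    return Event(
        severity=m["severity"],
        time=datetime.strptime(m["time"], "%Y-%m-%d %H:%M:%S"),
        code=int(m["code"]),
        event_id=m["event_id"],
        message=msg,
        serial=disk["sn"] if disk else None,
        enclosure=int(disk["encl"]) if disk else None,
        slot=int(disk["slot"]) if disk else None,
        sense_key=SENSE_KEYS.get(
            key, None if key is None else f"sense key {key:02x}h"),
    )


if __name__ == "__main__":
    sample = ("Warning 2021-01-06 00:15:35 8 A1893626 Vdisk vdisk450_1 drive "
              "down (Channel:0 ID:19 SN:3QQ1T0H800009004Y70Q Encl:1 Slot:3)")
    print(parse_event(sample))
```

Grouping the parsed events by `serial` then shows at a glance which drives have logged repeated errors.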

fcsw01.hateotu.de

  • Brocade 300, 8*8G FC licensed
fcsw01.hateotu.de
10.204.3.230
00:27:f8:81:ee:a6


fcsw02.hateotu.de

  • Brocade 300, 8*8G FC licensed
fcsw02.hateotu.de
10.204.3.229
00:27:f8:81:ff:ae


Installed the two missing iRMC KVM licenses, which were sponsored by Fujitsu. --Carwe (talk) 18:47, 20 December 2020 (CET)