Changes between Version 3 and Version 4 of Internal/NodeFailureModes


Timestamp: Jan 30, 2009, 9:39:15 PM
Author: ssugrim

  • Internal/NodeFailureModes

    v3  v4
     1        == List of common Failure modes ==
         1  {{{
         2  #!rst
     2   3
     3        || Failure Mode || # of occurrences || List of Occurrences || Solution - Notes ||
         4  List of Common Node Failures
         5  ============================
     4   6
     5        || Disk Failure - Kernel throws write errors || Many || Fix Me || Change Disk ||
     6   7
     7        || Disk Failure - Not detected on POST || Many || Fix Me || Change Disk ||
     8
     9        || Pxe Halt - Locks up during execution of PXE code || Many ||[1,5] || Change Node ? ||
    10
    11        || First Power on Halt || 1 || [3,8] || Locks during the first attempt, posts after reset, change node? ||
    12
    13        || Dead Node ID box top LED (the blinking one)||1||[1,5]||Power Cycle fixed it, probably a rabbit issue. ||
         8  +------------------------------------------------+-----------+--------+------------------------------------+
         9  | Failure Mode                                   |# of       | List of|Solution - Notes                    |
        10  |                                                |occurrences|        |                                    |
        11  +================================================+===========+========+====================================+
        12  |Pxe Halt - Locks up during execution of PXE code|1          |[1,5]   |- Multiple resets (more than 1)     |
        13  |                                                |           |        |  may be required                   |
        14  |                                                |           |        |- Might require node Change         |
        15  +------------------------------------------------+-----------+--------+------------------------------------+
        16  |Dead Node ID box top LED (the blinking one)     |1          |[1,5]   |- Power cycle Fixed it              |
        17  |                                                |           |        |- Rabbit Issue?                     |
        18  +------------------------------------------------+-----------+--------+------------------------------------+
        19  |First Power on Halt                             |1          |[3,8]   |- Locks during the first attempt    |
        20  |                                                |           |        |- Post after reset                  |
        21  |                                                |           |        |- Change node?                      |
        22  +------------------------------------------------+-----------+--------+------------------------------------+
        23  |Disk Failure - Not detected on POST             |Many       |Fix Me  |- Change disk                       |
        24  +------------------------------------------------+-----------+--------+------------------------------------+
        25  |Disk Failure - Kernel throws write errors       |Many       |Fix Me  |- Change disk                       |
        26  +------------------------------------------------+-----------+--------+------------------------------------+
        27  }}}
        28