Duration: 11:01 PM – 11:08 PM (Europe/Berlin)

Summary:

At 11:01 PM, one of our uninterruptible power supplies (UPS) failed, preventing a secondary network switch from engaging correctly. This caused a complete network outage to the validator’s endpoint. While the node software remained running, it lost synchronization and could not participate in consensus during the outage.

Impact:

  • Validator endpoint unreachable for ~7 minutes
  • Node unsynchronized and unable to join consensus until network restored
  • No data loss; all blocks were processed once connectivity returned
  • No reduction in pool rewards for delegators

Root Cause:

A UPS unit did not transfer power as intended, causing the attached switch to lose connectivity and isolate the validator from the network.

Resolution:

  • Faulty UPS was bypassed, and power restored to the switch by 11:08 PM
  • Network connectivity resumed, and the node automatically synced the missed ledger state
  • Consensus participation recommenced immediately after sync completion
  • Full infrastructure health check verified no residual issues

Next Steps / Preventative Measures:

  • Replace the failed UPS with a certified spare unit.
  • Enhance UPS failover and switch-over testing procedures.
  • Schedule quarterly failover drills to validate network redundancy.

We apologize for the interruption and remain dedicated to ensuring the highest level of reliability for our validator services.

Next Post