CR1 DataCenter San Pedro colocation room 4 downtime

Incident Report for RackNation

Resolved

The layer 2 issues within the CR1 DataCenter that caused this major event have been fully resolved, and we consider this event closed.

Posted Oct 05, 2023 - 00:10 CST

Monitoring

A fix has been implemented and we are monitoring the results.

Posted Oct 04, 2023 - 14:35 CST

Update

We are continuing to work on a fix for this issue.

Posted Oct 04, 2023 - 11:43 CST

Identified

Dear clients, we are aware that clients using our VPS compute platform in CR1 San Pedro are still affected by the ongoing situation. We are working to resolve this as soon as possible.

Posted Oct 04, 2023 - 11:40 CST

Monitoring

Dear Valued Clients,

We want to take a moment to address the recent downtime that occurred on October 4 at 9:51 am in our CR1 datacenter in San Pedro. We understand the impact this may have had on your operations and sincerely apologize for any inconvenience caused.

During this incident, we experienced a total downtime of 1 hour and 4 minutes. The root cause of the issue was a disruption in our layer 2 network between server room 4 and the rest of the datacenter. This disruption set off a domino effect, resulting in the downtime experienced across our services.
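For readers who want to put the figures above in context, the arithmetic can be sketched as follows. This is a minimal illustration built only on the two values stated in this report (start time and total downtime). The restoration time and the monthly percentage are derived values, not figures published in this report, and later updates indicate that some VPS clients remained affected beyond this window.

```python
from datetime import datetime, timedelta

# Figures stated in the report.
outage_start = datetime(2023, 10, 4, 9, 51)    # October 4, 9:51 am
outage_length = timedelta(hours=1, minutes=4)  # total downtime

# Derived value: implied by the two figures above, not stated in the report.
implied_restoration = outage_start + outage_length

# Share of October 2023 (31 days) not covered by this outage.
# Illustrative only; this is not an SLA calculation.
month_length = timedelta(days=31)
availability_pct = 100 * (1 - outage_length / month_length)

print(implied_restoration.strftime("%H:%M"))
print(f"{availability_pct:.3f}%")
```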

We immediately initiated our response protocols to resolve the situation promptly. Our team has successfully recreated an LACP (Link Aggregation Control Protocol) bond between colocation area 4 and our core network. This solution was implemented to restore connectivity and mitigate the downtime.

We want to assure you that our team is actively monitoring the situation to ensure the stability and reliability of the recreated LACP bond. By closely monitoring the network, we can identify any potential underlying causes or recurring issues that may have contributed to the initial downtime. This proactive approach allows us to take the necessary measures to prevent similar incidents from happening in the future and maintain a seamless experience for our esteemed clients.
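As a rough illustration of what monitoring an LACP bond can involve, the sketch below checks a bond's member links for two common warning signs: a member that is down, and active members that negotiated into different aggregators. Everything in it is an assumption for illustration: this report does not describe the equipment, port names, number of links, or tooling in use, and the `MemberLink` type, the `bond_problems` function, and the `min_active` threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemberLink:
    name: str           # hypothetical port name
    link_up: bool       # physical link state
    aggregator_id: int  # aggregator the port joined through LACP negotiation


def bond_problems(members, min_active=2):
    """Return a list of problems found; an empty list means the bond looks healthy."""
    problems = []
    active = [m for m in members if m.link_up]
    for m in members:
        if not m.link_up:
            problems.append(f"{m.name}: link down")
    if len(active) < min_active:
        problems.append(
            f"only {len(active)} active member(s), expected at least {min_active}"
        )
    # All active members of one bond should share a single aggregator.
    if len({m.aggregator_id for m in active}) > 1:
        problems.append("active members are split across aggregators")
    return problems


# Hypothetical sample states, not data from this incident.
healthy = [MemberLink("port-a", True, 1), MemberLink("port-b", True, 1)]
degraded = [MemberLink("port-a", True, 1), MemberLink("port-b", False, 2)]

print(bond_problems(healthy))
print(bond_problems(degraded))
```

Returning a list of findings rather than a single up/down flag keeps a partially degraded bond, which may still pass traffic, visible to whoever is watching it.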

We understand the critical importance of uninterrupted service for your business operations and want to assure you that resolving this issue and providing a dependable network infrastructure remain our top priorities. Our dedicated team is working diligently to prevent any further disruptions and ensure a smooth and reliable service moving forward.

We apologize once again for any inconvenience caused by this downtime. If you have any questions, concerns, or require additional information, please do not hesitate to reach out to our support team. They are available around the clock to assist you and provide any necessary updates.

Thank you for your understanding and patience. We greatly value your partnership and trust in us.

Best regards,

Infrastructure Team

Posted Oct 04, 2023 - 11:19 CST

This incident affected: VPS Computes en CR1.