One Identity Starling - Safeguard remote access (All Regions)

Incident Report for One Identity Starling

Postmortem

What Occurred?

Safeguard Remote Access (SRA) sessions were closing unexpectedly in the EU region.

What went wrong and why?

SRA RDP/SSH sessions in the EU were disconnecting when Azure Kubernetes nodes began running out of WebSockets. In certain scenarios, unused sockets were not being closed correctly, which exhausted the available WebSockets and caused the subsequent disconnections.
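
As an illustration only, and not a description of the actual SRA code, the failure mode can be sketched as a fixed-capacity pool in which sockets that are never closed keep occupying slots until new sessions are refused. The names below (`SocketPool`, `PoolExhausted`, `idle_timeout`, the `sra.sockets` logger) are hypothetical, and the idle-timeout cleanup is just one common way to reclaim unused sockets.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass, field

log = logging.getLogger("sra.sockets")  # hypothetical logger name


class PoolExhausted(RuntimeError):
    """Raised when every socket slot is occupied."""


@dataclass
class SocketPool:
    capacity: int        # hypothetical per-node socket limit
    idle_timeout: float  # seconds of inactivity before a socket is reclaimed
    last_seen: dict[str, float] = field(default_factory=dict)

    def open(self, session_id: str, now: float) -> None:
        # Try to reclaim idle sockets before refusing a new session.
        if len(self.last_seen) >= self.capacity:
            self.reap_idle(now)
        if len(self.last_seen) >= self.capacity:
            log.error("socket pool exhausted (%d in use)", len(self.last_seen))
            raise PoolExhausted(session_id)
        self.last_seen[session_id] = now

    def touch(self, session_id: str, now: float) -> None:
        # Record activity so that live sessions are not reclaimed.
        if session_id in self.last_seen:
            self.last_seen[session_id] = now

    def close(self, session_id: str) -> None:
        # Explicit close; skipping this call is what leaks a slot.
        self.last_seen.pop(session_id, None)

    def reap_idle(self, now: float) -> list[str]:
        # Remove sockets that have been idle for longer than idle_timeout.
        stale = [sid for sid, seen in self.last_seen.items()
                 if now - seen > self.idle_timeout]
        for sid in stale:
            log.warning("reclaiming idle socket %s", sid)
            del self.last_seen[sid]
        return stale
```

In this sketch, a session whose socket is never passed to `close` holds its slot until `reap_idle` runs. Without that cleanup, the pool fills up and `open` fails for every new session, which mirrors the exhaustion described above.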

How are we making incidents like this less likely or less impactful?

We are implementing a more robust WebSocket management solution and increasing SRA's logging so that this condition can be identified early and prevented from recurring. These improvements will increase reliability, reduce the likelihood of future occurrences, and improve troubleshooting across the platform.

Posted Mar 05, 2025 - 12:18 PST

Resolved

The aforementioned fix has been deployed and the issue is now resolved.
Posted Feb 27, 2025 - 14:15 PST

Update

All services remain fully functional.
The formal fix scheduled for deployment will require an additional 48 hours of quality control testing.
Status will be updated upon completion.
Posted Feb 25, 2025 - 06:06 PST

Update

We continue to monitor the overall performance of our services, which are fully functional.

We will be deploying a formal fix within the next 7 days, which is expected to fully resolve the issue and prevent future occurrences.
Posted Feb 18, 2025 - 04:16 PST

Monitoring

Mitigations are in place and sessions are working normally.

We are currently monitoring all sessions in the EU and US to ensure continued operation, and we will follow up further when possible.
Posted Feb 14, 2025 - 04:24 PST

Update

We are still investigating this issue.

A further update will be provided on Friday 2025-02-14
Posted Feb 13, 2025 - 08:10 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Feb 13, 2025 - 06:35 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Feb 11, 2025 - 01:43 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Feb 10, 2025 - 01:38 PST

Update

We are still investigating this issue.

A further update will be provided on Monday 2025-02-10
Posted Feb 07, 2025 - 09:05 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Feb 07, 2025 - 02:21 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Feb 05, 2025 - 03:13 PST

Update

We are still investigating this issue.

A further update will be provided on Wednesday 2025-02-05
Posted Feb 04, 2025 - 10:34 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Feb 04, 2025 - 00:56 PST

Update

We are still investigating this issue.

A further update will be provided on Tuesday 2025-02-04
Posted Feb 03, 2025 - 08:34 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Feb 03, 2025 - 02:41 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Jan 31, 2025 - 04:23 PST

Update

We are still investigating this issue.

A further update will be provided on Friday 2025-01-31
Posted Jan 30, 2025 - 05:59 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Jan 29, 2025 - 01:27 PST

Update

We are still investigating this issue.

A further update will be provided at 09:00 UTC/01:00 PST on Wednesday 2025-01-29
Posted Jan 28, 2025 - 08:46 PST

Update

We are continuing to investigate this issue.
Posted Jan 28, 2025 - 03:20 PST

Update

We are continuing to investigate this issue.
Posted Jan 28, 2025 - 03:19 PST

Update

We are continuing to investigate this issue.
Posted Jan 28, 2025 - 03:18 PST

Update

We are continuing to investigate this issue.
Posted Jan 28, 2025 - 03:17 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Jan 28, 2025 - 01:37 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 16:00 UTC/08:00 PST.
Posted Jan 27, 2025 - 02:01 PST

Update

We are still investigating this issue.

A further update will be provided at 09:00 UTC/01:00 PST on Monday 2025-01-27
Posted Jan 24, 2025 - 07:17 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 15:00 UTC/07:00 PST.
Posted Jan 24, 2025 - 04:59 PST

Update

We are continuing to investigate this issue.

A further update will be provided at 13:00 UTC/05:00 PST.
Posted Jan 24, 2025 - 02:49 PST

Update

We are still investigating this issue.

A further update will be provided at 09:00 UTC/01:00 PST on Friday 2025-01-24
Posted Jan 23, 2025 - 07:03 PST

Update

We are still investigating this issue.

A further update will be provided at 08:00 PST/16:00 UTC
Posted Jan 23, 2025 - 04:04 PST

Update

We are still investigating this issue.

A further update will be provided at 05:00 PST/13:00 UTC
Posted Jan 23, 2025 - 01:04 PST

Update

We are still investigating this issue.

A further update will be provided at 09:00 UTC/01:00 PST on Thursday 2025-01-23
Posted Jan 22, 2025 - 09:29 PST

Update

We are still investigating this issue.

A further update will be provided at 07:00 PST/15:00 UTC
Posted Jan 22, 2025 - 03:24 PST

Update

We are still investigating this issue.

A further update will be provided at 05:00 PST/13:00 UTC
Posted Jan 22, 2025 - 01:02 PST

Update

We are still investigating this issue.

A further update will be provided at 09:00 UTC/01:00 PST on Wednesday 2025-01-22
Posted Jan 21, 2025 - 08:08 PST

Update

We are still investigating this issue.

A further update will be provided at 09:00 PST/17:00 UTC
Posted Jan 21, 2025 - 06:17 PST

Update

We are still investigating this issue.

A further update will be provided at 07:00 PST/15:00 UTC
Posted Jan 21, 2025 - 03:35 PST

Update

We are still investigating this issue.

A further update will be provided at 05:00 PST/13:00 UTC
Posted Jan 21, 2025 - 01:06 PST

Update

We are still investigating this issue.

A further update will be provided at 09:00 UTC/01:00 PST on Tuesday 2025-01-21
Posted Jan 20, 2025 - 08:35 PST

Update

We are still investigating this issue.

A further update will be provided at 09:00 PST/17:00 UTC
Posted Jan 20, 2025 - 05:09 PST

Update

We are still investigating this issue.

A further update will be provided at 07:00 PST/15:00 UTC
Posted Jan 20, 2025 - 03:06 PST

Update

We are still investigating this issue.

A further update will be provided at 05:00 PST/13:00 UTC
Posted Jan 20, 2025 - 01:04 PST

Update

We are still investigating this issue.

A further update will be provided at 09:00 UTC/01:00 PST on Monday 2025-01-20
Posted Jan 17, 2025 - 07:08 PST

Update

We are still investigating this issue.

A further update will be provided at 07:00 PST/15:00 UTC
Posted Jan 17, 2025 - 04:58 PST

Update

We are still investigating this issue.

A further update will be provided at 05:00 PST/13:00 UTC
Posted Jan 17, 2025 - 03:00 PST

Update

We are still investigating this issue.

A further update will be provided at 03:00 PST/11:00 UTC
Posted Jan 17, 2025 - 01:03 PST

Update

We are still investigating this issue.

Sessions are impacted; however, downloading the RDP file and connecting via RDP remains a viable workaround.

A further update will be provided at 01:00 PST/09:00 UTC
Posted Jan 16, 2025 - 07:08 PST

Update

We are still investigating this issue. We will provide further information at 08:00 PST
Posted Jan 16, 2025 - 04:51 PST

Investigating

We are currently experiencing a partial outage in which SRA sessions close unexpectedly after some time.

A workaround is possible by downloading the RDP file and connecting locally, rather than starting the session from SRA.

We are working on fixing the issue. Further updates will be provided at 05:46 PST.
Posted Jan 16, 2025 - 03:34 PST
This incident affected: One Identity Starling EMEA (Remote Access), One Identity Starling NA (Remote Access), and One Identity Starling (Remote Access).