# Checking AWS ECS Events for 500 Errors

## Background

When applicants report 500 errors or you notice them in ECS logs, you may see errors like:

```
SQLTransientConnectionException: HikariPool-default - Connection is not available, request timed out after 30000ms.
```

This usually means a request waited the full 30 seconds for a database connection from the HikariCP connection pool, and none became available.

When this happens, ECS may mark the container unhealthy because it can no longer reach the database, then replace the task.

Possible causes include a heavy database query (for example, during a large application export), problems with the RDS instance, networking issues within the VPC, or other factors that are still being investigated by the CiviForm team. The exact cause is not always clear from ECS events alone.

This guide walks through how to check ECS Events for task replacements that coincide with these errors. For how to find and filter application logs in ECS, see [Finding and Filtering ECS Logs](/governance-and-management/technical-support/finding-and-filtering-ecs-logs.md).

## How to Check ECS Events

1. Sign in to the [AWS Console](https://console.aws.amazon.com/) and select the correct account and region from the top-right dropdown.
2. Go to **Amazon Elastic Container Service** → **Clusters**.
3. Click your CiviForm cluster (e.g., `prod-civiform`).
4. Under the **Services** tab, click your service (e.g., `prod-civiform-service`).
5. Select the **Events** tab.
6. Filter the date range to cover the time window when errors were reported.

If you've landed in the right place, you should see the **Events** tab for your service with a filterable list of events, like this:

![ECS Events tab for your CiviForm service](/files/MhIJh3ByrUYSO07dFfc1)
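
If you prefer to pull the same events from a script, the console steps above can be approximated as follows. This is a minimal sketch, not an official CiviForm tool: it assumes `boto3` is installed and AWS credentials are configured, and the helper names `fetch_service_events` and `events_in_window` are hypothetical. The ECS `describe_services` call returns only the most recent events for a service (at most 100), so older incidents may not appear.

```python
from datetime import datetime, timezone


def fetch_service_events(cluster, service, region=None):
    """Return the recent events that ECS keeps for one service."""
    # Imported lazily so the filtering helper below works without boto3.
    import boto3

    ecs = boto3.client("ecs", region_name=region)
    response = ecs.describe_services(cluster=cluster, services=[service])
    services = response["services"]
    if not services:
        raise ValueError(f"service {service!r} not found in cluster {cluster!r}")
    # Each event is a dict with "id", "createdAt" (datetime), and "message".
    return services[0]["events"]


def events_in_window(events, start, end):
    """Keep events whose createdAt falls within [start, end], oldest first."""
    selected = [e for e in events if start <= e["createdAt"] <= end]
    return sorted(selected, key=lambda e: e["createdAt"])


# Example usage (cluster and service names are the placeholders from the
# steps above; the time window is hypothetical):
#
#   start = datetime(2024, 1, 15, 14, 0, tzinfo=timezone.utc)
#   end = datetime(2024, 1, 15, 16, 0, tzinfo=timezone.utc)
#   events = fetch_service_events("prod-civiform", "prod-civiform-service")
#   for event in events_in_window(events, start, end):
#       print(event["createdAt"].isoformat(), event["message"])
```

Use timezone-aware datetimes for the window so that it compares cleanly with the `createdAt` values that `boto3` returns.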

## What to Look For

Look for a line like `Amazon ECS replaced 1 tasks due to an unhealthy status`. This means ECS replaced a container that could no longer pass health checks, typically because it had stopped connecting to the database.

During a planned deployment, you will see tasks start and stop, targets deregister, and connections drain. This is normal behavior.

The following example shows the events from a normal deployment. Note that there is no `replaced ... due to an unhealthy status` line:

![ECS Events during a normal deployment — deployment completed and steady state, with no unhealthy replacement message](/files/Jq3zSoVsGSYp4P2EOl3P)
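
The same check can be sketched in code by scanning event messages for the unhealthy-replacement phrase quoted above. This is an illustrative sketch only: the helper name `find_unhealthy_replacements` is hypothetical, and the sample messages are abbreviated stand-ins rather than verbatim ECS output. It expects events shaped like those returned by the ECS `describe_services` API, where each event has a `message` field.

```python
def find_unhealthy_replacements(events):
    """Return events whose message reports a task replaced for being unhealthy."""
    marker = "due to an unhealthy status"
    return [e for e in events if marker in e.get("message", "")]


# Illustrative sample data (abbreviated, not real ECS output).
sample_events = [
    {"message": "service prod-civiform-service has started 1 tasks"},
    {"message": "Amazon ECS replaced 1 tasks due to an unhealthy status"},
    {"message": "service prod-civiform-service has reached a steady state"},
]

flagged = find_unhealthy_replacements(sample_events)
for event in flagged:
    print(event["message"])
```

An empty result for the incident window suggests that ECS did not replace any tasks for failing health checks during that time. The start, stop, and steady-state messages of a normal deployment are not flagged.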

