Troubleshooting Production
Outage, site is down
DNS
Symptoms
"This site can’t be reached" in Chrome
Requests timing out
Diagnose
On a Unix command line, run dig <your CiviForm domain>. (Note: Remove the protocol, http or https, from the front of your domain before running the command.) There should be a CNAME entry that points to an AWS load balancer, e.g. seattle-civiform-lb-2038295446.us-west-1.elb.amazonaws.com.
Confirm that the CNAME record matches the public DNS name of your AWS application load balancer by visiting the AWS console at EC2 > Load Balancing > Load Balancers and finding the load balancer for your prod deployment.
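The check above can be sketched as a few shell helpers. This is a minimal sketch, not an official CiviForm tool: the function names and civiform.example.com are hypothetical, and lb_dns_names assumes the AWS CLI is installed and configured for the account and region that host your deployment.

```shell
#!/usr/bin/env bash
# Sketch only: all function names here are illustrative.

# Reduce a URL to its bare hostname, the form that dig expects.
strip_protocol() {
  local host="${1#*://}"   # drop a leading "http://" or "https://"
  host="${host%%/*}"       # drop any path after the hostname
  host="${host%%:*}"       # drop a port, if one was given
  printf '%s\n' "$host"
}

# Print the CNAME target for a domain, without the trailing dot dig adds.
cname_target() {
  dig +short CNAME "$(strip_protocol "$1")" | sed 's/\.$//'
}

# Succeed if two DNS names are equal, ignoring case and a trailing dot.
same_dns_name() {
  local a b
  a="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//')"
  b="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//')"
  [ "$a" = "$b" ]
}

# Assumption: AWS CLI access. Lists the DNS names of your load balancers
# as an alternative to looking them up in the console.
lb_dns_names() {
  aws elbv2 describe-load-balancers \
    --query 'LoadBalancers[].DNSName' --output text
}

# Usage (needs network access):
#   target="$(cname_target https://civiform.example.com)"
#   same_dns_name "$target" "<DNS name shown in the AWS console>" && echo "CNAME matches"
```

Comparing names case-insensitively and without the trailing dot avoids false mismatches, since dig prints fully qualified names while the AWS console does not.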
Resolution
If the CNAME entry is missing or does not match the DNS name you find in AWS, add or update a CNAME entry in your domain registrar with the application load balancer's DNS name.
Server can't start
Symptoms
"This site can’t be reached" in Chrome
Requests timing out
Diagnose
View the ECS cluster for your production deployment in the AWS console by going to ECS > Clusters and clicking the cluster. There should be at least one healthy task. If all tasks are unhealthy or unknown, the server is unable to start.
If no tasks are healthy, view the server logs (see Server errors below). Look for stack traces and error messages.
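The same health check can be approximated from the command line. This is a sketch under assumptions: it requires a configured AWS CLI, the function names are hypothetical, and the cluster name is whatever your deployment uses. Note that list-tasks returns running tasks by default, so tasks that have already stopped will not appear.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative.

# Print one line per task: task ARN, last status, health status.
task_health() {
  local cluster="$1" tasks
  tasks="$(aws ecs list-tasks --cluster "$cluster" \
    --query 'taskArns[]' --output text)"
  if [ -z "$tasks" ]; then
    echo "no running tasks found in cluster $cluster" >&2
    return 1
  fi
  # $tasks is intentionally unquoted so each ARN becomes its own argument.
  aws ecs describe-tasks --cluster "$cluster" --tasks $tasks \
    --query 'tasks[].[taskArn,lastStatus,healthStatus]' --output text
}

# Read task_health output on stdin; succeed if any task reports HEALTHY.
any_healthy() {
  awk '$3 == "HEALTHY" { found = 1 } END { exit !found }'
}

# Usage:
#   task_health "<your cluster name>" | any_healthy \
#     || echo "no healthy tasks: check the server logs"
```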
Resolution
Contact the CiviForm maintainers and include any errors you found in the server logs.
Server errors
Symptoms
The server returns 400- or 500-level errors, or pages that contain only a short plaintext error message.
Resolution
Investigate the server logs and report any errors you find, along with complete stack traces, to the CiviForm maintainers. To view the server logs in the AWS console, go to CloudWatch > Logs > Log groups, select the log group for your production deployment, and view the combined log stream.
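If you prefer a terminal to the console, the logs can also be read with the AWS CLI. This is a sketch that assumes AWS CLI version 2 (which provides the logs tail command) and a configured profile. The function names are hypothetical, and the log group name is a placeholder for the one belonging to your deployment.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative.

# Keep lines mentioning an error or exception, plus the lines that follow,
# so stack traces stay attached. Reads stdin; $1 = trailing lines (default 20).
filter_errors() {
  grep -iE -A "${1:-20}" 'error|exception'
}

# Print recent errors from a log group. $1 = log group name,
# $2 = how far back to look (default: the last hour).
recent_errors() {
  aws logs tail "$1" --since "${2:-1h}" | filter_errors
}

# Usage:
#   recent_errors "<your log group name>" 3h > civiform-errors.txt
```

Saving the output to a file makes it easier to attach complete stack traces when reporting the problem to the maintainers.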
Authentication errors
Symptoms
Users are unable to log in.
Resolution
Contact the CiviForm maintainers with details of your investigation.
Errors related to Authority ID
NOTE: Before changing an Authority ID, verify carefully that the user/account is correct. Changing it without care is a security issue, because the "new" account then has access to the system.