Monitoring
This page describes how to monitor a Squash Orchestrator installation.
Logging
As is customary with Docker and Kubernetes deployments, all logs are written to the console.
By default, the log level is INFO. You can change it for all services by setting the DEBUG_LEVEL environment variable. Possible values are NOTSET, DEBUG, INFO (the default), WARNING, ERROR, and FATAL, listed from the most verbose to the least verbose.
Info
You can fine-tune the desired log level per service. For more information please refer to Environment variables.
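The level names listed above coincide with those of Python's standard logging module, so a value can be checked before it is put in a deployment manifest. The following is a minimal sketch for that purpose: `validate_debug_level` and `verbosity_rank` are hypothetical helpers, not part of the orchestrator, and the sketch assumes the variable holds an uppercase level name.

```python
import logging
import os

# Level names listed on this page, from most to least verbose.
ALLOWED_LEVELS = ("NOTSET", "DEBUG", "INFO", "WARNING", "ERROR", "FATAL")


def validate_debug_level(value=None):
    """Return a valid level name, falling back to INFO when unset.

    Hypothetical pre-deployment check; raises ValueError on an
    unknown level name.
    """
    if value is None:
        value = os.environ.get("DEBUG_LEVEL", "INFO")
    if value not in ALLOWED_LEVELS:
        raise ValueError(f"Unknown DEBUG_LEVEL: {value!r}")
    return value


def verbosity_rank(level):
    """Numeric rank from the logging module; lower means more verbose."""
    return getattr(logging, level)
```

Comparing ranks, for example `verbosity_rank("DEBUG") < verbosity_rank("INFO")`, gives a quick way to confirm that a proposed level is more verbose than the default.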
Watchdog
The Squash Orchestrator is a collection of services running together. It contains a built-in watchdog: if any service dies, the whole instance dies too.
When the watchdog is triggered, the failed services are listed in the console log.
Temporary Storage
A Squash Orchestrator deployment is stateless: it may include configuration files, but no data is generated on disk that would have to be preserved upon restart.
It does use temporary storage (typically on /tmp) to hold information on running and
recently completed workflows.
The retention policy can be configured as described in the Observer Service documentation.
By default, information related to a given workflow is kept for one hour after the workflow's completion or cancellation.
Liveness endpoint
The Squash Orchestrator receptionist service provides a POST /workflows?ping endpoint that returns a 200 status code with a Pong! message if the orchestrator service is operational.
This endpoint requires a valid token that is allowed to create workflows.
Bash:
curl -X POST \
  -H "Authorization: Bearer ${TOKEN}" \
  "http://orchestrator.example.com/workflows?ping"
Windows Command Prompt:
curl -X POST ^
  -H "Authorization: Bearer %TOKEN%" ^
  "http://orchestrator.example.com/workflows?ping"
PowerShell:
curl.exe -X POST `
  -H "Authorization: Bearer $Env:TOKEN" `
  "http://orchestrator.example.com/workflows?ping"
A successful call returns:
{
  "apiVersion": "v1",
  "kind": "Status",
  "metadata": {},
  "message": "Pong!",
  "details": null,
  "status": "Success",
  "reason": "OK",
  "code": 200
}
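The same check can be scripted for use in an external monitor. The sketch below uses only the Python standard library; `is_pong` and `ping` are hypothetical helper names, and the base URL and token are placeholders supplied by the caller. It treats any connection error, timeout, or non-200 answer as "not alive", which is an assumption about how a monitor would want to react rather than documented behavior.

```python
import json
import urllib.request


def is_pong(status_code, body):
    """Return True if the response matches the documented liveness reply."""
    if status_code != 200:
        return False
    try:
        document = json.loads(body)
    except (TypeError, ValueError):
        # Missing body or a body that is not valid JSON.
        return False
    return isinstance(document, dict) and document.get("message") == "Pong!"


def ping(base_url, token, timeout=5):
    """POST to /workflows?ping and report whether the orchestrator answered.

    base_url and token are placeholders provided by the caller.
    """
    request = urllib.request.Request(
        f"{base_url}/workflows?ping",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_pong(response.status, response.read())
    except OSError:
        # Covers connection errors, timeouts, and HTTP error statuses.
        return False
```

Keeping the response check in `is_pong` separate from the network call makes it possible to verify the parsing logic against a saved response without reaching a live orchestrator.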
Activity endpoint
GET /subscriptions
The Squash Orchestrator event bus service exposes a GET /subscriptions endpoint that returns the list of active subscriptions and their statuses as a JSON document.
This endpoint requires a valid token that is allowed to list subscriptions.
Each subscription manifest contains a status part that shows the last publication timestamp, the publication count, the publication status summary, and the quarantine status:
{
  "apiVersion": "opentestfactory.org/v1alpha1",
  "kind": "Subscription",
  "metadata": {
    "name": "allinone",
    "subscription_id": "fa50d95d-98b0-42d5-be56-10f24e1f2736"
  },
  "spec": {
    ...
  },
  "status": {
    "lastPublicationTimestamp": "2024-05-30T02:54:22.387308",
    "publicationCount": 9,
    "publicationStatusSummary": {
      "200": 9
    },
    "quarantine": 0
  }
}
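The status part of a manifest like the one above lends itself to automated checks. The sketch below flags manifests that may deserve attention; `subscription_issues` is a hypothetical helper, and its criteria are assumptions rather than documented rules: it treats a non-zero quarantine value and any publication status code outside the 2xx range as worth reporting.

```python
def subscription_issues(manifest):
    """List reasons a subscription manifest may deserve attention."""
    status = manifest.get("status", {})
    issues = []
    # Assumption: a non-zero quarantine value signals a problem.
    if status.get("quarantine", 0):
        issues.append(f"quarantine={status['quarantine']}")
    for code, count in status.get("publicationStatusSummary", {}).items():
        # Assumption: keys are HTTP status codes stored as strings.
        if not code.startswith("2"):
            issues.append(f"{count} publication(s) answered {code}")
    return issues
```

For the manifest shown above, the function returns an empty list, since all nine publications answered 200 and the quarantine value is 0.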
GET /workflows/status
The Squash Orchestrator observer service exposes a GET /workflows/status endpoint that returns the overall status of the orchestrator as a JSON document.
This endpoint requires a valid token that is allowed to get statuses.
An orchestrator that is currently processing workflows would return something like:
{
  "apiVersion": "v1",
  "kind": "Status",
  "metadata": {},
  "message": "1 workflows in progress",
  "details": {
    "items": ["50c7e5d1-7bdc-46f0-9422-3cc6660d00c0"],
    "status": "BUSY"
  },
  "status": "Success",
  "reason": "OK",
  "code": 200
}
An idle orchestrator would return something like:
{
  "apiVersion": "v1",
  "kind": "Status",
  "metadata": {},
  "message": "No workflow in progress",
  "details": {
    "items": [],
    "status": "IDLE"
  },
  "status": "Success",
  "reason": "OK",
  "code": 200
}
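One possible use of these two responses is to wait for an idle orchestrator before a planned restart. The sketch below only interprets an already-parsed response; `orchestrator_state` and `is_idle` are hypothetical helper names, and the sketch assumes that the details part always has the shape shown in the two samples above.

```python
def orchestrator_state(document):
    """Return (state, workflow_ids) from a GET /workflows/status reply."""
    details = document.get("details") or {}
    return details.get("status"), list(details.get("items", []))


def is_idle(document):
    """True only when the reply reports code 200 and an IDLE state."""
    state, workflow_ids = orchestrator_state(document)
    return document.get("code") == 200 and state == "IDLE" and not workflow_ids
```

Requiring both the IDLE state and an empty items list is a deliberately conservative choice: if the two ever disagree, the sketch reports the orchestrator as not idle.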
Probes
The Squash Orchestrator image includes an opentf-ctl command that can be used in startup, readiness, and liveness probes to help Kubernetes monitor the health of the container. For example:
startupProbe:
  exec:
    command:
      - opentf-ctl
      - get
      - subscriptions
  failureThreshold: 20
  initialDelaySeconds: 1
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 3
readinessProbe:
  exec:
    command:
      - opentf-ctl
      - get
      - subscriptions
  failureThreshold: 3
  initialDelaySeconds: 1
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5
Channels
The Squash Orchestrator observer service exposes a GET /channels endpoint that returns the known channels (execution environments), the capabilities (tags) they provide, and their statuses (IDLE, BUSY, PENDING, or UNREACHABLE) as a JSON document.
Miscellaneous
The Squash Orchestrator observer service exposes a GET /version endpoint that returns the version of the running components as a JSON document.