upda/README.md

439 lines
39 KiB
Markdown

# README
upda - **Up**date **Da**shboard in Go. Please see [motivation](#motivation) and [concepts](#concepts) what this
application does.
There's also a [upda web interface](https://git.myservermanager.com/varakh/upda-ui). It's recommended to take a look (at
least at the screenshots).
In addition, there's a commandline tool called `upda-cli`. For more information, download it and run `./upda-cli help`
for further instructions. This is especially useful, if you have an `upda` (server) running and like to invoke webhooks
from CLI. `upda-cli` is also bundled in the docker images.
**See the [deployment instructions](./_doc/DEPLOYMENT.md) for examples on how to deploy upda and upda-ui**
The main git repository is hosted at
_[https://git.myservermanager.com/varakh/upda](https://git.myservermanager.com/varakh/upda)_.
Other repositories are mirrors and pull requests, issues, and planning are managed there.
Contributions are very welcome!
* [Motivation](#motivation)
* [Concepts](#concepts)
* [Configuration](#configuration)
* [3rd party integrations](#3rd-party-integrations)
* [Webhooks](#webhooks)
* [Actions](#actions)
* [Prometheus Metrics](#prometheus-metrics)
* [Deployment](#deployment)
* [Native](#native)
* [Docker](#docker)
* [Build docker image](#build-docker-image)
* [Development & contribution](#development--contribution)
* [Getting started](#getting-started)
* [Windows hints](#windows-hints)
* [Release](#release)
## Motivation
> [duin](https://crazymax.dev/diun/) can determine which OCI images have updates
> available. [Argus](https://release-argus.io) can query other sources like GitHub and even invoke actions when an
> update
> has been found, but there's no _convenient_ way of having **one** dashboard or source of truth for all of them across
> different hosts without tinkering with collecting them somewhere in one place. This application is the result of that
> tinkering. :-)
Managing various application or OCI container image updates can be a tedious task:
* A lot of hosts to operate with a lot of different applications being deployed
* A lot of different OCI containers to watch for updated images
* No convenient dashboard to see and manage all the available updates in one place
_upda_ manages a list of updates with attributes attached to it. For new updates to arrive, _upda_ needs to be called
via a webhook call (created within _upda_) from other applications, such as a bash script, an
application like [duin](https://crazymax.dev/diun/) or simply by using the `upda-cli`.
After an update is being tracked, _upda_ provides a convenient way to have everything in one place. In addition, it
exposes managed _updates_ as [prometheus](https://prometheus.io) metrics, so that you can easily build a dashboard
in [Grafana](https://grafana.com), or even attach alerts to pending updates
via [alertmanager](https://prometheus.io/docs/alerting/latest/alertmanager/).
In addition, you can use _upda_'s UI to manage updates, e.g. _approve_ them when they have been rolled out to a host.
> _upda_ is **NOT** a scraper to watch docker registries or GitHub releases, it simply tracks and consolidates updates
> from different sources provided via _webhooks_. If you like to watch GitHub releases, write a scraper and
> use `upda-cli` to report back to _upda_.
## Concepts
1. Create a webhook in upda.
2. Use the webhook's URL in a 3rd party application to start tracking an update or use `upda-cli` to report an update.
3. Enjoy visualization and state management of tracked updates in one place.
4. Optionally, define [actions](#actions) for tracked updates as they arrive
_upda_ retrieves new updates when webhooks of upda are invoked, e.g., [duin](https://crazymax.dev/diun/) invokes it or
any other application which can reach the instance. Tracked updates are unique for the
attributes `(application,provider,host)` which means that subsequent updates for an identical _application_, _provider_
and _host_ simply updates the `version` and `metadata` attributes for that tracked _update_ (regardless if the version
or metadata payload _actually_ changed - reasoning behind this is to get reflected metadata updates independent if
version attribute has changed).
State management of tracked updates:
* On first creation, state is set to _pending_.
* When an _update_ is in _approved_ state, an invocation for it resets its state to _pending_.
* _Ignored_ updates are skipped entirely and no attribute is updated.
##### The `application` attribute
The _application_ attribute is an arbitrary identifier, name or label of a subject you like to track,
e.g., `docker.io/varakh/upda` for an OCI image.
##### The `provider` attribute
The _provider_ attribute is an arbitrary name or label. During webhook invocation the provider attribute is derived in
priority:
For the _generic_ webhook:
1. If the incoming payload contains a non-blank `provider` attribute, it's taken from the request.
2. If the incoming payload contains a blank or missing `provider` attribute, the issuing webhook's label is taken.
For the _diun_ webhook:
1. If the issuing webhook's label is blank, then `oci` is used.
2. In any other case, the webhook's label is used.
Because the first priority is the issuing webhook's label, setting the _same_ label for all webhooks results in a
grouping. Also see the _ignore host_ setting for `host` below.
_Remember that changing a webhook's label won't be reflected in already created/tracked updates!_
##### The `host` attribute
_host_ should be set to the originating host name a webhook has been issued from. The _host_
attribute can also be "ignored" (a setting in each webhook). If set to ignored, _upda_ sets _host_ to _global_, thus
update versions can be grouped independent of the originating host. If set for all webhooks, you'll end up with a host
independent update dashboard.
##### The `version` attribute
The _version_ attribute is an arbitrary name or label and subject to change across invocations of webhooks. This can be
a version number, a number of total updates, anything.
##### The `metadata` attribute
An update can hold any additional metadata information provided by request payload `metadata`. Metadata can be inspected
via web interface or API.
## Configuration
The following environment variables can be used to modify application behavior.
| Variable | Purpose | Default/Description |
|:------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------|
| `SECRET` | A 32 character long secure random secret used for encrypting some data inside the database. When data has been created inside the database, the secret cannot be changed anymore, otherwise decryption fails. | Not set by default, you need to explicitly set it, e.g., generate via `openssl rand -hex 16` |
| `TZ` | The time zone (**recommended** to set it properly, background tasks depend on it) | Defaults to `Europe/Berlin`, can be any time zone according to _tz database_ |
| `ADMIN_USER` | Admin user name for login | Not set by default, you need to explicitly set it to user name |
| `ADMIN_PASSWORD` | Admin password for login | Not set by default, you need to explicitly set it to a secure random |
| | | |
| `DB_TYPE` | The database type (Postgres is **recommended**) | Defaults to `sqlite`, possible values are `sqlite` or `postgres` |
| `DB_SQLITE_FILE` | Path to the SQLITE file | Defaults to `<XDG_DATA_DIR>/upda/upda.db`, e.g. `~/.local/share/upda/upda.db` |
| `DB_POSTGRES_HOST` | The postgres host | Postgres host address, defaults to `localhost` |
| `DB_POSTGRES_PORT` | The postgres port | Postgres port, defaults to `5432` |
| `DB_POSTGRES_NAME` | The postgres database name | Postgres database name, needs to be set |
| `DB_POSTGRES_TZ` | The postgres time zone | Postgres time zone settings, defaults to `Europe/Berlin` |
| `DB_POSTGRES_USER` | The postgres user | Postgres user name, needs to be set |
| `DB_POSTGRES_PASSWORD` | The postgres password | Postgres user password, needs to be set |
| | | |
| `SERVER_PORT` | Port | Defaults to `8080` |
| `SERVER_LISTEN` | Server's listen address | Defaults to empty which equals `0.0.0.0` |
| `SERVER_TLS_ENABLED` | If server uses TLS | Defaults `false` |
| `SERVER_TLS_CERT_PATH` | When TLS enabled, provide the certificate path | |
| `SERVER_TLS_KEY_PATH` | When TLS enabled, provide the key path | |
| `SERVER_TIMEOUT` | Timeout the server waits before shutting down to end any pending tasks | Defaults to `1s` (1 second), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| `CORS_ALLOW_ORIGIN` | CORS configuration | Defaults to `*` |
| `CORS_ALLOW_METHODS` | CORS configuration | Defaults to `GET, POST, PUT, PATCH, DELETE, OPTIONS` |
| `CORS_ALLOW_HEADERS` | CORS configuration | Defaults to `Authorization, Content-Type` |
| | | |
| `LOGGING_LEVEL` | Logging level. Possible are `debug`, `info`, `warn`, `error`, `dpanic`, `panic`, `fatal`. Setting to `debug` enables high verbosity output. | Defaults to `info` |
| `LOGGING_ENCODING` | Logging encoding. Possible are `console` and `json` | Defaults to `json` |
| `LOGGING_DIRECTORY` | Logging directory. When set, logs will be added to a file called `upda.log` in addition to the standard output. Ensure that upda has access permissions. Use an external program for log rotation if desired. | |
| | | |
| `WEBHOOKS_TOKEN_LENGTH` | The length of the token | Defaults to `16`, positive number |
| | | |
| `TASK_UPDATE_CLEAN_STALE_ENABLED` | If background task should run to do housekeeping of stale (ignored/approved) updates from the database | Defaults to `false` |
| `TASK_UPDATE_CLEAN_STALE_INTERVAL` | Interval at which a background task does housekeeping by deleting stale (ignored/approved) updates from the database | Defaults to `1h` (1 hour), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| `TASK_UPDATE_CLEAN_STALE_MAX_AGE` | Number defining at which age stale (ignored/approved) updates are deleted by the background task (_updatedAt_ attribute decides) | Defaults to `720h` (168 hours = 1 week), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| | | |
| `TASK_EVENT_CLEAN_STALE_ENABLED` | If background task should run to do housekeeping of stale (old) events from the database | Defaults to `false` |
| `TASK_EVENT_CLEAN_STALE_INTERVAL` | Interval at which a background task does housekeeping by deleting stale (old) events from the database | Defaults to `8h` (8 hours), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| `TASK_EVENT_CLEAN_STALE_MAX_AGE` | Number defining at which age stale (old) events are deleted by the background task (_updatedAt_ attribute decides) | Defaults to `2190h` (2190 hours = 3 months), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| | | |
| `TASK_ACTIONS_ENQUEUE_ENABLED` | If background task should run to enqueue matching actions derived from events (actions are invocation separately after being enqueued) | Defaults to `true` |
| `TASK_ACTIONS_ENQUEUE_INTERVAL` | Interval at which a background task does check to enqueue actions | Defaults to `10s` (10 seconds), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| `TASK_ACTIONS_ENQUEUE_BATCH_SIZE` | Number defining how many unhandled events are processed in a batch by the background task | Defaults to `1`, must be positive number |
| | | |
| `TASK_ACTIONS_INVOKE_ENABLED` | If background task should run to invoke enqueued actions derived | Defaults to `true` |
| `TASK_ACTIONS_INVOKE_INTERVAL` | Interval at which a background task does check to invoke enqueued actions | Defaults to `10s` (10 seconds), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| `TASK_ACTIONS_INVOKE_BATCH_SIZE` | Number defining how many enqueued actions are processed in a batch by the background task | Defaults to `1`, must be positive number |
| `TASK_ACTIONS_INVOKE_MAX_RETRIES` | Number defining how often actions are invoked in case of an error, if exceeded, those actions are not retried again | Defaults to `3`, must be positive number |
| | | |
| `TASK_ACTIONS_CLEAN_STALE_ENABLED` | If background task should run to do housekeeping of stale (handled, meaning success or error state) actions from the database | Defaults to `true` |
| `TASK_ACTIONS_CLEAN_STALE_INTERVAL` | Interval at which a background task does housekeeping by deleting stale (handled) actions from the database | Defaults to `12h` (12 hours), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| `TASK_ACTIONS_CLEAN_STALE_MAX_AGE` | Number defining at which age stale (handled) actions are deleted by the background task (_updatedAt_ attribute decides) | Defaults to `720h` (720 hours = 30 days), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| | | |
| `TASK_PROMETHEUS_REFRESH_INTERVAL` | Interval at which a background task updates custom metrics | Defaults to `60s` (60 seconds), qualifier can be `s = second`, `m = minute`, `h = hour` prefixed with a positive number |
| | | |
| `LOCK_REDIS_ENABLED` | If locking via REDIS (multiple instances) is enabled. Requires REDIS. Otherwise uses in-memory locks. | Defaults to `false` |
| `LOCK_REDIS_URL` | If locking via REDIS is enabled, this should point to a resolvable REDIS instance, e.g. `redis://<user>:<pass>@localhost:6379/<db>`. | |
| | | |
| `PROMETHEUS_ENABLED` | If Prometheus metrics are exposed | Defaults to `false` |
| `PROMETHEUS_METRICS_PATH` | Defines the metrics endpoint path | Defaults to `/metrics` |
| `PROMETHEUS_SECURE_TOKEN_ENABLED` | If Prometheus metrics endpoint is protected by a token when enabled (**recommended**) | Defaults to `true` |
| `PROMETHEUS_SECURE_TOKEN` | The token securing the metrics endpoint when enabled (**recommended**) | Not set by default, you need to explicitly set it to a secure random |
## 3rd party integrations
### Webhooks
This is the core mechanism of _upda_ and why it exists. Webhooks are the central piece of how _upda_ gets notified about
updates.
In order to configure a 3rd party application like [duin](https://crazymax.dev/diun/) to send updates to _upda_ with
the [duin webhook notification configuration](https://crazymax.dev/diun/notif/webhook/), create a new _upda_ webhook via
web interface or via API call.
This gives you
* a unique _upda_ URL to configure in the notification part of [duin](https://crazymax.dev/diun/),
e.g., `/api/v1/webhooks/<a unique identifier>`
* a corresponding token for the URL which must be sent as `X-Webhook-Token` header when calling _upda_'s URL
Expected payload is derived from the _type_ of the webhook which has been created in _upda_.
Example for [duin Webhook notification](https://crazymax.dev/diun/notif/webhook/) `notif`:
```yaml
notif:
webhook:
endpoint: https://upda.domain.tld/api/v1/webhooks/ee03cd9e-04d0-4c7f-9866-efe219c2501e
method: POST
headers:
content-type: application/json
X-Webhook-Token: <the token from webhook creation in upda>
timeout: 10s
```
### Actions
Actions can be used to invoke arbitrary third party tools when an _event_ occurs, e.g., an update has been created or
modified. An action is triggered when its condition meet the action's definition (event name, host, application,
provider).
Actions have types. Different types require different payload to set them up. [shoutrrr](#shoutrrr) is supported as
action type, which can send notifications to a variety of services like Gotify, Ntfy, Teams, OpsGenie and many more.
Supported events are the following:
| Event name | Description |
|:--------------------------------|:--------------------------------------------------------------------|
| `update_created` | An update has been created |
| `update_updated` | An update has been updated (not necessarily its version attribute!) |
| `update_updated_state_pending` | An update's state changed to pending |
| `update_updated_state_approved` | An update's state changed to approved |
| `update_updated_state_ignored` | An update's state changed to ignored |
| `update_deleted` | An update has been removed |
For privacy, an action's configuration supports upda's **secrets** vault, which means that before an action is
triggered, any occurrence of `<SECRET>SECRET_KEY</SECRET>` is properly replaced by the value of the `SECRET_KEY` defined
inside the vault.
In addition to secrets, upda provides **variables** which can be used with the `<VAR>VARIABLE_NAME</VAR>` syntax and any
occurrence is replaced before invocation as well.
| Variable name | Description |
|:-------------------------|:--------------------------------------------------|
| `<VAR>APPLICATION</VAR>` | The update's application name invoking the action |
| `<VAR>PROVIDER</VAR>` | The update's provider name invoking the action |
| `<VAR>HOST</VAR>` | The update's host invoking the action |
| `<VAR>VERSION</VAR>` | The update's version (latest) invoking the action |
#### shoutrrr
[shoutrrr](https://github.com/containrrr/shoutrrr?tab=readme-ov-file#documentation) supports multiple services directly
which can be provided as simple URL, e.g., `gotify://gotify.example.com:443/<token>`, where `<token>`
can also be provided as secret: `gotify://gotify.example.com:443/<SECRET>GOTIFY_TOKEN</SECRET>`.
A full payload for defining an upda shoutrrr action looks like the following. No worries, there's
a [web interface](https://git.myservermanager.com/varakh/upda-ui) for configuring actions:
```json5
{
// ...
"type": "shoutrrr",
"matchEvent": "update_created",
// payload 'urls' and 'body' are specific to the shoutrrr action type
"payload": {
"urls": [
"gotify://myurl/<SECRET>GOTIFY_TOKEN</SECRET>/?title=Great+News+On+Upda"
],
"body": "A new update arrived on <VAR>HOST</VAR> for <VAR>APPLICATION</VAR>. Its version is <VAR>VERSION</VAR>."
}
}
```
### Prometheus Metrics
When `PROMETHEUS_ENABLED` is set to `true`, default metrics about memory utilization, but also custom metrics specific
to _upda_ are exposed under the `PROMETHEUS_METRICS_PATH` endpoint.
A Prometheus scrape configuration might look like the following if `PROMETHEUS_SECURE_TOKEN_ENABLED` is set to `true`.
```shell
scrape_configs:
- job_name: 'upda'
static_configs:
- targets: ['upda:8080']
bearer_token: 'VALUE_OF_PROMETHEUS_SECURE_TOKEN'
```
Custom exposed metrics are exposed under the `upda_` namespace.
Examples:
```shell
# HELP upda_updates_all amount of all updates
upda_updates_all 4
# HELP upda_updates_approved amount of all updates in approved state
upda_updates_approved 2
# HELP upda_updates_ignored amount of all updates in ignored state
upda_updates_ignored 1
# HELP upda_updates_pending amount of all updates in pending state
upda_updates_pending 1
# HELP upda_webhooks amount of all webhooks
upda_webhooks 2
# HELP upda_events amount of all events
upda_events 146
# HELP upda_actions amount of all actions
upda_actions 0
```
There's an example [Grafana](https://grafana.com) dashboard in the `_doc/` folder.
[Alertmanager](https://prometheus.io/docs/alerting/latest/alertmanager/) could check for the following:
```yaml
- name: update_checks
rules:
- alert: UpdatesAvailable
expr: upda_updates == 0 and upda_updates_pending > 0
for: 4w
labels:
severity: high
class: update
annotations:
summary: "Updates available from upda for {{ $labels.job }}"
description: "Updates available from upda for {{ $labels.job }}"
```
## Deployment
### Native
Use the released binary for your platform or run `make clean build-server-{your-platform}` and the binary will be placed
into the `bin/` folder.
### Docker
For examples how to run, look into [deployment instructions](./_doc/DEPLOYMENT.md) which contains examples
for `docker-compose` files.
#### Build docker image
To build docker images, do the following
```shell
docker build --rm --no-cache -t upda:latest .
```
## Development & contribution
* Ensure to set the following environment variables for proper debug logs during development
```shell
DEVELOPMENT=true
LOGGING_ENCODING=console
LOGGING_LEVEL=debug
```
* Code guidelines
* Each entity has its own repository
* Each entity is only used in repository and service (otherwise, mapping happens)
* Presenter layer is constructed from the entity, e.g., in REST responses and mapped
* No entity is directly returned in any REST response
* All log calls should be handled by `zap.L()`
* Configuration is bootstrapped via separated `struct` types which are given to the service which need them
* Error handling
* Always throw an error with `NewServiceError`
* Always wrap the cause error with `fmt.Errorf`
* Forward/bubble up the error directly, when original error is already a `NewServiceError` (most likely internal
calls)
* Always abort handler chain with `AbortWithError`
* Utils can throw any error
Please look into the `_doc/` folder for [OpenAPI specification](./_doc/api.yaml) and a Postman Collection.
### Getting started
1. Run `make clean dependencies` to fetch dependencies
2. Start `git.myservermanager.com/varakh/upda/cmd/server` (or `cli`) as Go application and ensure to have _required_
environment variables set
If you like to test with PSQL and/or REDIS for task locking, here are some useful docker commands to have containers
up and running quickly. Set necessary environment variables properly.
```shell
# postgres
docker run --rm --name=upda-db \
-p 5432:5432 \
--restart=unless-stopped \
-e POSTGRES_USER=upda \
-e POSTGRES_PASSWORD=upda \
-e POSTGRES_DB=upda \
postgres:16-alpine
# redis
docker run --rm --name some-redis \
-p 6379:6379 \
redis redis-server --save 60 1 --loglevel warning
```
#### Windows hints
On Windows, you need a valid `gcc`, e.g., https://jmeubank.github.io/tdm-gcc/download/ and add the `\bin` folder to your
path.
For any `go` command you run, ensure that your `PATH` has the `gcc` binary and that you add `CGO_ENABLED=1` as
environment.
### Release
Releases are handled by the SCM platform and pipeline. Creating a **new git tag**, creates a new release in the SCM
platform, uploads produced artifacts to that release and publishes docker images automatically.
**Before** doing so, please ensure that the **commit on `master`** has the **correct version settings** and has been
built successfully:
* Adapt `constants_app.go` and change `Version` to the correct version number
* Adapt `CHANGELOG.md` to reflect changes and ensure a date is properly set in the header, also add a reference link
in footer (link to scm git tag source)
* Adapt `api.yaml`: `version` attribute must reflect the to be released version
* Adapt `env: VERSION_*` in `.forgejo/workflows/release.yaml`
After the release has been created, ensure to change the following settings for the _next development cycle_:
* Adapt `constants_app.go` and change `Version` to the _next_ version number
* Adapt `CHANGELOG.md` and add an _UNRELEASED_ section
* Adapt `api.yaml`: `version` attribute must reflect the _next_ version number
* Adapt `env: VERSION_*` in `.forgejo/workflows/release.yaml` to _next_ version number