Getting Started
DetectMate enables the creation of log analysis pipelines that analyze log data streams and detect violations or anomalies. It can be run from the console or embedded in Python programs as a library. Designed to run analyses with limited resources and the lowest possible permissions, DetectMate is suitable for use on production servers.
Logfile analysis consists of two main steps: first, parsing log lines, and second, detecting anomalies within those parsed lines. In modern systems, multiple applications process logs at various stages, creating a flow from raw log ingestion to final anomaly detection, often across different network nodes. Building suitable log pipelines for such environments requires a highly configurable system. DetectMate therefore uses a microservice architecture that allows its components to be connected as needed.
The following diagram illustrates a typical log analysis pipeline:

Logfile ingestion is handled by Fluentd. It supports reading from various systems and can convert the data to a specific format before sending it to another target. In our example, it reads lines from a file and sends them to the DetectMate parser. The parser processes the data and forwards it to the detector. If the detector finds an anomaly, it sends it to another Fluentd process that can communicate with various targets, such as Elasticsearch, Kafka, or a log file. To make configuring such a pipeline straightforward, the DetectMateService repository ships a boilerplate Docker Compose file. This tutorial uses that Docker Compose file so that we can focus on the anomaly detection itself.
The Objective
In this tutorial, we will set up a log data analysis pipeline that reads Nginx access logs. We will then train the detector on the URL paths of a few HTTP requests. Finally, we will trigger an anomaly by requesting a path that the trained model has not seen.
Preparation
We will set up DetectMate on a fresh installation of Ubuntu 24.04 (Noble):
alice@ubuntu2404:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.4 LTS
Release: 24.04
Codename: noble
In this tutorial, we want to find anomalies in the Nginx access log, so let's install Nginx:
alice@ubuntu2404:~$ sudo apt update && sudo apt install nginx -y
Hit:1 http://at.archive.ubuntu.com/ubuntu noble InRelease
Hit:2 http://at.archive.ubuntu.com/ubuntu noble-updates InRelease
Hit:3 http://at.archive.ubuntu.com/ubuntu noble-backports InRelease
Hit:4 http://security.ubuntu.com/ubuntu noble-security InRelease
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
9 packages can be upgraded. Run 'apt list --upgradable' to see them.
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
nginx-common
Suggested packages:
fcgiwrap nginx-doc ssl-cert
The following NEW packages will be installed:
nginx nginx-common
0 upgraded, 2 newly installed, 0 to remove and 9 not upgraded.
Need to get 565 kB of archives.
After this operation, 1,596 kB of additional disk space will be used.
Get:1 http://at.archive.ubuntu.com/ubuntu noble-updates/main amd64 nginx-common all 1.24.0-2ubuntu7.6 [43.5 kB]
Get:2 http://at.archive.ubuntu.com/ubuntu noble-updates/main amd64 nginx amd64 1.24.0-2ubuntu7.6 [521 kB]
Fetched 565 kB in 0s (2,816 kB/s)
Preconfiguring packages ...
Selecting previously unselected package nginx-common.
(Reading database ... 87543 files and directories currently installed.)
Preparing to unpack .../nginx-common_1.24.0-2ubuntu7.6_all.deb ...
Unpacking nginx-common (1.24.0-2ubuntu7.6) ...
Selecting previously unselected package nginx.
Preparing to unpack .../nginx_1.24.0-2ubuntu7.6_amd64.deb ...
Unpacking nginx (1.24.0-2ubuntu7.6) ...
Setting up nginx-common (1.24.0-2ubuntu7.6) ...
Created symlink /etc/systemd/system/multi-user.target.wants/nginx.service → /usr/lib/systemd/system/nginx.service.
Setting up nginx (1.24.0-2ubuntu7.6) ...
* Upgrading binary nginx [ OK ]
Processing triggers for man-db (2.12.0-4build2) ...
Processing triggers for ufw (0.36.2-6) ...
Scanning processes...
Scanning linux images...
Running kernel seems to be up-to-date.
No services need to be restarted.
No containers need to be restarted.
No user sessions are running outdated binaries.
No VM guests are running outdated hypervisor (qemu) binaries on this host.
alice@ubuntu2404:~$
We can now send an HTTP request to our local Nginx:
alice@ubuntu2404:~$ curl http://localhost
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
alice@ubuntu2404:~$
Now we should have at least one line in /var/log/nginx/access.log:
alice@ubuntu2404:~$ sudo cat /var/log/nginx/access.log
::1 - - [18/Mar/2026:11:43:30 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/8.5.0"
Since we have a working web server, we can now move on to deploying the log data anomaly detection pipeline.
Deploying the Pipeline
We need Docker and Docker Compose for the deployment. A comprehensive tutorial about how to install Docker can be found at https://docs.docker.com/engine/install/ubuntu/
In this section, we will focus solely on the installation commands.
First, run the following command to uninstall all conflicting packages:
sudo apt remove $(dpkg --get-selections docker.io docker-compose docker-compose-v2 docker-doc podman-docker containerd runc | cut -f1)
Now set up Docker's apt repository:
# Add Docker's official GPG key:
sudo apt update
sudo apt install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
sudo tee /etc/apt/sources.list.d/docker.sources <<EOF
Types: deb
URIs: https://download.docker.com/linux/ubuntu
Suites: $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}")
Components: stable
Signed-By: /etc/apt/keyrings/docker.asc
EOF
sudo apt update
Install Docker and Docker Compose:
sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Note
Since we did not add our user to the docker group, we have to use sudo for docker compose!
With Docker Compose working, we will now download the DetectMateService repository using git:
alice@ubuntu2404:~$ git clone https://github.com/ait-detectmate/DetectMateService.git
Cloning into 'DetectMateService'...
remote: Enumerating objects: 2303, done.
remote: Counting objects: 100% (372/372), done.
remote: Compressing objects: 100% (227/227), done.
remote: Total 2303 (delta 171), reused 171 (delta 131), pack-reused 1931 (from 3)
Receiving objects: 100% (2303/2303), 3.97 MiB | 9.12 MiB/s, done.
Resolving deltas: 100% (1229/1229), done.
alice@ubuntu2404:~$ cd DetectMateService/
Let's start the default pipeline, just to test it:
alice@ubuntu2404:~/DetectMateService$ sudo docker compose up -d
[+] up 6/6
✔ Container detectmateservice-fluentout-1 Started
✔ Container prometheus Started
✔ Container grafana Started
✔ Container detectmateservice-detector-1 Started
✔ Container detectmateservice-parser-1 Started
✔ Container detectmateservice-fluentin-1 Started
alice@ubuntu2404:~/DetectMateService$
To check the status of the containers, we can use docker compose ps:
alice@ubuntu2404:~/DetectMateService$ sudo docker compose ps
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
detectmateservice-detector-1 detectmateservice-detector "uv run detectmate -…" detector 4 minutes ago Up 3 minutes 0.0.0.0:8002->8000/tcp, [::]:8002->8000/tcp
detectmateservice-fluentin-1 detectmateservice-fluentin "tini -- /bin/entryp…" fluentin 4 minutes ago Up 3 minutes 5140/tcp, 24224/tcp
detectmateservice-fluentout-1 detectmateservice-fluentout "tini -- /bin/entryp…" fluentout 4 minutes ago Up 3 minutes 5140/tcp, 24224/tcp
detectmateservice-parser-1 detectmateservice-parser "uv run detectmate -…" parser 4 minutes ago Up 3 minutes 0.0.0.0:8001->8000/tcp, [::]:8001->8000/tcp
grafana grafana/grafana:latest "/run.sh" grafana 4 minutes ago Up 3 minutes 0.0.0.0:3000->3000/tcp, [::]:3000->3000/tcp
prometheus prom/prometheus:latest "/bin/prometheus --c…" prometheus 4 minutes ago Up 3 minutes 9090/tcp
For now, we will shut down all containers:
alice@ubuntu2404:~/DetectMateService$ sudo docker compose down -v
[+] down 9/9
✔ Container grafana Removed
✔ Container detectmateservice-fluentin-1 Removed
✔ Container prometheus Removed
✔ Container detectmateservice-parser-1 Removed
✔ Container detectmateservice-detector-1 Removed
✔ Container detectmateservice-fluentout-1 Removed
✔ Volume detectmateservice_grafana_data Removed
✔ Network detectmateservice_default Removed
✔ Volume detectmateservice_prometheus_data Removed
We now have all requirements installed and a boilerplate Docker Compose template that starts an initial pipeline. In the next sections, we will reconfigure that pipeline so that it reads the access.log and generates anomalies.
Mount the access.log
The preconfigured pipeline reads logs from container/fluentlogs/some.log. To read the Nginx access.log file instead, we need to mount /var/log/nginx into the fluentin container and modify the Fluentd configuration so that it reads access.log.
First, we edit docker-compose.yml and change only line 11, the log volume of the fluentin service, so that it mounts /var/log/nginx to /fluentd/log:
# version: "3"
services:
  fluentin:
    #image: dm-fluentd:latest
    build:
      context: .
      dockerfile: container/Dockerfile_fluentd
    volumes:
      - '$PWD/container/fluentin:/fluentd/etc'
      - '/var/log/nginx:/fluentd/log'
      - '$PWD/container/run:/run'
    depends_on:
      - parser
  parser:
    # image: detectmate:dev-0.1.6
    build: .
    volumes:
      - '$PWD/container/config:/config'
      - '$PWD/container/logs:/logs'
      - '$PWD/container/run:/run'
    command: uv run detectmate --settings /config/parser_settings.yaml --config /config/parser_config.yaml
    ports:
      - "8001:8000"
    depends_on:
      - detector
  detector:
    # image: detectmate:dev-0.1.6
    build: .
    volumes:
      - '$PWD/container/config:/config'
      - '$PWD/container/logs:/logs'
      - '$PWD/container/run:/run'
    command: uv run detectmate --settings /config/detector_settings.yaml --config /config/detector_config.yaml
    ports:
      - "8002:8000"
    depends_on:
      - fluentout
  fluentout:
    # image: dm-fluentd:latest
    build:
      context: .
      dockerfile: container/Dockerfile_fluentd
    volumes:
      - '$PWD/container/fluentout:/fluentd/etc'
      - '$PWD/container/fluentlogs:/fluentd/log'
      - '$PWD/container/run:/run'
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./container/prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
    expose:
      - 9090
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    depends_on:
      - prometheus
    volumes:
      - ./container/grafana/prometheus.yml:/etc/grafana/provisioning/datasources/prometheus.yml
      - grafana_data:/var/lib/grafana
  # kafka:
  #   image: apache/kafka-native
  #   ports:
  #     - "9092:9092"
  #   environment:
  #     KAFKA_LISTENERS: CONTROLLER://localhost:9091,HOST://0.0.0.0:9092,DOCKER://0.0.0.0:9093
  #     KAFKA_ADVERTISED_LISTENERS: DOCKER://kafka:9093,HOST://kafka:9092
  #     KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: CONTROLLER:PLAINTEXT,DOCKER:PLAINTEXT,HOST:PLAINTEXT
  #
  #     # Settings required for KRaft mode
  #     KAFKA_NODE_ID: 1
  #     KAFKA_PROCESS_ROLES: broker,controller
  #     KAFKA_CONTROLLER_LISTENER_NAMES: CONTROLLER
  #     KAFKA_CONTROLLER_QUORUM_VOTERS: 1@localhost:9091
  #
  #     # Listener to use for broker-to-broker communication
  #     KAFKA_INTER_BROKER_LISTENER_NAME: DOCKER
  #
  #     # Required for a single node cluster
  #     KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
volumes:
  prometheus_data:
    driver: local
  grafana_data:
    driver: local
Now that the access.log is available in the container, we have to point Fluentd at that file. We need to edit the file container/fluentin/fluent.conf and replace path /fluentd/log/some.log with path /fluentd/log/access.log:
<source>
  @type tail
  @id input_tail
  <parse>
    @type none
  </parse>
  path /fluentd/log/access.log
  path_key logSource
  tag nng.*
</source>
<match nng.**>
  @type nng
  uri ipc:///run/parser.engine.ipc
  <inject>
    hostname_key hostname
    # overwrite hostname:
    # hostname somehost
  </inject>
  <buffer>
    flush_mode immediate
  </buffer>
  <format>
    @type detectmate
  </format>
</match>
The Nginx access.log is now mounted into the fluentin container, and Fluentd is configured to read the correct file. We can finally look into the DetectMate configuration and generate anomalies.
DetectMate Config
The log pipeline uses two DetectMate services, parser and detector. The parser splits the log line into meaningful tokens, which the detector then uses to identify anomalies. We need to configure the parser and detector. Since the detector needs to know which tokens it receives from the parser so it can look for anomalies, the two configurations are closely related.
Parser
The Nginx log line we previously created has a very specific format:
::1 - - [18/Mar/2026:11:43:30 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/8.5.0"
This format can be described as follows:
<IP> - - [<Time>] "<Method> <URL> <Protocol>" <Status> <Bytes> "<Referer>" "<UserAgent>"
DetectMate includes a matcher_parser that can split such a log line. The configuration for the parser is in container/config/parser_config.yaml:
parsers:
  MatcherParser:
    method_type: matcher_parser
    auto_config: false
    log_format: '<IP> - - [<Time>] "<Method> <URL> <Protocol>" <Status> <Bytes> "<Referer>" "<UserAgent>"'
    time_format: null
    params:
      remove_spaces: false
      remove_punctuation: false
      lowercase: false
      path_templates: /config/templates.txt # empty file because there are no templates necessary for apache access logs
We don't need to modify this configuration: although its comment mentions Apache, the default Nginx access.log uses the same combined log format that the preconfigured log_format describes. We can therefore continue with the configuration of the detector.
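To make the effect of log_format more tangible, the following Python sketch turns the same template into a regular expression and splits the example log line into named tokens. It is only an illustration of the idea, not the implementation of matcher_parser; the helper names format_to_regex and parse_line are hypothetical.

```python
import re
from typing import Optional

# The log_format string from parser_config.yaml.
LOG_FORMAT = (
    '<IP> - - [<Time>] "<Method> <URL> <Protocol>" '
    '<Status> <Bytes> "<Referer>" "<UserAgent>"'
)


def format_to_regex(log_format: str) -> "re.Pattern[str]":
    """Build a regex with one named group per <Field> (hypothetical helper)."""
    # Splitting on a capturing group alternates literal text and field names.
    parts = re.split(r"<(\w+)>", log_format)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)        # literal text between fields
        else:
            pattern += f"(?P<{part}>.*?)"     # non-greedy capture of a field
    return re.compile("^" + pattern + "$")


def parse_line(line: str, log_format: str = LOG_FORMAT) -> Optional[dict]:
    """Return the tokens of a log line, or None if it does not match."""
    match = format_to_regex(log_format).match(line.strip())
    return match.groupdict() if match else None


line = '::1 - - [18/Mar/2026:11:43:30 +0000] "GET / HTTP/1.1" 200 615 "-" "curl/8.5.0"'
fields = parse_line(line)
print(fields)
```

Each key of the resulting dictionary corresponds to one placeholder of the template, which is why the detector configuration can later refer to a token by name, for example URL.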
Detector
The simplest way to generate anomalies is to watch a single field of the parsed data and learn all its values during training. As soon as the detector switches from training mode to detection mode, all values not found in the trained model are flagged as anomalies. DetectMate ships with a new_value_detector that can do exactly that. The config container/config/detector_config.yaml looks as follows:
detectors:
  NewValueDetector:
    method_type: new_value_detector
    data_use_training: 2
    auto_config: false
    global: # define global instance for new_value_detector similar to "events"
      global_instance: # define instance name
        header_variables: # another level to have the same structure as "events"
          - pos: URL
Here, the URL token from the parsed data is monitored (- pos: URL), and the first two log lines are used for training (data_use_training: 2). Any subsequent log lines will be evaluated for anomalies and compared against the values seen during training on the first two log lines.
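The train-then-detect behaviour described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code of new_value_detector: the class NewValueSketch and its attributes are hypothetical, and the sketch assumes that values seen after training are never added to the model.

```python
from typing import Optional


class NewValueSketch:
    """Learn a field's values from the first N events, then flag unseen ones."""

    def __init__(self, field: str, training_size: int) -> None:
        self.field = field                  # token to watch, e.g. "URL"
        self.training_size = training_size  # plays the role of data_use_training
        self.seen_events = 0
        self.known_values = set()

    def process(self, event: dict) -> Optional[str]:
        """Return an alert text for an unknown value, otherwise None."""
        value = event[self.field]
        self.seen_events += 1
        if self.seen_events <= self.training_size:
            self.known_values.add(value)    # training phase: learn, never alert
            return None
        if value not in self.known_values:  # detection phase
            return f"Unknown value: '{value}'"
        return None


detector = NewValueSketch(field="URL", training_size=2)
results = [
    detector.process({"URL": url})
    for url in ["/hello", "/world", "/hello", "/foobar"]
]
print(results)
```

In this run, the first two events train the model, the repeated /hello passes silently, and only /foobar produces an alert text.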
Now let's start the pipeline using sudo docker compose up -d. Once it is running, we will send two requests with two different URL paths to produce the log lines used for training:
alice@ubuntu2404:~/DetectMateService$ sudo docker compose up -d
[+] up 7/7
✔ Network detectmateservice_default Created
✔ Container prometheus Started
✔ Container detectmateservice-fluentout-1 Started
✔ Container detectmateservice-detector-1 Started
✔ Container grafana Started
✔ Container detectmateservice-parser-1 Started
✔ Container detectmateservice-fluentin-1 Started
alice@ubuntu2404:~/DetectMateService$ sudo docker compose ps
NAME IMAGE COMMAND SERVICE CREATED STATUS PORTS
detectmateservice-detector-1 detectmateservice-detector "uv run detectmate -…" detector 7 seconds ago Up 5 seconds 0.0.0.0:8002->8000/tcp, [::]:8002->8000/tcp
detectmateservice-fluentin-1 detectmateservice-fluentin "tini -- /bin/entryp…" fluentin 7 seconds ago Up 4 seconds 5140/tcp, 24224/tcp
detectmateservice-fluentout-1 detectmateservice-fluentout "tini -- /bin/entryp…" fluentout 8 seconds ago Up 6 seconds 5140/tcp, 24224/tcp
detectmateservice-parser-1 detectmateservice-parser "uv run detectmate -…" parser 7 seconds ago Up 5 seconds 0.0.0.0:8001->8000/tcp, [::]:8001->8000/tcp
grafana grafana/grafana:latest "/run.sh" grafana 7 seconds ago Up 5 seconds 0.0.0.0:3000->3000/tcp, [::]:3000->3000/tcp
prometheus prom/prometheus:latest "/bin/prometheus --c…" prometheus 8 seconds ago Up 6 seconds 9090/tcp
alice@ubuntu2404:~/DetectMateService$
It can take a couple of minutes until the parser and detector containers are ready. You can check their state by executing sudo docker compose logs parser or sudo docker compose logs detector.
When a container is ready, its output shows the line Uvicorn running on followed by the listen address, or HTTP requests to the /metrics endpoint:
parser-1 | [2026-03-18 15:21:45,017] INFO detectmatelibrary.parsers.json_parser.MatcherParser.b7ce95e085705d4d87b71db2d1392f08: setup_io: ready to process messages
parser-1 | [2026-03-18 15:21:45,017] INFO detectmatelibrary.parsers.json_parser.MatcherParser.b7ce95e085705d4d87b71db2d1392f08: HTTP Admin active at 0.0.0.0:8000
parser-1 | [2026-03-18 15:21:45,018] INFO detectmatelibrary.parsers.json_parser.MatcherParser.b7ce95e085705d4d87b71db2d1392f08: Auto-starting engine...
parser-1 | [2026-03-18 15:21:45,018] INFO detectmatelibrary.parsers.json_parser.MatcherParser.b7ce95e085705d4d87b71db2d1392f08: engine started
parser-1 | INFO: Started server process [61]
parser-1 | INFO: Waiting for application startup.
parser-1 | INFO: Application startup complete.
parser-1 | INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
parser-1 | INFO: 172.18.0.2:43378 - "GET /metrics HTTP/1.1" 200 OK
parser-1 | INFO: 172.18.0.2:39840 - "GET /metrics HTTP/1.1" 200 OK
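Instead of reading the logs by hand, the readiness check can also be scripted. The sketch below polls an HTTP endpoint until it answers with status 200. It assumes that the /metrics endpoint seen in the log output is reachable from the host through the published ports 8001 (parser) and 8002 (detector); the function name wait_for_http and its default timeouts are hypothetical.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout_s: float = 180.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout_s` has elapsed."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            pass  # service not reachable yet; retry until the deadline
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example call (assumes the pipeline from this tutorial is running):
#   wait_for_http("http://localhost:8001/metrics")   # parser
#   wait_for_http("http://localhost:8002/metrics")   # detector
```

The function returns True as soon as the service responds and False if the timeout expires first, so it can be used as a guard before generating the training requests.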
Now generate two access.log lines:
alice@ubuntu2404:~/DetectMateService$ curl http://localhost/hello
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.24.0 (Ubuntu)</center>
</body>
</html>
alice@ubuntu2404:~/DetectMateService$ curl http://localhost/world
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.24.0 (Ubuntu)</center>
</body>
</html>
alice@ubuntu2404:~/DetectMateService$
The detector has now been trained with the two values /hello and /world. As soon as we request any URL other than /hello or /world, it should report an anomaly. Anomalies are logged in the directory container/fluentlogs/output.%Y%m%d. Request an unknown path, use ls container/fluentlogs/output.%Y%m%d to find the file named buffer.<id>.log, and have a look at it:
alice@ubuntu2404:~/DetectMateService$ curl http://localhost/foobar
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx/1.24.0 (Ubuntu)</center>
</body>
</html>
alice@ubuntu2404:~/DetectMateService$ sudo cat container/fluentlogs/output.%Y%m%d/buffer.q64d4e42c345866e56bc160786171b408.log
2026-03-18T15:39:43+00:00 nng.input {"__version__":"1.0.0","detectorID":"NewValueDetector","detectorType":"new_value_detector","alertID":"10","detectionTimestamp":1773848383,"logIDs":["e5d922c8-19e1-47d1-842b-7bbabecb384d"],"score":1.0,"extractedTimestamps":[1773848383],"description":"NewValueDetector detects values not encountered in training as anomalies.","receivedTimestamp":1773848383,"alertsObtain":{"Global - URL":"Unknown value: '/foobar'"}}
alice@ubuntu2404:~/DetectMateService$
Great! We detected our first anomaly.
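For further processing, an alert line like the one above can be decoded with a few lines of Python. The sketch assumes the layout seen in the output file, that is, a timestamp, a tag, and a JSON document separated by whitespace; the function name parse_alert_line and the keys of its result are hypothetical.

```python
import json

# The alert line from the output file above, split for readability.
SAMPLE = (
    '2026-03-18T15:39:43+00:00 nng.input '
    '{"__version__":"1.0.0","detectorID":"NewValueDetector",'
    '"detectorType":"new_value_detector","alertID":"10",'
    '"detectionTimestamp":1773848383,'
    '"logIDs":["e5d922c8-19e1-47d1-842b-7bbabecb384d"],"score":1.0,'
    '"extractedTimestamps":[1773848383],'
    '"description":"NewValueDetector detects values not encountered in training as anomalies.",'
    '"receivedTimestamp":1773848383,'
    '"alertsObtain":{"Global - URL":"Unknown value: \'/foobar\'"}}'
)


def parse_alert_line(line: str) -> dict:
    """Split '<time> <tag> <json>' and pick the most relevant alert fields."""
    # maxsplit=2 keeps the JSON payload intact even though it contains spaces.
    timestamp, tag, payload = line.strip().split(maxsplit=2)
    alert = json.loads(payload)
    return {
        "time": timestamp,
        "tag": tag,
        "detector": alert["detectorID"],
        "score": alert["score"],
        "findings": alert["alertsObtain"],
    }


summary = parse_alert_line(SAMPLE)
print(summary["detector"], summary["findings"])
```

The findings entry carries the monitored variable and the offending value, which is usually the part of the alert worth forwarding or displaying.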
This was a very basic example, but it shows how to deploy a full log data anomaly detection pipeline with two DetectMate services: a parser that understands the Nginx access log format and a detector that uses the parsed tokens to flag an anomaly.