LogBatcher Parser
LLM-based log parser that infers event templates from raw log messages using any OpenAI-compatible model. No training data or labeled examples are required.
| | Schema | Description |
|---|---|---|
| Input | LogSchema | Raw log string |
| Output | ParserSchema | Structured log with template and variables |
Overview
LogBatcherParser wraps the LogBatcher engine (MIT, LogIntelligence 2024) as a CoreParser. Parsing proceeds in two phases:
- Cache lookup — the incoming log is matched against previously seen templates using a hash-based exact match followed by a tree-based similarity check. If a match is found, no LLM call is made.
- LLM query — on a cache miss, the log is submitted to the configured model. The returned template is stored in the cache for future reuse.
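The two phases can be sketched in plain Python. This is a minimal illustration rather than the LogBatcher engine's actual code: `TemplateCache`, `parse_with_cache`, and `fake_llm` are hypothetical names, the tree-based similarity check is replaced by a simple linear wildcard scan, and the LLM is stubbed with a function that returns a fixed template.

```python
import re
from typing import Callable, Dict, List, Optional


def _template_matches(template: str, log: str) -> bool:
    """Return True if `log` fits `template`, treating each <*> as a wildcard."""
    literal_parts = [re.escape(part) for part in template.split("<*>")]
    pattern = "^" + "(.+?)".join(literal_parts) + "$"
    return re.match(pattern, log) is not None


class TemplateCache:
    """Hypothetical stand-in for the parser's template cache."""

    def __init__(self) -> None:
        self._exact: Dict[str, str] = {}   # raw log -> template
        self._templates: List[str] = []    # all known templates

    def lookup(self, log: str) -> Optional[str]:
        # Step 1: hash-based exact match on the raw log string.
        if log in self._exact:
            return self._exact[log]
        # Step 2: similarity check, simplified here to a linear wildcard scan
        # (the real engine is described as using a tree-based check).
        for template in self._templates:
            if _template_matches(template, log):
                return template
        return None

    def store(self, log: str, template: str) -> None:
        self._exact[log] = template
        if template not in self._templates:
            self._templates.append(template)


def parse_with_cache(log: str, cache: TemplateCache,
                     query_llm: Callable[[str], str]) -> str:
    """Phase 1: cache lookup. Phase 2: query the model only on a miss."""
    template = cache.lookup(log)
    if template is None:
        template = query_llm(log)
        cache.store(log, template)
    return template


llm_calls: List[str] = []


def fake_llm(log: str) -> str:
    """Stub model call that records its input and returns a fixed template."""
    llm_calls.append(log)
    return "User <*> logged in from <*>"


cache = TemplateCache()
first = parse_with_cache("User admin logged in from 192.168.1.10", cache, fake_llm)
second = parse_with_cache("User bob logged in from 10.0.0.5", cache, fake_llm)
print(first, second, len(llm_calls))
```

In this sketch the second log is resolved by the wildcard scan, so the stub model is called only once.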
Variable slots in templates use the <*> wildcard notation (e.g. User <*> logged in from <*>). Extracted variables are written to output_["variables"] in order of appearance.
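To illustrate how variables line up with the wildcard slots, the sketch below turns a template into a regular expression and captures one value per <*>. The helper `extract_variables` is a hypothetical name used only for illustration; the library's own extraction logic may differ.

```python
import re
from typing import List, Optional

WILDCARD = "<*>"


def extract_variables(template: str, log: str) -> Optional[List[str]]:
    """Return the values filling each <*> slot, in order, or None on mismatch."""
    # Escape the literal text between wildcards, then join with capture groups.
    literal_parts = [re.escape(part) for part in template.split(WILDCARD)]
    pattern = "^" + "(.*?)".join(literal_parts) + "$"
    match = re.match(pattern, log)
    return list(match.groups()) if match else None


variables = extract_variables(
    "User <*> logged in from <*>",
    "User admin logged in from 192.168.1.10",
)
print(variables)  # ['admin', '192.168.1.10']
```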
Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| method_type | string | "logbatcher_parser" | Parser type identifier |
| model | string | "gpt-4o-mini" | Model name passed to the OpenAI-compatible endpoint |
| api_key | string | "" | API key for the chosen provider |
| base_url | string | "" | Base URL of the OpenAI-compatible endpoint. Leave empty to use the default OpenAI endpoint |
| batch_size | int | 10 | Maximum number of logs submitted per LLM call |
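The effect of batch_size as an upper bound can be pictured with a simple chunking helper. This is only a sketch of the "at most N logs per call" constraint: `iter_batches` is a hypothetical name, and how the engine actually groups logs before a call is not described here.

```python
from typing import Iterator, List


def iter_batches(logs: List[str], batch_size: int = 10) -> Iterator[List[str]]:
    """Yield consecutive slices of `logs`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(logs), batch_size):
        yield logs[start:start + batch_size]


pending = [f"Connection {i} closed" for i in range(25)]
batch_sizes = [len(batch) for batch in iter_batches(pending, batch_size=10)]
print(batch_sizes)  # [10, 10, 5]
```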
Example YAML fragment (OpenAI):
```yaml
parsers:
  LogBatcherParser:
    method_type: logbatcher_parser
    params:
      model: "gpt-4o-mini"
      api_key: "<YOUR_API_KEY>"
      batch_size: 10
```
Example YAML fragment (local Ollama):
```yaml
parsers:
  LogBatcherParser:
    method_type: logbatcher_parser
    params:
      model: "llama3"
      api_key: "ollama"
      base_url: "http://localhost:11434/v1"
      batch_size: 10
```
Usage examples
Basic usage — parse a raw log and read the inferred template:
```python
from detectmatelibrary.parsers.logbatcher import LogBatcherParser, LogBatcherParserConfig
import detectmatelibrary.schemas as schemas

config = LogBatcherParserConfig(
    api_key="<YOUR_API_KEY>",
    model="gpt-4o-mini",
    batch_size=10,
)
parser = LogBatcherParser(name="LogBatcherParser", config=config)

input_log = schemas.LogSchema({
    "logID": "1",
    "log": "User admin logged in from 192.168.1.10",
})
output = schemas.ParserSchema()
parser.parse(input_log, output)

print(output["template"])   # e.g. "User <*> logged in from <*>"
print(output["variables"])  # e.g. ["admin", "192.168.1.10"]
print(output["EventID"])    # integer index assigned by the cache
```
Using a local Ollama instance:
```python
# Reuses the imports from the basic usage example.
config = LogBatcherParserConfig(
    api_key="ollama",
    model="llama3",
    base_url="http://localhost:11434/v1",
    batch_size=10,
)
parser = LogBatcherParser(name="LogBatcherParser", config=config)
```
Passing config as a dict:
```python
parser = LogBatcherParser(config={
    "method_type": "logbatcher_parser",
    "api_key": "<YOUR_API_KEY>",
    "model": "gpt-4o-mini",
    "batch_size": 10,
})
```
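When the config is a plain dict, the API key can be read from the environment instead of being written into source. The sketch below only builds the dict. `build_parser_config` and the variable name `LOGBATCHER_API_KEY` are assumptions made for illustration, not part of the library.

```python
import os
from typing import Any, Dict


def build_parser_config(env_var: str = "LOGBATCHER_API_KEY",
                        base_url: str = "") -> Dict[str, Any]:
    """Build a config dict, taking the API key from an environment variable."""
    config: Dict[str, Any] = {
        "method_type": "logbatcher_parser",
        # Falls back to the documented default ("") when the variable is unset.
        "api_key": os.environ.get(env_var, ""),
        "model": "gpt-4o-mini",
        "batch_size": 10,
    }
    if base_url:
        # Only set base_url for non-default, OpenAI-compatible endpoints.
        config["base_url"] = base_url
    return config


parser_config = build_parser_config()
print(sorted(parser_config))
```

The resulting dict has the same keys as the example above and could be passed as config in the same way.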