Building a Log Analyzer CLI as a Zero-Dependency npm Package
Why I Built a Log Analyzer
Production logs are noisy. Thousands of lines of JSONL scroll past, and somewhere in there is a latency spike, an error pattern, or a service that started failing at 3 AM. Most log analysis tools are either cloud-hosted services that require shipping your data somewhere, or heavyweight installations that take longer to set up than the incident takes to resolve.
I wanted a tool I could pipe a file into and get answers. No cloud account, no database, no configuration files. Just a CLI that reads JSONL, computes statistics, detects anomalies, and produces a report. Zero runtime dependencies.
The result is @gagandeep023/log-analyzer: a TypeScript library and CLI that parses JSONL logs, computes latency percentiles and error rates, runs four anomaly detection algorithms, groups errors by pattern similarity, and outputs Markdown or JSON reports.
How It Works
JSONL Input
|
v
[Parser] -- validate fields, collect errors, sort by timestamp
|
v
[Aggregator] -- percentiles, error rates, timeline buckets
|
+--------+--------+--------+
| | | |
v v v v
[Error [Latency [Repeated [Status
Spikes] Outliers] Errors] Anomaly]
| | | |
+--------+--------+--------+
|
v
[Pattern Matcher] -- normalize + group error messages
|
v
[Report Generator] -- Markdown tables or JSON output

The pipeline is linear. Each stage takes the output of the previous one. The parser produces a sorted array of validated log entries. The aggregator computes statistics over those entries. The anomaly detector runs four independent algorithms. The pattern matcher groups errors by normalized message. The report generator formats everything into a readable document.
Percentile Computation
Percentiles are computed using exact sort-based interpolation. For a given percentile p and a sorted array of n values, the index is (p/100) * (n-1). If that index is not an integer, we interpolate between the two surrounding values. This gives accurate results for any dataset size.
function computePercentiles(values: number[], percentiles: number[]): number[] {
if (values.length === 0) return percentiles.map(() => 0);
const sorted = [...values].sort((a, b) => a - b);
return percentiles.map((p) => {
const index = (p / 100) * (sorted.length - 1);
const lower = Math.floor(index);
const upper = Math.ceil(index);
if (lower === upper) return sorted[lower];
const fraction = index - lower;
return sorted[lower] + fraction * (sorted[upper] - sorted[lower]);
});
}

This runs in O(n log n) for the sort, then O(k) for k percentiles. For the log sizes this tool handles (thousands to tens of thousands of entries), that is fast enough. For million-entry files, you would want a streaming approximate algorithm like t-digest, but exact percentiles are more useful for debugging when the data fits in memory.
Anomaly Detection: Four Algorithms
1. Error Spike Detection
The timeline is divided into one-minute buckets. For each bucket, we compute the error rate (errors / total requests). Then we compute the mean and standard deviation of all bucket error rates. Any bucket where the error rate exceeds mean + threshold * stddev is flagged as a spike.
The threshold is configurable (default: 2 standard deviations). A minimum bucket count prevents false positives on short log files. The severity is based on how far the spike exceeds the mean: 5x or more is critical, 3x is high, 2x is medium.
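The bucket math described above can be sketched in a few lines. This is an illustrative version only; the `ErrorBucket` shape and the function name are my assumptions, not the package's actual implementation.

```typescript
// Illustrative sketch of bucket-based error spike detection.
// ErrorBucket and detectErrorSpikes are hypothetical names.
interface ErrorBucket {
  start: number;  // bucket start time (ms since epoch)
  total: number;  // requests in this bucket
  errors: number; // failed requests in this bucket
}

function detectErrorSpikes(buckets: ErrorBucket[], threshold = 2): ErrorBucket[] {
  if (buckets.length === 0) return [];
  // Per-bucket error rates
  const rates = buckets.map((b) => (b.total === 0 ? 0 : b.errors / b.total));
  // Mean and standard deviation across all buckets
  const mean = rates.reduce((sum, r) => sum + r, 0) / rates.length;
  const variance =
    rates.reduce((sum, r) => sum + (r - mean) ** 2, 0) / rates.length;
  const stddev = Math.sqrt(variance);
  // Flag buckets whose error rate exceeds mean + threshold * stddev
  return buckets.filter((_, i) => rates[i] > mean + threshold * stddev);
}
```

With a quiet baseline and one bad minute, only the bad minute clears the mean-plus-two-sigma bar, which is exactly the behavior the minimum bucket count is there to protect on short files.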
2. Latency Outlier Detection
We compute p99 of all response times globally, then flag any request where responseTime exceeds p99 * multiplier (default: 1.5). Outliers are grouped by endpoint, so the report shows which endpoints are affected and how many requests were impacted. The severity depends on how far the maximum response time exceeds p99.
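A sketch of that check, reusing the same sort-based p99 interpolation as the percentile code above. The entry shape and function name are assumptions for illustration, not the published API.

```typescript
// Illustrative sketch of p99-based latency outlier grouping.
// RequestEntry and findLatencyOutliers are hypothetical names.
interface RequestEntry {
  endpoint: string;
  responseTime: number; // ms
}

function findLatencyOutliers(
  entries: RequestEntry[],
  multiplier = 1.5
): Map<string, RequestEntry[]> {
  const byEndpoint = new Map<string, RequestEntry[]>();
  if (entries.length === 0) return byEndpoint;
  // Exact p99 via sort-based interpolation
  const sorted = entries.map((e) => e.responseTime).sort((a, b) => a - b);
  const index = 0.99 * (sorted.length - 1);
  const lower = Math.floor(index);
  const upper = Math.ceil(index);
  const p99 =
    lower === upper
      ? sorted[lower]
      : sorted[lower] + (index - lower) * (sorted[upper] - sorted[lower]);
  const cutoff = p99 * multiplier;
  // Group outliers by endpoint so the report shows which routes are affected
  for (const e of entries) {
    if (e.responseTime > cutoff) {
      const list = byEndpoint.get(e.endpoint) ?? [];
      list.push(e);
      byEndpoint.set(e.endpoint, list);
    }
  }
  return byEndpoint;
}
```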
3. Repeated Error Detection
Error messages are grouped by exact match. Any message appearing more than 3 times is flagged. This catches systematic failures like database timeouts or upstream service errors that repeat across requests. The threshold of 3 avoids noise from one-off errors while catching patterns early.
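Exact-match grouping is a single pass over a `Map`. A minimal sketch, with the default threshold of 3 from the description above; the function name is illustrative.

```typescript
// Illustrative sketch of repeated-error detection by exact message match.
// findRepeatedErrors is a hypothetical name.
function findRepeatedErrors(
  messages: string[],
  minCount = 3
): Array<{ message: string; count: number }> {
  // Count occurrences of each exact message
  const counts = new Map<string, number>();
  for (const m of messages) counts.set(m, (counts.get(m) ?? 0) + 1);
  // Flag messages appearing more than minCount times
  return [...counts.entries()]
    .filter(([, count]) => count > minCount)
    .map(([message, count]) => ({ message, count }));
}
```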
4. Status Code Anomaly Detection
For each endpoint, we compute the 5xx error rate. If it exceeds the threshold (default: 10%) and there are at least 2 server errors, the endpoint is flagged. This catches cases where a specific route is failing while the rest of the service is healthy. Only 5xx codes are counted since 4xx errors are typically client-side issues.
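The per-endpoint 5xx check can be sketched as one aggregation pass. The entry shape and function name are illustrative; the defaults mirror the description above.

```typescript
// Illustrative sketch of per-endpoint 5xx anomaly detection.
// StatusEntry and findFailingEndpoints are hypothetical names.
interface StatusEntry {
  endpoint: string;
  statusCode: number;
}

function findFailingEndpoints(
  entries: StatusEntry[],
  rateThreshold = 0.1,   // flag endpoints above a 10% server error rate
  minServerErrors = 2    // require at least 2 server errors
): string[] {
  const totals = new Map<string, { total: number; serverErrors: number }>();
  for (const e of entries) {
    const stats = totals.get(e.endpoint) ?? { total: 0, serverErrors: 0 };
    stats.total += 1;
    // Only 5xx codes count; 4xx errors are typically client-side
    if (e.statusCode >= 500) stats.serverErrors += 1;
    totals.set(e.endpoint, stats);
  }
  const flagged: string[] = [];
  for (const [endpoint, s] of totals) {
    if (s.serverErrors >= minServerErrors && s.serverErrors / s.total > rateThreshold) {
      flagged.push(endpoint);
    }
  }
  return flagged;
}
```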
Pattern Matching
Error messages often contain variable data: user IDs, timestamps, IP addresses, request IDs. The pattern matcher normalizes these before grouping. UUIDs become <UUID>, ISO timestamps become <TIMESTAMP>, emails become <EMAIL>, IP addresses become <IP>, hex IDs become <HEX_ID>, and numeric IDs (4+ digits) become <ID>.
function normalizeMessage(message: string): string {
return message
.replace(UUID_REGEX, '<UUID>')
.replace(ISO_TIMESTAMP_REGEX, '<TIMESTAMP>')
.replace(EMAIL_REGEX, '<EMAIL>')
.replace(IP_REGEX, '<IP>')
.replace(HEX_ID_REGEX, '<HEX_ID>')
.replace(NUMERIC_ID_REGEX, '<ID>')
.trim();
}

After normalization, messages like "User a1b2c3d4-... not found" and "User e5f6a7b8-... not found" collapse into a single pattern: "User <UUID> not found". The output includes the original examples so you can see the actual values.
The CLI: Manual Argument Parsing
The CLI has no dependencies. Not even a flag parser. Arguments are parsed manually from process.argv. The tool supports two commands: analyze (full pipeline) and parse (validation only). Options include output format (json, markdown, or both), output directory, and a config file for tuning thresholds.
# Basic analysis
log-analyzer analyze access.jsonl
# JSON output to a directory
log-analyzer analyze access.jsonl --format json --output ./reports
# Both formats with custom thresholds
log-analyzer analyze access.jsonl --format both --config thresholds.json
# Validate a log file without analysis
log-analyzer parse access.jsonl

I considered using yargs or commander for argument parsing. But those are runtime dependencies, and the constraint was zero. The manual parser handles positional arguments, named options with values, and help text. It covers the use cases without adding 50KB to the package.
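The core of a manual parser fits in one loop over `process.argv`. This is a sketch in the spirit described above, not the package's actual implementation; the `CliArgs` shape is an assumption.

```typescript
// Illustrative sketch of manual argv parsing: one command, one positional
// file argument, and --name value option pairs. CliArgs is a hypothetical shape.
interface CliArgs {
  command: string;
  file?: string;
  options: Record<string, string>;
}

function parseArgs(argv: string[]): CliArgs {
  // argv here is process.argv.slice(2): [command, ...args]
  const [command, ...rest] = argv;
  const options: Record<string, string> = {};
  let file: string | undefined;
  for (let i = 0; i < rest.length; i++) {
    const arg = rest[i];
    if (arg.startsWith('--')) {
      // Named option with a value, e.g. --format json
      options[arg.slice(2)] = rest[++i] ?? '';
    } else if (file === undefined) {
      // First positional argument is the input file
      file = arg;
    }
  }
  return { command, file, options };
}
```

The real CLI also handles help text and validation of unknown flags, but the shape is the same: a linear scan with a lookahead for option values.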
Tradeoffs and Lessons
- Sort-based percentiles are simple and accurate but require O(n) memory. For very large files, a streaming algorithm would be better. The tradeoff is worth it for the target use case (files under 100MB).
- Standard deviation for spike detection assumes roughly normal distribution of error rates. Real traffic is often bursty. The configurable threshold lets users adjust sensitivity.
- Pattern matching is regex-based, not semantic. Two errors with different wording but the same root cause will not be grouped together. A more sophisticated approach would use edit distance or embeddings, but that adds complexity and dependencies.
- The CLI reads the entire file into memory. For files larger than available RAM, you would need streaming line-by-line parsing. Again, the target use case is debugging-scale files, not production log aggregation.
- Zero dependencies means reimplementing things that libraries handle well. The tradeoff is a 33KB package that installs in under a second and has no supply chain risk.
Try It
npm i @gagandeep023/log-analyzer

Use it as a CLI tool with npx, or import the analyze function in your own code. The library exports every individual module (parser, aggregator, anomaly detector, pattern matcher, report generator) so you can compose your own pipeline.
import fs from 'node:fs';
import { analyze } from '@gagandeep023/log-analyzer';

const content = fs.readFileSync('access.jsonl', 'utf-8');
const result = analyze(content, {
  bucketSizeMs: 60000,
  errorRateThreshold: 2,
  latencyOutlierMultiplier: 1.5,
});
console.log(result.summary);
console.log(`Found ${result.anomalies.length} anomalies`);

The interactive demo on this site runs the analysis on a sample log file with injected anomalies so you can see the output without installing anything.
Get more posts like this
I write about system design, backend engineering, and building npm packages from scratch. Follow along on Substack for new posts.
Subscribe on Substack →