Debugging Guide

This guide provides systematic approaches to debugging issues in the DataSuite ETL pipeline, from identifying problems to implementing solutions.

Debugging Methodology

1. Problem Identification

  • Define the issue: What exactly is not working?

  • Determine scope: Which services/components are affected?

  • Identify symptoms: Error messages, performance issues, data problems

  • Establish timeline: When did the issue start?

2. Systematic Investigation

  1. Check service health - Are all containers running? (see the sketch after this list)

  2. Review logs - What do the error messages indicate?

  3. Test connectivity - Can services communicate?

  4. Validate data flow - Is data moving through the pipeline?

  5. Monitor resources - Are there resource constraints?
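
A minimal first-pass sketch covering steps 1 and 3, assuming the container names and ports used throughout this guide (mysql, clickhouse, logstash):

#!/bin/bash
# Quick first pass: container status plus basic endpoint reachability
for svc in mysql clickhouse logstash; do
    if docker ps --format '{{.Names}}' | grep -qw "$svc"; then
        echo "$svc: running"
    else
        echo "$svc: NOT running"
    fi
done

# HTTP endpoints as configured in this stack
curl -s -o /dev/null -w "clickhouse http: %{http_code}\n" http://localhost:8123/ping
curl -s -o /dev/null -w "logstash api: %{http_code}\n" http://localhost:9600/_node/stats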

Service-Level Debugging

MySQL Debugging

Check Service Status:

# Container status
docker ps | grep mysql

# Process status inside container
docker exec mysql ps aux

# Port binding
docker port mysql

# Resource usage
docker stats mysql --no-stream

Database-Level Debugging:

# Connection test
docker exec mysql mysqladmin ping -uroot -ppassword

# Check error log (the official image logs to stderr unless a log file is configured)
docker exec mysql cat /var/log/mysql/error.log 2>/dev/null || docker logs mysql 2>&1 | tail -n 50

# Active connections
docker exec mysql mysql -uroot -ppassword -e "SHOW PROCESSLIST;"

# Per-table sizes within the database
docker exec mysql mysql -uroot -ppassword adventureworks -e "
SELECT 
    table_name,
    ROUND(((data_length + index_length) / 1024 / 1024), 2) AS size_in_mb
FROM information_schema.tables 
WHERE table_schema='adventureworks';"

ClickHouse Debugging

Service Health:

# Basic connectivity
curl -v http://localhost:8123/ping

# Authentication test
curl -u admin:clickhouse123 http://localhost:8123/ -d "SELECT version()"

# System information
curl -u admin:clickhouse123 http://localhost:8123/ -d "
SELECT 
    uptime() AS uptime_seconds,
    formatReadableSize((SELECT value FROM system.metrics WHERE metric = 'MemoryTracking')) AS memory_usage,
    (SELECT value FROM system.events WHERE event = 'Query') AS total_queries
FORMAT PrettyCompact"

Query Performance Analysis:

-- Slow queries
SELECT 
    query,
    query_duration_ms,
    read_rows,
    read_bytes,
    memory_usage,
    user
FROM system.query_log 
WHERE query_duration_ms > 1000 
    AND type = 'QueryFinish'
ORDER BY query_duration_ms DESC 
LIMIT 10;

-- Current running queries
SELECT 
    query_id,
    user,
    query,
    elapsed,
    read_rows,
    total_rows_approx
FROM system.processes;

LogStash Debugging

Pipeline Status:

# Node stats
curl -s http://localhost:9600/_node/stats | jq

# Pipeline-specific stats
curl -s http://localhost:9600/_node/stats/pipelines | jq

# Hot threads (performance issues)
curl -s http://localhost:9600/_node/hot_threads

Configuration Debugging:

# Test configuration syntax (a separate data path avoids clashing with the running instance)
docker exec logstash /usr/share/logstash/bin/logstash --config.test_and_exit \
    -f /usr/share/logstash/pipeline/ --path.data /tmp/logstash-config-test

# View loaded configuration
curl -s http://localhost:9600/_node/pipelines | jq

# Check plugin versions
docker exec logstash /usr/share/logstash/bin/logstash-plugin list --verbose

Data Flow Debugging

Trace Data Movement

Step 1: Source Data Verification

# Count records in MySQL
docker exec mysql mysql -uroot -ppassword adventureworks -e "
SELECT 
    'Sales.SalesOrderHeader' as table_name,
    COUNT(*) as record_count,
    MIN(ModifiedDate) as earliest_modified,
    MAX(ModifiedDate) as latest_modified
FROM Sales.SalesOrderHeader;"

Step 2: Ingestion Verification

# Check LogStash last run metadata
docker exec logstash cat /usr/share/logstash/.logstash_jdbc_last_run

# Monitor LogStash output in real-time
docker logs logstash -f | grep -E "(INSERT INTO|ERROR|WARN)"

Step 3: Destination Verification

# Count records in ClickHouse
curl -u admin:clickhouse123 http://localhost:8123/ -d "
SELECT 
    'bronze_layer.sales_orders' as table_name,
    COUNT(*) as record_count,
    MIN(_ingested_at) as earliest_ingested,
    MAX(_ingested_at) as latest_ingested
FROM bronze_layer.sales_orders
FORMAT PrettyCompact"

Data Quality Debugging

Missing Records Investigation:

# Compare record counts
MYSQL_COUNT=$(docker exec mysql mysql -uroot -ppassword adventureworks -e "SELECT COUNT(*) FROM Sales.SalesOrderHeader;" -s -N)
CH_COUNT=$(curl -s -u admin:clickhouse123 http://localhost:8123/ -d "SELECT COUNT(*) FROM bronze_layer.sales_orders FORMAT TabSeparated")

echo "MySQL: $MYSQL_COUNT records"
echo "ClickHouse: $CH_COUNT records"
echo "Difference: $((MYSQL_COUNT - CH_COUNT)) records"

Duplicate Detection:

-- Find duplicates in ClickHouse
SELECT 
    salesorderid,
    COUNT(*) as occurrences,
    MIN(_ingested_at) as first_ingested,
    MAX(_ingested_at) as last_ingested
FROM bronze_layer.sales_orders 
GROUP BY salesorderid 
HAVING COUNT(*) > 1 
ORDER BY occurrences DESC;

Performance Debugging

Resource Monitoring

Container Resource Usage:

# Real-time monitoring
docker stats

# Point-in-time resource checks inside the containers
docker exec mysql cat /proc/meminfo | grep MemAvailable
docker exec clickhouse cat /proc/loadavg

Database Performance Metrics:

-- ClickHouse performance metrics
SELECT 
    event,
    value,
    description
FROM system.events 
WHERE event LIKE '%Query%' 
    OR event LIKE '%Memory%'
    OR event LIKE '%Disk%'
ORDER BY value DESC;

Query Performance Analysis

Slow Query Analysis:

-- Analyze query patterns
SELECT 
    query_kind,
    count() as query_count,
    avg(query_duration_ms) as avg_duration,
    max(query_duration_ms) as max_duration,
    avg(memory_usage) as avg_memory
FROM system.query_log 
WHERE event_date >= today() - 1
GROUP BY query_kind
ORDER BY avg_duration DESC;

Network Debugging

Container Connectivity

Test Network Connectivity:

# Test from LogStash to MySQL
docker exec logstash nc -zv mysql 3306

# Test from LogStash to ClickHouse
docker exec logstash nc -zv clickhouse 8123

# If nc is missing from the image, bash's /dev/tcp is a workable fallback
docker exec logstash bash -c 'echo > /dev/tcp/mysql/3306 && echo "mysql:3306 open"'

# DNS resolution test
docker exec logstash nslookup mysql
docker exec logstash nslookup clickhouse

Network Configuration:

# Inspect Docker network
docker network inspect datasuite-network

# Check container network settings
docker inspect mysql | jq '.[0].NetworkSettings'
docker inspect clickhouse | jq '.[0].NetworkSettings'

Log Analysis Techniques

Structured Log Analysis

LogStash Log Patterns:

# Error patterns
docker logs logstash 2>&1 | grep -E "(ERROR|FATAL|Exception)"

# Pipeline execution patterns  
docker logs logstash 2>&1 | grep -E "(Pipeline started|Pipeline stopped|SELECT|INSERT)"

# Performance patterns
docker logs logstash 2>&1 | grep -E "(duration|took|seconds)"

ClickHouse Log Analysis:

# Error log analysis
docker exec clickhouse grep -E "(ERROR|Exception)" /var/log/clickhouse-server/clickhouse-server.log

# Query log analysis
docker exec clickhouse grep -E "(executeQuery|QueryFinish)" /var/log/clickhouse-server/clickhouse-server.log

Advanced Debugging Tools

Enable Debug Logging

LogStash Debug Mode:

# Add to logstash.conf
input {
  jdbc {
    # ... existing configuration
    sql_log_level => "debug"
  }
}

# stdout is an output plugin, so the debug dump belongs in the output block
output {
  if [debug] {
    stdout { codec => rubydebug }
  }
}

ClickHouse Debug Queries:

-- Stream server logs to the client at debug level
SET send_logs_level = 'debug';

-- Profile query execution
SET query_profiler_real_time_period_ns = 100000000;
SET query_profiler_cpu_time_period_ns = 100000000;

Creating Debug Scripts

Create debug-pipeline.sh:

#!/bin/bash

echo "=== DataSuite ETL Debug Report ==="
echo "Generated: $(date)"
echo

echo "=== Container Status ==="
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
echo

echo "=== Resource Usage ==="
docker stats --no-stream --format "table {{.Container}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}"
echo

echo "=== Data Counts ==="
echo -n "MySQL Sales Orders: "
docker exec mysql mysql -uroot -ppassword adventureworks -e "SELECT COUNT(*) FROM Sales.SalesOrderHeader;" -s -N 2>/dev/null || echo "ERROR"

echo -n "ClickHouse Bronze: "
curl -s -u admin:clickhouse123 http://localhost:8123/ -d "SELECT COUNT(*) FROM bronze_layer.sales_orders FORMAT TabSeparated" 2>/dev/null || echo "ERROR"

echo

echo "=== Recent Errors ==="
echo "MySQL Errors:"
docker logs mysql --tail 5 2>&1 | grep -i error || echo "No recent errors"

echo "ClickHouse Errors:"
docker logs clickhouse --tail 5 2>&1 | grep -i error || echo "No recent errors"

echo "LogStash Errors:"
docker logs logstash --tail 5 2>&1 | grep -i error || echo "No recent errors"

echo
echo "=== Pipeline Health ==="
if curl -s http://localhost:9600/_node/stats >/dev/null 2>&1; then
    echo "LogStash API: ✅ Accessible"
else
    echo "LogStash API: ❌ Not accessible"
fi

if curl -s -u admin:clickhouse123 http://localhost:8123/ping | grep -q "Ok"; then
    echo "ClickHouse API: ✅ Accessible"
else
    echo "ClickHouse API: ❌ Not accessible"
fi

echo "=== End Debug Report ==="

Debugging Best Practices

1. Systematic Approach

  • Start with the most recent changes

  • Work backwards through the pipeline

  • Test one component at a time

  • Document findings and solutions

2. Information Gathering

  • Collect logs from all affected services (see the sketch after this list)

  • Note exact error messages and timestamps

  • Capture system resource usage

  • Document steps to reproduce the issue
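
A small sketch that captures the log and resource snapshots in one pass, using the container names from this guide; the output directory name is arbitrary:

# Snapshot logs and resource usage into a timestamped directory
TS=$(date +%Y%m%d-%H%M%S)
mkdir -p "debug-$TS"
for svc in mysql clickhouse logstash; do
    docker logs "$svc" --since 1h > "debug-$TS/$svc.log" 2>&1
done
docker stats --no-stream > "debug-$TS/stats.txt"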

3. Hypothesis Testing

  • Form specific hypotheses about the cause

  • Test each hypothesis systematically

  • Make minimal changes to isolate variables

  • Verify fixes don't introduce new issues

4. Prevention

  • Implement comprehensive monitoring (e.g. the scheduled check sketched after this list)

  • Set up proactive alerting

  • Maintain detailed documentation

  • Regularly review and update configurations
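
A low-effort starting point is to schedule the debug-pipeline.sh script from the previous section via cron; the paths below are placeholders:

# crontab -e: run the debug report every 15 minutes, keep a rolling log
*/15 * * * * /opt/datasuite/debug-pipeline.sh >> /var/log/datasuite-debug.log 2>&1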

Common Debugging Scenarios

Scenario 1: Data Not Flowing

  1. Check LogStash pipeline status (a combined sketch of all five checks follows this list)

  2. Verify MySQL connectivity from LogStash

  3. Test SQL query manually

  4. Check ClickHouse accessibility

  5. Verify table schemas match
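
A sketch that runs the five checks in order, reusing the service names and credentials from earlier sections; adjust to your setup:

#!/bin/bash
set -e  # stop at the first failing check

echo "1. LogStash pipeline status"
curl -sf http://localhost:9600/_node/stats/pipelines > /dev/null && echo "   OK"

echo "2. MySQL reachable from LogStash"
docker exec logstash nc -zv mysql 3306

echo "3. Source query returns a count"
docker exec mysql mysql -uroot -ppassword adventureworks \
    -e "SELECT COUNT(*) FROM Sales.SalesOrderHeader;" -s -N

echo "4. ClickHouse reachable"
curl -sf -u admin:clickhouse123 http://localhost:8123/ping

echo "5. Destination schema (compare against the MySQL source)"
curl -sf -u admin:clickhouse123 http://localhost:8123/ \
    -d "DESCRIBE TABLE bronze_layer.sales_orders FORMAT PrettyCompact"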

Scenario 2: Performance Degradation

  1. Monitor resource usage trends

  2. Analyze slow query logs

  3. Check for data volume increases (see the volume sketch after this list)

  4. Review index usage

  5. Optimize configurations
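
For the volume check, a sketch of daily ingest counts, assuming the _ingested_at column used earlier in this guide:

# Daily ingest volume over the past week
curl -u admin:clickhouse123 http://localhost:8123/ -d "
SELECT 
    toDate(_ingested_at) AS day,
    COUNT(*) AS rows_ingested
FROM bronze_layer.sales_orders
WHERE _ingested_at >= now() - INTERVAL 7 DAY
GROUP BY day
ORDER BY day DESC
FORMAT PrettyCompact"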

Scenario 3: Data Quality Issues

  1. Compare source and destination counts

  2. Check for duplicate records

  3. Validate data transformations (see the spot-check sketch after this list)

  4. Review filter logic

  5. Test with smaller datasets
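
A spot-check sketch that pulls the same order from both sides for manual comparison; the ID below is a placeholder, and the MySQL column name SalesOrderID is assumed from the AdventureWorks schema:

# Compare one record end to end (substitute a real ID from your data)
ORDER_ID=43659

docker exec mysql mysql -uroot -ppassword adventureworks -e "
SELECT * FROM Sales.SalesOrderHeader WHERE SalesOrderID = $ORDER_ID\G"

curl -u admin:clickhouse123 http://localhost:8123/ -d "
SELECT * FROM bronze_layer.sales_orders
WHERE salesorderid = $ORDER_ID
FORMAT Vertical"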

This systematic approach to debugging will help you quickly identify and resolve issues in your DataSuite ETL pipeline.
