Common Issues

This guide covers frequently encountered problems in the DataSuite ETL pipeline and their solutions.

Service Connection Issues

MySQL Connection Problems

Issue: Access denied for user 'root'@'%'

# Solution: Reset MySQL password
docker exec mysql mysql -uroot -e "ALTER USER 'root'@'%' IDENTIFIED BY 'password';"
docker restart mysql

Issue: Can't connect to MySQL server

# Check if MySQL is running
docker ps | grep mysql

# Check MySQL logs
docker logs mysql --tail 50

# Test connection
docker exec mysql mysqladmin ping -uroot -ppassword

ClickHouse Access Issues

Issue: Authentication failed

# Test connection without auth
curl http://localhost:8123/ping

# Check user configuration
curl -u admin:clickhouse123 http://localhost:8123/ -d "SELECT user(), currentDatabase()"

Issue: Connection refused

# Check if ClickHouse is listening
docker exec clickhouse netstat -tuln | grep 8123

# Check ClickHouse logs
docker logs clickhouse --tail 50

LogStash Pipeline Issues

JDBC Connection Errors

Issue: No suitable driver found

# Verify JDBC driver is mounted
docker exec logstash ls -la /usr/share/logstash/drivers/

# Check LogStash can load the driver
docker exec logstash java -cp /usr/share/logstash/drivers/mysql-connector-java.jar com.mysql.cj.jdbc.Driver

Issue: Pipeline not executing

# Check pipeline configuration
curl -s http://localhost:9600/_node/stats/pipelines | jq

# View LogStash logs
docker logs logstash -f --tail 100

Data Quality Issues

Missing Data

Issue: Records not appearing in ClickHouse

# Check LogStash is reading from MySQL
docker logs logstash | grep "SELECT"

# Verify LogStash output
docker logs logstash | grep "INSERT INTO"

# Check ClickHouse for errors
curl -u admin:clickhouse123 http://localhost:8123/ -d "SELECT COUNT(*) FROM system.errors WHERE value > 0"

Duplicate Records

Issue: Duplicate data in bronze layer

-- Find duplicates in ClickHouse
SELECT 
    salesorderid,
    COUNT(*) as duplicate_count
FROM bronze_layer.sales_orders 
GROUP BY salesorderid 
HAVING COUNT(*) > 1 
ORDER BY duplicate_count DESC;

Solution: Implement deduplication in LogStash

filter {
  fingerprint {
    source => ["salesorderid", "modifieddate"]
    target => "[@metadata][fingerprint]"
  }
}

output {
  http {
    # Use fingerprint to prevent duplicates
    message => "INSERT INTO bronze_layer.sales_orders (*) VALUES (...) ON DUPLICATE KEY UPDATE ..."
  }
}

Performance Issues

Slow Query Performance

Issue: ClickHouse queries taking too long

-- Check query statistics
SELECT 
    query,
    query_duration_ms,
    memory_usage
FROM system.query_log 
WHERE query_duration_ms > 1000 
ORDER BY query_duration_ms DESC 
LIMIT 10;

Solutions:

  • Add appropriate indexes

  • Optimize table partitioning

  • Use PREWHERE instead of WHERE

  • Consider materialized views for frequent queries

Memory Issues

Issue: Containers running out of memory

# Monitor memory usage
docker stats --no-stream

# Check container limits
docker inspect mysql | grep -i memory

Solutions:

  • Increase Docker memory allocation

  • Optimize query batch sizes

  • Add memory limits to prevent OOM

DBT Issues

Model Compilation Errors

Issue: Compilation Error in model

# Debug DBT configuration
dbt debug

# Compile models to check for errors
dbt compile

# Run specific model
dbt run --models stg_sales_orders

Test Failures

Issue: DBT tests failing

# Run tests with verbose output
dbt test --store-failures

# Check failed test results
SELECT * FROM analytics.dbt_test__audit__unique_fact_sale_sale_key;

Network and Docker Issues

Port Conflicts

Issue: Port already in use

# Find process using port
lsof -i :3306

# Kill process
kill -9 <PID>

# Or change port mapping in docker-compose.yml

Container Communication

Issue: Services can't reach each other

# Check Docker network
docker network inspect datasuite-network

# Test connectivity between containers
docker exec logstash nc -zv mysql 3306

Troubleshooting Workflow

  1. Identify the layer where the issue occurs (source, ingestion, transformation, serving)

  2. Check service health using health check commands

  3. Review logs for error messages and patterns

  4. Test connectivity between services

  5. Validate data at each pipeline stage

  6. Monitor resources (CPU, memory, disk, network)

Getting Help

  • Check service logs first: docker logs <container-name>

  • Use health check endpoints: /ping, /health, /_node/stats

  • Monitor system resources: docker stats

  • Test individual components in isolation

  • Review configuration files for typos or incorrect values

Most issues can be resolved by systematically checking each component and its connections to other services.

Last updated

Was this helpful?