Service Verification

After completing your DataSuite ETL setup, use this guide to verify that all services are running correctly and can communicate with each other.

Quick Verification Commands

Run these commands to quickly check service health:

# 1. Check MySQL
docker exec mysql mysql -uroot -ppassword -e "SHOW DATABASES;"

# 2. Check ClickHouse  
curl -u admin:clickhouse123 http://localhost:8123/ping

# 3. Check Logstash
curl http://localhost:9600/_node/stats

# 4. Check Airflow (if installed)
curl http://localhost:8080/health

Expected Results:

  • MySQL: Lists the databases, including adventureworks

  • ClickHouse: Returns Ok.

  • Logstash: Returns node statistics as JSON

  • Airflow: Returns a health status JSON document

Detailed Service Verification

1. MySQL Database Verification

Check Database Status:
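A minimal sketch of this check, assuming the `mysql` container name and the root credentials from the quick verification commands. `mysqladmin ping` reports whether the server is accepting connections and prints `mysqld is alive` when it is. The function name is illustrative only.

```shell
# Assumption: container "mysql", user root, password "password",
# as used in the quick verification commands.
check_mysql_alive() {
    # Prints "mysqld is alive" when the server accepts connections.
    docker exec mysql mysqladmin -uroot -ppassword ping
}
```

Run it with `check_mysql_alive`. MySQL may also print a warning on stderr about passing the password on the command line.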

Expected Output: mysqld is alive

Verify AdventureWorks Data:
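One way to confirm that the sample data is loaded is to list the tables in the `adventureworks` schema with their row counts. The sketch below assumes the same container name and credentials as the quick verification commands, and the function name is hypothetical. For InnoDB tables, `table_rows` is an estimate, so use it as a sanity check and not as an exact count.

```shell
# Assumption: container "mysql", user root, password "password".
list_adventureworks_tables() {
    docker exec mysql mysql -uroot -ppassword -e \
        "SELECT table_name, table_rows
           FROM information_schema.tables
          WHERE table_schema = 'adventureworks'
          ORDER BY table_name;"
}
```

Run it with `list_adventureworks_tables`. An empty result suggests the schema exists but no tables were imported.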

Expected Output:

Test Sample Query:

2. ClickHouse Data Warehouse Verification

Check Service Health:

Expected Output: Ok.

Verify Database Creation:
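ClickHouse accepts SQL over its HTTP interface when the statement is sent as the request body, so the database list can be fetched with `curl`. This sketch assumes the `admin` / `clickhouse123` credentials and port 8123 from the quick verification commands. The helper name `clickhouse_query` is illustrative.

```shell
# Assumption: ClickHouse HTTP interface on localhost:8123,
# credentials admin / clickhouse123.
clickhouse_query() {
    # POST the SQL text as the request body.
    curl -sS -u admin:clickhouse123 'http://localhost:8123/' --data-binary "$1"
}
```

For example, `clickhouse_query 'SHOW DATABASES'` prints one database name per line.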

Expected Output:

Check Table Structure:

Test Query Performance:

3. Logstash Data Pipeline Verification

Check Pipeline Status:
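The Logstash monitoring API on port 9600 exposes a node info endpoint that lists the loaded pipelines and their settings. A minimal sketch, assuming the API is reachable on `localhost:9600` as in the quick verification commands (the function name is hypothetical):

```shell
# Assumption: Logstash monitoring API published on localhost:9600.
logstash_pipelines() {
    # ?pretty asks Logstash to indent the JSON response.
    curl -sS 'http://localhost:9600/_node/pipelines?pretty'
}
```

Run it with `logstash_pipelines`. A pipeline missing from the response usually means its configuration failed to load.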

Monitor Pipeline Activity:

Expected Log Patterns:

Verify Data Ingestion:

Test Incremental Loading:

4. Network Connectivity Verification

Test Inter-Service Communication:
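A rough way to test connectivity between containers is to open a TCP connection from inside one container to another service's hostname and port. The sketch below uses bash's `/dev/tcp` redirection, so it assumes the source container image ships `bash`. The container and host names in the usage example (`logstash`, `clickhouse`) are assumptions and may differ in your compose file.

```shell
# Usage: can_reach <from_container> <host> <port>
# Assumption: the source container has bash with /dev/tcp support.
can_reach() {
    local from="$1" host="$2" port="$3"
    # The exec fails (non-zero exit) if the connection cannot be opened.
    if docker exec "$from" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "OK   $from -> $host:$port"
    else
        echo "FAIL $from -> $host:$port"
        return 1
    fi
}
```

For example, `can_reach logstash mysql 3306` and `can_reach logstash clickhouse 8123` cover the two hops a pipeline from MySQL to ClickHouse would need.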

DNS Resolution Test:
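Name resolution inside a container can be checked with `getent hosts`, which uses the same resolver path as most applications in the container. This is a sketch under the assumption that the image provides `getent` (typical for glibc-based images). The container name `logstash` in the example is hypothetical.

```shell
# Usage: resolve_in_container <from_container> <hostname>
# Assumption: getent is available in the source container image.
resolve_in_container() {
    local from="$1" name="$2"
    # Prints "<ip> <hostname>" on success, exits non-zero if unresolved.
    docker exec "$from" getent hosts "$name"
}
```

For example, `resolve_in_container logstash mysql` should print the IP address Docker assigned to the `mysql` service.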

5. Data Quality Verification

Row Count Reconciliation:
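Reconciliation can be sketched as counting rows in a source table and in its warehouse counterpart, then comparing the two numbers. The function below assumes the MySQL and ClickHouse credentials from the quick verification commands. The function name and both table arguments are placeholders, since the actual table names depend on your pipeline.

```shell
# Usage: compare_row_counts <mysql_db.table> <clickhouse_db.table>
# Assumption: credentials and ports from the quick verification commands.
compare_row_counts() {
    local mysql_table="$1" ch_table="$2"
    local src dst
    # -N drops the column header, -B selects plain batch output.
    src=$(docker exec mysql mysql -uroot -ppassword -N -B \
            -e "SELECT COUNT(*) FROM ${mysql_table};")
    dst=$(curl -sS -u admin:clickhouse123 'http://localhost:8123/' \
            --data-binary "SELECT count() FROM ${ch_table}")
    if [ "$src" = "$dst" ]; then
        echo "MATCH    ${mysql_table}=${src} ${ch_table}=${dst}"
    else
        echo "MISMATCH ${mysql_table}=${src} ${ch_table}=${dst}"
        return 1
    fi
}
```

Call it as `compare_row_counts <source_db.table> <warehouse_db.table>` with your own names. A small mismatch shortly after a load may only mean the pipeline has not caught up yet, so rerun before investigating.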

Data Integrity Checks:

Sample Data Verification:

Performance Verification

Query Response Time Testing

ClickHouse Performance Test:
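A simple way to measure response time from the client side is curl's `time_total` write-out variable, which reports the seconds elapsed for the whole request. The sketch assumes the ClickHouse credentials and port from the quick verification commands. The function name and any query you pass in are your own choices.

```shell
# Usage: time_clickhouse_query '<SQL>'
# Assumption: ClickHouse HTTP interface on localhost:8123, admin / clickhouse123.
time_clickhouse_query() {
    # Discard the result body and print only the total request time in seconds.
    curl -sS -o /dev/null -w '%{time_total}\n' \
        -u admin:clickhouse123 'http://localhost:8123/' \
        --data-binary "$1"
}
```

The figure includes HTTP and network overhead, so it slightly overstates the server-side execution time.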

Expected Response Time: < 1 second for typical analytical queries

Resource Usage Monitoring

Container Resource Usage:
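`docker stats` reports CPU and memory per container. With `--no-stream` it prints a single snapshot instead of refreshing continuously. A minimal sketch (the function name is illustrative):

```shell
# Usage: container_usage [container ...]
# With no arguments, all running containers are listed.
container_usage() {
    docker stats --no-stream \
        --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}' "$@"
}
```

For example, `container_usage mysql` limits the snapshot to the MySQL container. A single snapshot can be misleading during a load, so take a few readings before comparing against the expected ranges.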

Expected Resource Usage:

  • MySQL: 5-10% CPU, 200-500 MB memory

  • ClickHouse: 10-20% CPU, 1-2 GB memory

  • Logstash: 5-15% CPU, 512 MB-1 GB memory

Web Interface Verification

ClickHouse Play Interface

  1. Navigate to http://localhost:8123/play

  2. Log in with username: admin, password: clickhouse123

  3. Run test query:

Airflow Web Interface (if installed)

  1. Navigate to http://localhost:8080

  2. Log in with username: admin, password: admin

  3. Verify DAGs are visible and scheduler is running

Automated Verification Script

Create scripts/verify-installation.sh for automated checks:
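The script body could look like the following sketch, which wraps the four quick verification commands in a small pass/fail runner. It assumes the same container name, ports, and credentials as those commands. The helper names `run_check` and `verify_all` are illustrative, not part of DataSuite ETL.

```shell
#!/usr/bin/env bash
# Sketch of scripts/verify-installation.sh (assumed layout, adjust as needed).

failures=0

# Run a command quietly and report PASS or FAIL under the given label.
run_check() {
    local label="$1"; shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        failures=$((failures + 1))
    fi
}

# Run every service check; returns non-zero if any check failed.
verify_all() {
    failures=0
    run_check "MySQL"      docker exec mysql mysql -uroot -ppassword -e "SHOW DATABASES;"
    # -f makes curl exit non-zero on HTTP error responses.
    run_check "ClickHouse" curl -fsS -u admin:clickhouse123 http://localhost:8123/ping
    run_check "Logstash"   curl -fsS http://localhost:9600/_node/stats
    # Remove the next line if Airflow is not installed.
    run_check "Airflow"    curl -fsS http://localhost:8080/health
    echo "Failed checks: $failures"
    [ "$failures" -eq 0 ]
}
```

Add `verify_all` as the last line of the script, make it executable with `chmod +x scripts/verify-installation.sh`, and use its exit status in CI or a Makefile target.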

Troubleshooting Failed Verifications

MySQL Connection Issues

ClickHouse Access Issues

Logstash Pipeline Issues

Next Steps

Once all verifications pass:

  1. Logstash Configuration - Customize data ingestion pipelines

  2. dbt Getting Started - Build your first data transformations

  3. Troubleshooting Guide - Reference for resolving issues

Verification Checklist

Core Services ✅

Data Pipeline ✅

Performance ✅

Access ✅

Your DataSuite ETL system is now verified and ready for development!
