Prerequisites

Before setting up the DataSuite ETL system, ensure your development environment meets the following requirements.

System Requirements

Hardware Requirements

  • RAM: 8GB minimum, 16GB recommended for full stack

  • Storage: 20GB free disk space for containers and data

  • CPU: 4+ cores recommended for optimal performance

  • Network: Stable internet connection for downloading container images

Operating System Support

  • Linux: Ubuntu 20.04+, CentOS 8+, RHEL 8+ (recommended)

  • macOS: macOS 10.15+ with Intel or Apple Silicon

  • Windows: Windows 10/11 with WSL2 enabled

Required Software

Docker Platform

Docker Engine: Version 20.10 or later

# Check Docker version
docker --version

# Expected output: Docker version 20.10.x or later

Docker Compose: Version 2.0 or later

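# Check Docker Compose version (Compose V2 runs as a Docker CLI plugin)
docker compose version

# Expected output: Docker Compose version v2.x.x or later
# With the standalone binary, use: docker-compose --version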

Git Version Control

Git: Version 2.30 or later

# Check Git version
git --version

# Expected output: git version 2.30.x or later
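The three version checks above can also be scripted. What follows is a minimal sketch rather than part of DataSuite: the helper names `version_ge`, `extract_version`, and `check_tool` are hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils and recent macOS releases do).

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Uses "sort -V" so that 2.9 is correctly treated as older than 2.30.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# extract_version TEXT: prints the first dotted version number found in TEXT.
extract_version() {
  printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# check_tool NAME MINIMUM COMMAND...: runs COMMAND and compares the version it
# reports against MINIMUM, printing a one-line status.
check_tool() {
  name=$1 min=$2
  shift 2
  if ! out=$("$@" 2>/dev/null); then
    echo "MISSING: $name"
    return 1
  fi
  found=$(extract_version "$out")
  if version_ge "$found" "$min"; then
    echo "OK: $name $found (minimum $min)"
  else
    echo "TOO OLD: $name $found (minimum $min)"
    return 1
  fi
}
```

For example, `check_tool "Docker Engine" 20.10 docker --version`, `check_tool "Docker Compose" 2.0 docker compose version`, and `check_tool "Git" 2.30 git --version` each print an OK, TOO OLD, or MISSING line for the corresponding requirement.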

Network Configuration

Port Requirements

The following ports must be available on your system:

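| Service           | Port | Protocol | Purpose              |
|-------------------|------|----------|----------------------|
| MySQL             | 3306 | TCP      | Database connections |
| ClickHouse HTTP   | 8123 | TCP      | Query interface      |
| ClickHouse Native | 9000 | TCP      | Native protocol      |
| Logstash          | 5044 | TCP      | Data ingestion       |
| Airflow Web UI    | 8080 | TCP      | Workflow management  |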

Port Conflict Check:

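# Check if ports are in use (Linux; use "ss -tulpn" if netstat is not installed)
sudo netstat -tulpn | grep -E ':(3306|8123|9000|5044|8080) '

# Check if ports are in use (macOS, where netstat does not accept these flags)
lsof -nP -iTCP -sTCP:LISTEN | grep -E ':(3306|8123|9000|5044|8080) '

# Any output means the listed port is already in use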
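To see exactly which of the required ports are taken, the listener output can be filtered per port. This is a minimal sketch: `REQUIRED_PORTS` and `ports_in_use` are hypothetical names, and the function assumes each input line carries the local address as a whitespace-separated field ending in `:<port>`, as the output of `ss -tln`, `netstat -tln`, and `lsof -nP -iTCP -sTCP:LISTEN` does.

```shell
# Ports taken from the table above
REQUIRED_PORTS="3306 8123 9000 5044 8080"

# ports_in_use: reads listener lines on stdin and prints each required port
# that appears as the port of some address field (one port per line).
ports_in_use() {
  awk -v ports="$REQUIRED_PORTS" '
    BEGIN { n = split(ports, want, " ") }
    {
      for (f = 1; f <= NF; f++) {
        for (i = 1; i <= n; i++) {
          # Anchor at end of field so that :33060 does not count as :3306
          if ($f ~ (":" want[i] "$")) found[want[i]] = 1
        }
      }
    }
    END {
      for (i = 1; i <= n; i++) if (want[i] in found) print want[i]
    }
  '
}
```

On Linux, run `ss -tln | ports_in_use`; on macOS, run `lsof -nP -iTCP -sTCP:LISTEN | ports_in_use`. No output means all five ports are free.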

Firewall Configuration

  • Allow inbound connections on the required ports for local development

  • Allow the Docker daemon outbound access to external container registries (for example, Docker Hub)

  • Ensure the firewall does not block traffic between Docker containers

Recommended Tools

Code Editor

  • VS Code with SQL and Docker extensions

  • IntelliJ IDEA with database plugins

  • Vim/Emacs with SQL syntax highlighting

Database Clients

  • DBeaver - Universal database client

  • MySQL Workbench - MySQL-specific client

  • ClickHouse Client - Native ClickHouse CLI

Terminal/Shell

  • Bash or Zsh with command completion

  • Windows PowerShell or WSL2 on Windows

Validation Checklist

Run through this checklist to ensure your environment is ready:

✅ Docker Validation

# Test Docker installation
docker run hello-world

# Expected: "Hello from Docker!" message

✅ Docker Compose Validation

# Test Docker Compose (V2 plugin syntax)
docker compose version

# Expected: Docker Compose version v2.x.x or later

✅ Resource Validation

# Check available memory (Linux/macOS)
free -h  # Linux
vm_stat | grep free  # macOS

# Check available disk space
df -h
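The raw numbers above can be compared against the hardware minimums automatically. The following is a Linux-oriented sketch under stated assumptions: the helper names `mem_total_gb`, `disk_free_gb`, and `check_min` are hypothetical, memory is read from `/proc/meminfo`, and values are rounded to whole GiB (an "8GB" machine usually reports slightly less than 8 GiB, so rounding to nearest avoids a false failure).

```shell
# mem_total_gb [FILE]: prints total RAM in GiB, rounded to the nearest integer,
# from a /proc/meminfo-style file (defaults to /proc/meminfo).
mem_total_gb() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 + 0.5 }' "${1:-/proc/meminfo}"
}

# disk_free_gb [PATH]: prints free space in whole GiB on the filesystem
# that holds PATH (defaults to the current directory).
disk_free_gb() {
  df -Pk "${1:-.}" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# check_min LABEL ACTUAL MINIMUM: prints OK or FAIL and returns non-zero on FAIL.
check_min() {
  if [ "$2" -ge "$3" ]; then
    echo "OK: $1 = $2 (minimum $3)"
  else
    echo "FAIL: $1 = $2 (minimum $3)"
    return 1
  fi
}
```

Typical calls are `check_min "RAM (GiB)" "$(mem_total_gb)" 8`, `check_min "Free disk (GiB)" "$(disk_free_gb)" 20`, and `check_min "CPU cores" "$(nproc)" 4`, which mirror the 8GB, 20GB, and 4-core figures listed under Hardware Requirements.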

✅ Network Validation

# Test internet connectivity for image downloads
docker pull hello-world

# Test local network (should show Docker bridge)
docker network ls

Common Issues and Solutions

Issue: Docker Permission Denied

Symptoms: permission denied while trying to connect to the Docker daemon socket

Solutions:

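# Linux: Add your user to the docker group, then start a shell with the new group
sudo usermod -aG docker $USER
newgrp docker
# Log out and back in for the change to apply to all sessions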

# Or run with sudo (not recommended for development)
sudo docker --version

Issue: Port Already in Use

Symptoms: port is already allocated errors during startup

Solutions:

# Find process using port (Linux/macOS)
lsof -i :3306

# Stop the process if it is safe to do so (use kill -9 only if it does not exit)
kill <PID>

# Or modify docker-compose.yml to use different ports

Issue: Insufficient Memory

Symptoms: Containers exit with OOMKilled status

Solutions:

  • Increase Docker Desktop memory allocation (Mac/Windows)

  • Close unnecessary applications

  • Reduce number of concurrent containers during development
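Before changing memory settings, it helps to confirm which containers were actually killed for running out of memory. This is a minimal sketch: `list_oom_killed` is a hypothetical helper name, and it relies on the `OOMKilled` flag that `docker inspect` reports in a container's state.

```shell
# list_oom_killed: prints the name of every exited container whose state
# has the OOMKilled flag set.
list_oom_killed() {
  docker ps -a --filter status=exited --format '{{.Names}}' |
  while read -r name; do
    # Redirect stdin so the inner command cannot consume the list of names
    killed=$(docker inspect --format '{{.State.OOMKilled}}' "$name" < /dev/null)
    if [ "$killed" = "true" ]; then
      echo "$name"
    fi
  done
}
```

Run `list_oom_killed` after a failed startup. To see how much memory the running containers use, `docker stats --no-stream` prints a one-time snapshot.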

Issue: WSL2 Configuration (Windows)

Symptoms: Docker commands not found or slow performance

Solutions:

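# Enable WSL2 integration in Docker Desktop (Settings > Resources > WSL Integration)
# Increase the WSL2 memory limit in .wslconfig. The file must live in your Windows
# user profile (%UserProfile%\.wslconfig); "~" points there only when these commands
# run from PowerShell, not from inside a WSL shell.
# Warning: ">" overwrites any existing .wslconfig
echo "[wsl2]" > ~/.wslconfig
echo "memory=8GB" >> ~/.wslconfig
# Restart WSL so the new limit takes effect
wsl --shutdown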

Performance Optimization

Docker Configuration

// Illustrative Docker Desktop resource settings (Mac/Windows): memory in MiB, disk size in bytes (48 GiB)
{
  "cpus": 4,
  "memory": 8192,
  "disk": {
    "size": 51539607552
  }
}

System Optimization

# Increase file watchers (Linux)
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf

# Increase the maximum number of memory-mapped areas per process (Linux)
echo vm.max_map_count=262144 | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
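After applying the settings, the running kernel values can be verified. This is a minimal sketch with hypothetical helper names (`sysctl_at_least`, `report_sysctl`); it only reads values through `sysctl -n` and changes nothing.

```shell
# sysctl_at_least KEY MIN: succeeds when the running kernel value of KEY
# is greater than or equal to MIN; returns 2 if KEY cannot be read.
sysctl_at_least() {
  current=$(sysctl -n "$1" 2>/dev/null) || return 2
  [ "$current" -ge "$2" ]
}

# report_sysctl KEY MIN: prints a one-line status for KEY.
report_sysctl() {
  if sysctl_at_least "$1" "$2"; then
    echo "OK: $1 >= $2"
  else
    echo "LOW: $1 (want >= $2)"
  fi
}
```

Calling `report_sysctl fs.inotify.max_user_watches 524288` and `report_sysctl vm.max_map_count 262144` checks the two values set above.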

Next Steps

Once your prerequisites are met:

  1. Quick Start Setup - Get running fast with Docker Compose

  2. Advanced Setup - Custom configuration with individual containers

  3. Service Verification - Confirm everything is working correctly

Getting Help

If you encounter issues during prerequisite setup:

  • Docker Issues: Check Docker documentation

  • System Issues: Consult your operating system documentation

  • Network Issues: Check firewall and antivirus settings

  • Hardware Issues: Verify system meets minimum requirements

The setup process becomes much smoother when prerequisites are properly configured, so take time to validate each requirement before proceeding.
