Apache Kafka 3.7.2 (Amazon Linux 2023) AMI Administrator Guide
1. Quick Start Information
Connection Methods:
- Access the instance via SSH using the `ec2-user` user. Use `sudo` to run commands requiring root privileges. To switch to the root user, use `sudo su - root`.
Install Information:
- OS: Amazon Linux 2023
- Kafka version: 3.7.2
- Scala version: 2.13
- Java: Amazon Corretto 17 (AWS-optimized OpenJDK 17)
- Java Home: `/usr/lib/jvm/java-17-amazon-corretto`
- Mode: KRaft (no ZooKeeper required)
- Install Directory: `/opt/kafka` (symlink → `/opt/kafka_2.13-3.7.2`)
- Service User: `kafka` (system user, no login shell)
- Default Port: 9092
Kafka Service Management:
- Start Kafka service: `sudo systemctl start kafka`
- Stop Kafka service: `sudo systemctl stop kafka`
- Restart Kafka service: `sudo systemctl restart kafka`
- Check Kafka status: `sudo systemctl status kafka`
- Enable auto-start: `sudo systemctl enable kafka`
Quick Verification Commands:
- Check Kafka version: `/opt/kafka/bin/kafka-topics.sh --version`
- Check Java version: `java -version`
- View Kafka logs: `sudo journalctl -u kafka -f`
Firewall Configuration:
- Allow SSH on port 22.
- Allow Kafka port 9092 only if external clients or applications need to connect.
- For security, restrict both ports to trusted IP ranges.
2. First Launch & Verification
Step 1: Connect to Your Instance
- Launch your instance in your cloud provider's console (e.g., AWS EC2)
- Ensure SSH port 22 is allowed in your security group
- Connect via SSH:
ssh -i your-key.pem ec2-user@YOUR_PUBLIC_IP
Step 2: Verify Java Installation
Check Amazon Corretto 17:
java -version
Expected Output:
openjdk version "17.0.x" 2024-xx-xx LTS
OpenJDK Runtime Environment Corretto-17.x.x.x (build 17.0.x+x-LTS)
OpenJDK 64-Bit Server VM Corretto-17.x.x.x (build 17.0.x+x-LTS, mixed mode, sharing)
Confirm Corretto-17 is shown in the output.
Step 3: Verify Kafka Service Status
Check if Kafka daemon is running:
sudo systemctl status kafka --no-pager
Expected Output:
● kafka.service - Apache Kafka 3.7.2 Server (KRaft Mode)
Loaded: loaded (/etc/systemd/system/kafka.service; enabled; preset: disabled)
Active: active (running) since ...
Main PID: xxxx (java)
Step 4: Verify Kafka Functionality
List available topics:
/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
Expected Output:
aws-marketplace-test
Create a new test topic:
/opt/kafka/bin/kafka-topics.sh --create \
--topic my-test-topic \
--bootstrap-server localhost:9092 \
--partitions 1 \
--replication-factor 1
Expected Output:
Created topic my-test-topic.
Step 5: Test Produce and Consume Messages
Open a producer in one terminal:
/opt/kafka/bin/kafka-console-producer.sh \
--topic my-test-topic \
--bootstrap-server localhost:9092
Type a message and press Enter, then Ctrl+C to exit.
Open a consumer in another terminal:
/opt/kafka/bin/kafka-console-consumer.sh \
--topic my-test-topic \
--bootstrap-server localhost:9092 \
--from-beginning
Expected Output: The message you typed should appear.
3. Architecture & Detailed Configuration
This AMI runs Apache Kafka 3.7.2 in KRaft mode (Kafka Raft Metadata mode), which eliminates the dependency on Apache ZooKeeper. KRaft mode is the modern, recommended way to run Kafka and has been production-ready since Kafka 3.3.
Installation Architecture:
[Amazon Corretto 17]
↓
[Kafka 3.7.2 (Scala 2.13)]
/opt/kafka_2.13-3.7.2/ ← actual directory
/opt/kafka/ ← symlink (used in all configs)
↓
[KRaft Mode - No ZooKeeper]
/opt/kafka/config/kraft/server.properties → KRaft configuration
↓
[Cluster ID + Formatted Storage]
kafka-storage.sh format → initializes data directory
↓
[Systemd Service]
/etc/systemd/system/kafka.service → Auto-start on boot
↓
[Service User: kafka]
No login shell → runs with minimal privileges
Key Design Decisions:
- KRaft Mode: Eliminates ZooKeeper dependency — simpler architecture, fewer components to manage
- Amazon Corretto 17: AWS-optimized JDK with long-term support, explicit path set in service file to prevent version drift
- Symlink Strategy: `/opt/kafka` → `/opt/kafka_2.13-3.7.2` allows version upgrades without changing configs or service files
- Dedicated `kafka` User: Service runs as a restricted system user (no login shell) for security
- Heap Tuning: 1G heap (`-Xmx1G -Xms1G`) optimized for t3.small/t3.medium instances
Why KRaft Over ZooKeeper?
| Feature | ZooKeeper Mode | KRaft Mode |
|---|---|---|
| Components | Kafka + ZooKeeper | Kafka only |
| Complexity | High | Low |
| Metadata storage | ZooKeeper | Kafka itself |
| Production ready | Legacy | Recommended (3.3+) |
| Future support | Deprecated | Active development |
3.1. Systemd Service File
File Location: /etc/systemd/system/kafka.service
Complete Contents:
[Unit]
Description=Apache Kafka 3.7.2 Server (KRaft Mode)
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target
After=network.target
[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/java-17-amazon-corretto"
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/kraft/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure
[Install]
WantedBy=multi-user.target
How This Works:
- `Environment="JAVA_HOME=..."`: Explicitly pins the Corretto 17 path — prevents issues if multiple JDKs are installed or system defaults change
- `Environment="KAFKA_HEAP_OPTS=..."`: Sets JVM heap to 1G min/max — eliminates heap resizing overhead; suitable for t3.small/medium
- `User=kafka` / `Group=kafka`: Runs as a dedicated system user for security isolation
- `Restart=on-failure`: Automatically restarts Kafka if it crashes unexpectedly
- `Type=simple`: systemd treats the first process as the main process (suitable for Kafka's foreground mode)
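Rather than editing the unit file in place, a heap change can also be made with a systemd drop-in, which keeps the original unit intact. A minimal sketch (the 2G value is only an example; `systemctl edit` creates the drop-in file for you):

```ini
# Created via: sudo systemctl edit kafka
# (written to /etc/systemd/system/kafka.service.d/override.conf)
[Service]
# For the same variable, the last Environment= assignment wins
Environment="KAFKA_HEAP_OPTS=-Xmx2G -Xms2G"
```

Run `sudo systemctl daemon-reload && sudo systemctl restart kafka` after creating the file manually.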
3.2. KRaft Configuration File
File Location: /opt/kafka/config/kraft/server.properties
Key Settings:
# The role of this server. KRaft mode uses 'broker,controller' for combined mode
process.roles=broker,controller
# The node id for this server
node.id=1
# The connect string for the KRaft controller quorum
controller.quorum.voters=1@localhost:9093
# Listeners
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://localhost:9092
# Log directories
log.dirs=/tmp/kraft-combined-logs
How This Works:
- `process.roles=broker,controller`: Combined mode — a single node acts as both broker and controller
- `controller.quorum.voters`: Defines the KRaft controller quorum (cluster membership)
- `listeners`: Port 9092 for client connections, port 9093 for internal KRaft communication
- `log.dirs`: Where Kafka stores message data (formatted during setup)
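The default `advertised.listeners=PLAINTEXT://localhost:9092` only works for clients running on the instance itself. For external clients, the advertised address must be one they can reach. A minimal sketch (the hostname below is a placeholder for your instance's public DNS name; note this is still an unauthenticated PLAINTEXT listener, so restrict access at the security group):

```properties
# /opt/kafka/config/kraft/server.properties (sketch; hostname is a placeholder)
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
# Address handed to clients in metadata responses; must be reachable by them
advertised.listeners=PLAINTEXT://ec2-203-0-113-10.compute-1.amazonaws.com:9092
```

After editing, restart with `sudo systemctl restart kafka`.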
4. How-To-Create: Reproduce This Environment
This section explains how this AMI was built, allowing you to reproduce the installation on any Amazon Linux 2023 system.
Step 1: Update the System
Purpose: Ensure a clean, up-to-date base before installing software.
sudo dnf update -y
Step 2: Install Amazon Corretto 17
Purpose: Install AWS's production-grade, long-term-supported OpenJDK distribution.
sudo dnf install java-17-amazon-corretto-devel -y
How This Works:
- `java-17-amazon-corretto-devel`: Installs the full JDK (compiler + runtime), not just the JRE
- Amazon Corretto 17 receives security patches from AWS through at least 2029
- Installs to `/usr/lib/jvm/java-17-amazon-corretto/`
Verify:
java -version
Expected Output (example):
openjdk version "17.0.13" 2024-10-15 LTS
OpenJDK Runtime Environment Corretto-17.0.13.11.1 (build 17.0.13+11-LTS)
OpenJDK 64-Bit Server VM Corretto-17.0.13.11.1 (build 17.0.13+11-LTS, mixed mode, sharing)
Step 3: Download and Extract Kafka
Purpose: Obtain the official Kafka 3.7.2 binary distribution.
wget https://downloads.apache.org/kafka/3.7.2/kafka_2.13-3.7.2.tgz
sudo tar -xzf kafka_2.13-3.7.2.tgz -C /opt
How This Works:
- `kafka_2.13-3.7.2.tgz`: Kafka built with Scala 2.13 (the current stable Scala version)
- Extracted to `/opt/kafka_2.13-3.7.2/`, containing all binaries, configs, and scripts
Step 4: Create Symbolic Link
Purpose: Create a stable /opt/kafka path that remains constant across version upgrades.
sudo ln -s /opt/kafka_2.13-3.7.2 /opt/kafka
How This Works:
When upgrading Kafka in the future:
- Extract the new version: `sudo tar -xzf kafka_2.13-3.8.0.tgz -C /opt`
- Update the symlink: `sudo ln -sfn /opt/kafka_2.13-3.8.0 /opt/kafka`
- Restart the service: `sudo systemctl restart kafka`
No changes are needed to the service file or any scripts.
Step 5: Create the kafka System User
Purpose: Run Kafka as a dedicated, unprivileged system user for security.
sudo useradd -r -s /bin/false kafka
How This Works:
- `-r`: Creates a system account (lower UID range, no home directory by default)
- `-s /bin/false`: No login shell — the user cannot log in interactively
- This limits the blast radius if the Kafka process is compromised
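The effect of these flags can be confirmed from the account's `/etc/passwd` entry. A minimal sketch using an illustrative sample entry (UID/GID vary per system); on the instance you would obtain the real line with `getent passwd kafka`:

```shell
# Illustrative sample; on the instance: entry=$(getent passwd kafka)
entry='kafka:x:992:992::/home/kafka:/bin/false'

# Field 7 of a passwd entry is the login shell
shell_field=$(echo "$entry" | cut -d: -f7)

if [ "$shell_field" = "/bin/false" ] || [ "$shell_field" = "/sbin/nologin" ]; then
  echo "kafka user has no interactive login shell"
fi
```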
Step 6: Set File Ownership
Purpose: Give the kafka user ownership of all Kafka files.
sudo chown -R kafka:kafka /opt/kafka_2.13-3.7.2
sudo chown -h kafka:kafka /opt/kafka
How This Works:
- `-R`: Recursively sets ownership on all files in the Kafka directory
- `-h`: Sets ownership on the symlink itself (not its target)
Step 7: Initialize KRaft Storage
Purpose: Generate a unique Cluster ID and format the storage directory for KRaft mode.
sudo -u kafka bash << 'EOF'
KAFKA_CLUSTER_ID=$(/opt/kafka/bin/kafka-storage.sh random-uuid)
echo "Cluster ID: $KAFKA_CLUSTER_ID"
/opt/kafka/bin/kafka-storage.sh format \
-t $KAFKA_CLUSTER_ID \
-c /opt/kafka/config/kraft/server.properties
EOF
How This Works:
- `sudo -u kafka bash`: Executes the block as the `kafka` user — ensures the storage files are owned correctly
- `kafka-storage.sh random-uuid`: Generates a globally unique cluster identifier
- `kafka-storage.sh format`: Writes the cluster metadata to the log directory defined in `server.properties`
Expected Output:
Cluster ID: <uuid>
Formatting /tmp/kraft-combined-logs with metadata.version 3.7-IV4.
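If the Cluster ID is needed again later (for example, to format storage on an additional node with the same ID), it can be read back from `meta.properties` in the data directory. A minimal sketch; the file contents below are an illustrative sample, and on the instance the real file is `/tmp/kraft-combined-logs/meta.properties`:

```shell
# Illustrative sample of meta.properties contents; on the instance:
#   meta=$(cat /tmp/kraft-combined-logs/meta.properties)
meta='cluster.id=MkU3OEVBNTcwNTJENDM2Qk
node.id=1
version=1'

# Pull the value out of the cluster.id=... line
cluster_id=$(echo "$meta" | grep '^cluster.id=' | cut -d= -f2)
echo "Cluster ID: $cluster_id"
```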
Step 8: Create Systemd Service
Purpose: Register Kafka as a system service that starts automatically on boot.
sudo tee /etc/systemd/system/kafka.service > /dev/null << 'EOF'
[Unit]
Description=Apache Kafka 3.7.2 Server (KRaft Mode)
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target
After=network.target
[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/java-17-amazon-corretto"
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/kraft/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now kafka
How This Works:
- `daemon-reload`: Tells systemd to re-read all service unit files
- `enable --now`: Enables the service for auto-start and starts it immediately in one command
Step 9: Verify Installation
# Check service status
sudo systemctl status kafka --no-pager
# Create test topic
/opt/kafka/bin/kafka-topics.sh --create \
--topic aws-marketplace-test \
--bootstrap-server localhost:9092 \
--partitions 1 \
--replication-factor 1
# List topics
/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
Expected Output:
Active: active (running) ...
Created topic aws-marketplace-test.
aws-marketplace-test
5. Using the Kafka Environment
5.1. Topic Management
# Create a topic
/opt/kafka/bin/kafka-topics.sh --create \
--topic my-topic \
--bootstrap-server localhost:9092 \
--partitions 3 \
--replication-factor 1
# Describe a topic
/opt/kafka/bin/kafka-topics.sh --describe \
--topic my-topic \
--bootstrap-server localhost:9092
# Delete a topic
/opt/kafka/bin/kafka-topics.sh --delete \
--topic my-topic \
--bootstrap-server localhost:9092
# List all topics
/opt/kafka/bin/kafka-topics.sh --list \
--bootstrap-server localhost:9092
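There is no single right value for `--partitions`; a common rough starting point is target throughput divided by measured per-partition throughput. A minimal sketch of that heuristic (an illustrative rule of thumb, not an official formula):

```shell
# Ceiling of target / per-partition throughput, with a floor of 1 partition.
# Both inputs are MB/s; per-partition throughput should come from your own
# benchmarks (e.g. with kafka-producer-perf-test.sh).
partitions_needed() {
  local target=$1 per_partition=$2
  local p=$(( (target + per_partition - 1) / per_partition ))
  if [ "$p" -lt 1 ]; then p=1; fi
  echo "$p"
}

partitions_needed 50 10   # prints 5
```

More partitions allow more consumer parallelism, but each partition adds broker overhead, so avoid massively over-partitioning a single-node setup.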
5.2. Producing and Consuming Messages
Produce messages:
/opt/kafka/bin/kafka-console-producer.sh \
--topic my-topic \
--bootstrap-server localhost:9092
Type messages and press Enter for each. Use Ctrl+C to exit.
Consume messages:
# Consume from the beginning
/opt/kafka/bin/kafka-console-consumer.sh \
--topic my-topic \
--bootstrap-server localhost:9092 \
--from-beginning
# Consume only new messages
/opt/kafka/bin/kafka-console-consumer.sh \
--topic my-topic \
--bootstrap-server localhost:9092
5.3. Consumer Groups
# List consumer groups
/opt/kafka/bin/kafka-consumer-groups.sh \
--list \
--bootstrap-server localhost:9092
# Describe a consumer group (check lag)
/opt/kafka/bin/kafka-consumer-groups.sh \
--describe \
--group my-group \
--bootstrap-server localhost:9092
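For alerting, the LAG column of the `--describe` output can be reduced to one total-lag number. A minimal sketch over an illustrative sample of the output (the column layout matches the tool, but the numbers are made up):

```shell
# On the instance, pipe the real command instead of the sample:
#   /opt/kafka/bin/kafka-consumer-groups.sh --describe --group my-group \
#       --bootstrap-server localhost:9092
sample='GROUP     TOPIC     PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
my-group  my-topic  0          100             120             20
my-group  my-topic  1          95              100             5
my-group  my-topic  2          80              80              0'

# LAG is the 6th whitespace-separated column; NR > 1 skips the header row
total_lag=$(echo "$sample" | awk 'NR > 1 { sum += $6 } END { print sum + 0 }')
echo "total lag: $total_lag"   # prints: total lag: 25
```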
5.4. Monitoring
# View broker metadata
/opt/kafka/bin/kafka-metadata-quorum.sh \
--bootstrap-server localhost:9092 describe --status
# Check cluster information
/opt/kafka/bin/kafka-broker-api-versions.sh \
--bootstrap-server localhost:9092
6. Important File Locations
| File Path | Purpose |
|---|---|
| /opt/kafka | Kafka installation symlink |
| /opt/kafka_2.13-3.7.2/ | Actual Kafka installation directory |
| /opt/kafka/bin/ | Kafka shell scripts (kafka-topics.sh, etc.) |
| /opt/kafka/config/kraft/server.properties | KRaft mode configuration |
| /opt/kafka/config/kraft/ | KRaft configuration directory |
| /tmp/kraft-combined-logs/ | Kafka data directory (messages + metadata) |
| /etc/systemd/system/kafka.service | Systemd service file |
| /usr/lib/jvm/java-17-amazon-corretto/ | Amazon Corretto 17 Java home |
7. Troubleshooting
Issue 1: Kafka Service Fails to Start
Symptoms:
$ sudo systemctl status kafka
Active: failed (Result: exit-code)
Diagnosis:
View detailed logs:
sudo journalctl -u kafka -n 50 --no-pager
Common Causes:
- Storage not formatted (KRaft mode requires initialization):
sudo -u kafka bash << 'EOF'
KAFKA_CLUSTER_ID=$(/opt/kafka/bin/kafka-storage.sh random-uuid)
/opt/kafka/bin/kafka-storage.sh format \
-t $KAFKA_CLUSTER_ID \
-c /opt/kafka/config/kraft/server.properties
EOF
sudo systemctl start kafka
- Wrong JAVA_HOME path:
ls /usr/lib/jvm/
Update the Environment="JAVA_HOME=..." line in /etc/systemd/system/kafka.service to match the actual path, then reload:
sudo systemctl daemon-reload
sudo systemctl start kafka
- Port 9092 already in use:
sudo lsof -i :9092
Issue 2: Cannot Connect to Kafka
Symptoms:
$ /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
[ERROR] ... Connection refused
Diagnosis:
Check if Kafka is running:
sudo systemctl status kafka
Check if port 9092 is listening:
sudo ss -tlnp | grep 9092
Solution:
Start the service if not running:
sudo systemctl start kafka
Issue 3: Out of Memory Error
Symptoms:
java.lang.OutOfMemoryError: Java heap space
Diagnosis:
Check current heap setting:
grep KAFKA_HEAP_OPTS /etc/systemd/system/kafka.service
Solution:
Increase heap size in the service file (for larger instances):
sudo nano /etc/systemd/system/kafka.service
Change:
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
To (for t3.large or larger):
Environment="KAFKA_HEAP_OPTS=-Xmx2G -Xms2G"
Reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart kafka
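When choosing a new heap size, remember that Kafka leans heavily on the OS page cache, so the heap should stay well below total RAM. A minimal sketch of one common rule of thumb (an illustrative heuristic, not an official sizing guide): roughly half of RAM, capped at 6 GiB:

```shell
# Suggest a heap size in MiB from total system RAM in MiB.
# On the instance: total_mb=$(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo)
suggest_heap_mb() {
  local total_mb=$1
  local heap=$(( total_mb / 2 ))
  if [ "$heap" -gt 6144 ]; then heap=6144; fi   # cap: leave RAM for page cache
  if [ "$heap" -lt 512 ]; then heap=512; fi     # floor: minimum working heap
  echo "$heap"
}

suggest_heap_mb 4096    # t3.medium (4 GiB)  -> prints 2048
suggest_heap_mb 16384   # m5.xlarge (16 GiB) -> prints 6144
```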
Issue 4: Topics Not Persisting After Restart
Symptoms:
Topics created before a restart disappear.
Diagnosis:
Check the data directory:
ls /tmp/kraft-combined-logs/
Cause:
The default log.dirs=/tmp/kraft-combined-logs in server.properties uses the /tmp directory, which may be cleared on reboot.
Solution:
Change the log directory to a persistent path:
sudo mkdir -p /var/lib/kafka/data
sudo chown -R kafka:kafka /var/lib/kafka
sudo nano /opt/kafka/config/kraft/server.properties
Update:
log.dirs=/var/lib/kafka/data
Re-format storage with the new path (note: formatting initializes an empty cluster; topics and messages stored under the old directory are not migrated):
sudo systemctl stop kafka
sudo -u kafka bash << 'EOF'
KAFKA_CLUSTER_ID=$(/opt/kafka/bin/kafka-storage.sh random-uuid)
/opt/kafka/bin/kafka-storage.sh format \
-t $KAFKA_CLUSTER_ID \
-c /opt/kafka/config/kraft/server.properties
EOF
sudo systemctl start kafka
8. Final Notes
Key Takeaways
- Kafka 3.7.2 running in KRaft mode — no ZooKeeper required
- Amazon Corretto 17 as the Java runtime — AWS-optimized with long-term support
- Symlink strategy (`/opt/kafka`) enables easy version upgrades
- Dedicated `kafka` user for security isolation
- The installation is production-ready and AMI-optimized with auto-start enabled
Kafka Use Cases
- Event Streaming: Real-time data pipelines between services
- Log Aggregation: Collect and centralize logs from distributed systems
- Message Queuing: Decouple producers and consumers in microservices
- Stream Processing: Integrate with Kafka Streams or Apache Flink
- Activity Tracking: User behavior events for analytics
Recommended Instance Types
| Workload | Instance | Reason |
|---|---|---|
| Development / Testing | t3.small | Low cost, 1G heap fits |
| Small production | t3.medium | 2 vCPU, good throughput |
| Medium production | t3.large | 2G heap, higher throughput |
| High throughput | m5.xlarge+ | Dedicated compute |
Additional Resources
- Kafka Documentation: https://kafka.apache.org/documentation/
- KRaft Mode Guide: https://kafka.apache.org/documentation/#kraft
- Amazon Corretto: https://aws.amazon.com/corretto/
For support or questions, please contact the Easycloud team.