Apache Kafka 3.7.2 (Amazon Linux 2023) AMI Administrator Guide

1. Quick Start Information

Connection Methods:

  • Access the instance via SSH as ec2-user. Use sudo to run commands requiring root privileges; to switch to the root user, run sudo su - root.

Install Information:

  • OS: Amazon Linux 2023
  • Kafka version: 3.7.2
  • Scala version: 2.13
  • Java: Amazon Corretto 17 (AWS-optimized OpenJDK 17)
  • Java Home: /usr/lib/jvm/java-17-amazon-corretto
  • Mode: KRaft (no ZooKeeper required)
  • Install Directory: /opt/kafka (symlink → /opt/kafka_2.13-3.7.2)
  • Service User: kafka (system user, no login shell)
  • Default Port: 9092

Kafka Service Management:

  • Start Kafka service: sudo systemctl start kafka
  • Stop Kafka service: sudo systemctl stop kafka
  • Restart Kafka service: sudo systemctl restart kafka
  • Check Kafka status: sudo systemctl status kafka
  • Enable auto-start: sudo systemctl enable kafka

Quick Verification Commands:

  • Check Kafka version: /opt/kafka/bin/kafka-topics.sh --version
  • Check Java version: java -version
  • View Kafka logs: sudo journalctl -u kafka -f

Firewall Configuration:

  • Allow inbound SSH on port 22.
  • Allow Kafka port 9092 if external clients or applications need to connect.
  • For security, restrict access to trusted IP addresses only.

2. First Launch & Verification

Step 1: Connect to Your Instance

  1. Launch your instance in your cloud provider's console (e.g., AWS EC2)
  2. Ensure SSH port 22 is allowed in your security group
  3. Connect via SSH:
    ssh -i your-key.pem ec2-user@YOUR_PUBLIC_IP

Step 2: Verify Java Installation

Check Amazon Corretto 17:

java -version

Expected Output:

openjdk version "17.0.x" 2024-xx-xx LTS
OpenJDK Runtime Environment Corretto-17.x.x.x (build 17.0.x+x-LTS)
OpenJDK 64-Bit Server VM Corretto-17.x.x.x (build 17.0.x+x-LTS, mixed mode, sharing)

Confirm Corretto-17 is shown in the output.

Step 3: Verify Kafka Service Status

Check if Kafka daemon is running:

sudo systemctl status kafka --no-pager

Expected Output:

● kafka.service - Apache Kafka 3.7.2 Server (KRaft Mode)
Loaded: loaded (/etc/systemd/system/kafka.service; enabled; preset: disabled)
Active: active (running) since ...
Main PID: xxxx (java)

Step 4: Verify Kafka Functionality

List available topics:

/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Expected Output:

aws-marketplace-test

Create a new test topic:

/opt/kafka/bin/kafka-topics.sh --create \
--topic my-test-topic \
--bootstrap-server localhost:9092 \
--partitions 1 \
--replication-factor 1

Expected Output:

Created topic my-test-topic.

Step 5: Test Produce and Consume Messages

Open a producer in one terminal:

/opt/kafka/bin/kafka-console-producer.sh \
--topic my-test-topic \
--bootstrap-server localhost:9092

Type a message and press Enter, then Ctrl+C to exit.

Open a consumer in another terminal:

/opt/kafka/bin/kafka-console-consumer.sh \
--topic my-test-topic \
--bootstrap-server localhost:9092 \
--from-beginning

Expected Output: The message you typed should appear.


3. Architecture & Detailed Configuration

This AMI runs Apache Kafka 3.7.2 in KRaft mode (Kafka Raft Metadata mode), which eliminates the dependency on Apache ZooKeeper. KRaft mode is the modern, recommended way to run Kafka and has been production-ready since Kafka 3.3.

Installation Architecture:

[Amazon Corretto 17]

[Kafka 3.7.2 (Scala 2.13)]
/opt/kafka_2.13-3.7.2/ ← actual directory
/opt/kafka/ ← symlink (used in all configs)

[KRaft Mode - No ZooKeeper]
/opt/kafka/config/kraft/server.properties → KRaft configuration

[Cluster ID + Formatted Storage]
kafka-storage.sh format → initializes data directory

[Systemd Service]
/etc/systemd/system/kafka.service → Auto-start on boot

[Service User: kafka]
No login shell → runs with minimal privileges

Key Design Decisions:

  1. KRaft Mode: Eliminates ZooKeeper dependency — simpler architecture, fewer components to manage
  2. Amazon Corretto 17: AWS-optimized JDK with long-term support, explicit path set in service file to prevent version drift
  3. Symlink Strategy: /opt/kafka → /opt/kafka_2.13-3.7.2 allows version upgrades without changing configs or service files
  4. Dedicated kafka User: Service runs as a restricted system user (no login shell) for security
  5. Heap Tuning: 1G heap (-Xmx1G -Xms1G) optimized for t3.small/t3.medium instances

Why KRaft Over ZooKeeper?

Feature            ZooKeeper Mode       KRaft Mode
Components         Kafka + ZooKeeper    Kafka only
Complexity         High                 Low
Metadata storage   ZooKeeper            Kafka itself
Production ready   Legacy               Recommended (3.3+)
Future support     Deprecated           Active development

3.1. Systemd Service File

File Location: /etc/systemd/system/kafka.service

Complete Contents:

[Unit]
Description=Apache Kafka 3.7.2 Server (KRaft Mode)
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target
After=network.target

[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/java-17-amazon-corretto"
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/kraft/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target

How This Works:

  • Environment="JAVA_HOME=...": Explicitly pins Java 17 Corretto path — prevents issues if multiple JDKs are installed or system defaults change
  • Environment="KAFKA_HEAP_OPTS=...": Sets JVM heap to 1G min/max — eliminates heap resizing overhead and suitable for t3.small/medium
  • User=kafka / Group=kafka: Runs as dedicated system user for security isolation
  • Restart=on-failure: Automatically restarts Kafka if it crashes unexpectedly
  • Type=simple: Systemd treats the first process as the main process (suitable for Kafka's foreground mode)
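As an alternative to editing kafka.service directly, any Environment= line can be overridden with a systemd drop-in, which survives later changes to the main unit file. A minimal sketch (the 2G heap value is only an example for larger instances):

```ini
# /etc/systemd/system/kafka.service.d/override.conf
# (create interactively with: sudo systemctl edit kafka)
[Service]
Environment="KAFKA_HEAP_OPTS=-Xmx2G -Xms2G"
```

Apply the override with sudo systemctl daemon-reload followed by sudo systemctl restart kafka.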

3.2. KRaft Configuration File

File Location: /opt/kafka/config/kraft/server.properties

Key Settings:

# The role of this server. KRaft mode uses 'broker,controller' for combined mode
process.roles=broker,controller

# The node id for this server
node.id=1

# The connect string for the KRaft controller quorum
controller.quorum.voters=1@localhost:9093

# Listeners
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://localhost:9092

# Log directories
log.dirs=/tmp/kraft-combined-logs

How This Works:

  • process.roles=broker,controller: Combined mode — single node acts as both broker and controller
  • controller.quorum.voters: Defines the KRaft quorum (cluster membership)
  • listeners: Port 9092 for client connections, port 9093 for internal KRaft communication
  • log.dirs: Where Kafka stores message data (formatted during setup)
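Note that the defaults above only advertise localhost, so a remote client that connects to port 9092 is handed back an address it cannot reach. A sketch of the edit needed for external access, with 203.0.113.10 as a placeholder for the instance's reachable IP or DNS name:

```properties
# /opt/kafka/config/kraft/server.properties (illustrative edit)
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://203.0.113.10:9092
```

Restart the service afterwards (sudo systemctl restart kafka) and make sure port 9092 is open in the security group, as noted in section 1.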

4. How-To-Create: Reproduce This Environment

This section explains how this AMI was built, allowing you to reproduce the installation on any Amazon Linux 2023 system.

Step 1: Update the System

Purpose: Ensure a clean, up-to-date base before installing software.

sudo dnf update -y

Step 2: Install Amazon Corretto 17

Purpose: Install AWS's production-grade, long-term-supported OpenJDK distribution.

sudo dnf install java-17-amazon-corretto-devel -y

How This Works:

  • java-17-amazon-corretto-devel: Installs the full JDK (compiler + runtime), not just the JRE
  • Amazon Corretto 17 receives security patches from AWS through 2029+
  • Installs to /usr/lib/jvm/java-17-amazon-corretto/

Verify:

java -version

Expected Output (example):

openjdk version "17.0.13" 2024-10-15 LTS
OpenJDK Runtime Environment Corretto-17.0.13.11.1 (build 17.0.13+11-LTS)
OpenJDK 64-Bit Server VM Corretto-17.0.13.11.1 (build 17.0.13+11-LTS, mixed mode, sharing)

Step 3: Download and Extract Kafka

Purpose: Obtain the official Kafka 3.7.2 binary distribution.

wget https://downloads.apache.org/kafka/3.7.2/kafka_2.13-3.7.2.tgz
sudo tar -xzf kafka_2.13-3.7.2.tgz -C /opt

How This Works:

  • kafka_2.13-3.7.2.tgz: Kafka built against Scala 2.13 (the Scala version recommended for Kafka 3.7)
  • Extracted to /opt/kafka_2.13-3.7.2/ containing all binaries, configs, and scripts
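Apache publishes a .sha512 file alongside each release, and checking it before extraction guards against a corrupted download. The mechanics can be sketched with sha512sum on a scratch file standing in for the tarball (for the real release, compare the printed hash against the published .sha512 on the Apache mirror):

```shell
# Generate and verify a SHA-512 checksum; a scratch file stands in
# for the real kafka_2.13-3.7.2.tgz here.
f=$(mktemp)
echo "sample payload" > "$f"
sha512sum "$f" > "$f.sha512"   # record the hash
sha512sum -c "$f.sha512"       # re-check; prints "<file>: OK" on a match
```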

Step 4: Create the /opt/kafka Symlink

Purpose: Create a stable /opt/kafka path that remains constant across version upgrades.

sudo ln -s /opt/kafka_2.13-3.7.2 /opt/kafka

How This Works:

When upgrading Kafka in the future:

  1. Extract new version: sudo tar -xzf kafka_2.13-3.8.0.tgz -C /opt
  2. Update symlink: sudo ln -sfn /opt/kafka_2.13-3.8.0 /opt/kafka
  3. Restart service: sudo systemctl restart kafka

No changes needed to the service file or any scripts.
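The upgrade flow can be rehearsed safely with throwaway directories before touching the live install (the paths below are scratch stand-ins, not the real /opt layout):

```shell
# Rehearse the symlink flip in a scratch directory.
base=$(mktemp -d)
mkdir "$base/kafka_2.13-3.7.2" "$base/kafka_2.13-3.8.0"

# Initial install: the stable path points at the current version
ln -s "$base/kafka_2.13-3.7.2" "$base/kafka"

# Upgrade: -f replaces the existing link, -n stops ln from
# descending into the old target directory
ln -sfn "$base/kafka_2.13-3.8.0" "$base/kafka"
readlink "$base/kafka"   # now ends in kafka_2.13-3.8.0
```

The -n flag matters: without it, ln would follow the existing symlink and create the new link inside the old version directory instead of replacing the symlink itself.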

Step 5: Create the kafka System User

Purpose: Run Kafka as a dedicated, unprivileged system user for security.

sudo useradd -r -s /bin/false kafka

How This Works:

  • -r: Creates a system account (lower UID range, no home directory by default)
  • -s /bin/false: No login shell — the user cannot log in interactively
  • This limits the blast radius if Kafka is compromised
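After creating the account you can confirm the no-login shell with getent passwd kafka. The check can be sketched against a sample passwd line (the UID/GID values are illustrative, not the AMI's actual ones):

```shell
# The 7th colon-separated field of a passwd entry is the login shell;
# this sample line mimics what `getent passwd kafka` would print.
line="kafka:x:992:992::/home/kafka:/bin/false"
echo "$line" | awk -F: '{print $7}'   # prints /bin/false
```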

Step 6: Set File Ownership

Purpose: Give the kafka user ownership of all Kafka files.

sudo chown -R kafka:kafka /opt/kafka_2.13-3.7.2
sudo chown -h kafka:kafka /opt/kafka

How This Works:

  • -R: Recursively sets ownership on all files in the Kafka directory
  • -h: Sets ownership on the symlink itself (not the target)

Step 7: Initialize KRaft Storage

Purpose: Generate a unique Cluster ID and format the storage directory for KRaft mode.

sudo -u kafka bash << 'EOF'
KAFKA_CLUSTER_ID=$(/opt/kafka/bin/kafka-storage.sh random-uuid)
echo "Cluster ID: $KAFKA_CLUSTER_ID"
/opt/kafka/bin/kafka-storage.sh format \
-t $KAFKA_CLUSTER_ID \
-c /opt/kafka/config/kraft/server.properties
EOF

How This Works:

  • sudo -u kafka bash: Executes the block as the kafka user — ensures storage is owned correctly
  • kafka-storage.sh random-uuid: Generates a globally unique cluster identifier
  • kafka-storage.sh format: Writes the cluster metadata to the log directory defined in server.properties

Expected Output:

Cluster ID: <uuid>
Formatting /tmp/kraft-combined-logs with metadata.version 3.7-IV4.

Step 8: Create Systemd Service

Purpose: Register Kafka as a system service that starts automatically on boot.

sudo tee /etc/systemd/system/kafka.service > /dev/null << 'EOF'
[Unit]
Description=Apache Kafka 3.7.2 Server (KRaft Mode)
Documentation=http://kafka.apache.org/documentation.html
Requires=network.target
After=network.target

[Service]
Type=simple
User=kafka
Group=kafka
Environment="JAVA_HOME=/usr/lib/jvm/java-17-amazon-corretto"
Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/kraft/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now kafka

How This Works:

  • daemon-reload: Tells systemd to re-read all service unit files
  • enable --now: Enables the service for auto-start AND starts it immediately in one command

Step 9: Verify Installation

# Check service status
sudo systemctl status kafka --no-pager

# Create test topic
/opt/kafka/bin/kafka-topics.sh --create \
--topic aws-marketplace-test \
--bootstrap-server localhost:9092 \
--partitions 1 \
--replication-factor 1

# List topics
/opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Expected Output:

Active: active (running) ...
Created topic aws-marketplace-test.
aws-marketplace-test

5. Using the Kafka Environment

5.1. Topic Management

# Create a topic
/opt/kafka/bin/kafka-topics.sh --create \
--topic my-topic \
--bootstrap-server localhost:9092 \
--partitions 3 \
--replication-factor 1

# Describe a topic
/opt/kafka/bin/kafka-topics.sh --describe \
--topic my-topic \
--bootstrap-server localhost:9092

# Delete a topic
/opt/kafka/bin/kafka-topics.sh --delete \
--topic my-topic \
--bootstrap-server localhost:9092

# List all topics
/opt/kafka/bin/kafka-topics.sh --list \
--bootstrap-server localhost:9092

5.2. Producing and Consuming Messages

Produce messages:

/opt/kafka/bin/kafka-console-producer.sh \
--topic my-topic \
--bootstrap-server localhost:9092

Type messages and press Enter for each. Use Ctrl+C to exit.

Consume messages:

# Consume from the beginning
/opt/kafka/bin/kafka-console-consumer.sh \
--topic my-topic \
--bootstrap-server localhost:9092 \
--from-beginning

# Consume only new messages
/opt/kafka/bin/kafka-console-consumer.sh \
--topic my-topic \
--bootstrap-server localhost:9092

5.3. Consumer Groups

# List consumer groups
/opt/kafka/bin/kafka-consumer-groups.sh \
--list \
--bootstrap-server localhost:9092

# Describe a consumer group (check lag)
/opt/kafka/bin/kafka-consumer-groups.sh \
--describe \
--group my-group \
--bootstrap-server localhost:9092

5.4. Monitoring

# View broker metadata
/opt/kafka/bin/kafka-metadata-quorum.sh \
--bootstrap-server localhost:9092 describe --status

# Check cluster information
/opt/kafka/bin/kafka-broker-api-versions.sh \
--bootstrap-server localhost:9092

6. Important File Locations

File Path                                    Purpose
/opt/kafka                                   Kafka installation symlink
/opt/kafka_2.13-3.7.2/                       Actual Kafka installation directory
/opt/kafka/bin/                              Kafka shell scripts (kafka-topics.sh, etc.)
/opt/kafka/config/kraft/server.properties    KRaft mode configuration
/opt/kafka/config/kraft/                     KRaft configuration directory
/tmp/kraft-combined-logs/                    Kafka data directory (messages + metadata)
/etc/systemd/system/kafka.service            Systemd service file
/usr/lib/jvm/java-17-amazon-corretto/        Amazon Corretto 17 Java home

7. Troubleshooting

Issue 1: Kafka Service Fails to Start

Symptoms:

$ sudo systemctl status kafka
Active: failed (Result: exit-code)

Diagnosis:

View detailed logs:

sudo journalctl -u kafka -n 50 --no-pager

Common Causes:

  1. Storage not formatted (KRaft mode requires initialization):

sudo -u kafka bash << 'EOF'
KAFKA_CLUSTER_ID=$(/opt/kafka/bin/kafka-storage.sh random-uuid)
/opt/kafka/bin/kafka-storage.sh format \
-t $KAFKA_CLUSTER_ID \
-c /opt/kafka/config/kraft/server.properties
EOF
sudo systemctl start kafka

  2. Wrong JAVA_HOME path:

ls /usr/lib/jvm/

Update the Environment="JAVA_HOME=..." line in /etc/systemd/system/kafka.service to match the actual path, then reload:

sudo systemctl daemon-reload
sudo systemctl start kafka

  3. Port 9092 already in use:

sudo lsof -i :9092

Issue 2: Cannot Connect to Kafka

Symptoms:

$ /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
[ERROR] ... Connection refused

Diagnosis:

Check if Kafka is running:

sudo systemctl status kafka

Check if port 9092 is listening:

sudo ss -tlnp | grep 9092

Solution:

Start the service if not running:

sudo systemctl start kafka

Issue 3: Out of Memory Error

Symptoms:

java.lang.OutOfMemoryError: Java heap space

Diagnosis:

Check current heap setting:

grep KAFKA_HEAP_OPTS /etc/systemd/system/kafka.service

Solution:

Increase heap size in the service file (for larger instances):

sudo nano /etc/systemd/system/kafka.service

Change:

Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"

To (for t3.large or larger):

Environment="KAFKA_HEAP_OPTS=-Xmx2G -Xms2G"

Reload and restart:

sudo systemctl daemon-reload
sudo systemctl restart kafka
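The same edit can be scripted with sed. The sketch below runs against a scratch copy of the relevant line so you can preview the substitution before applying it to /etc/systemd/system/kafka.service:

```shell
# Preview the heap change on a scratch copy of the unit file line.
unit=$(mktemp)
printf 'Environment="KAFKA_HEAP_OPTS=-Xmx1G -Xms1G"\n' > "$unit"

# Swap 1G for 2G in both -Xmx and -Xms
sed -i 's/-Xmx1G -Xms1G/-Xmx2G -Xms2G/' "$unit"
grep KAFKA_HEAP_OPTS "$unit"   # Environment="KAFKA_HEAP_OPTS=-Xmx2G -Xms2G"
```

When run against the real unit file, remember to follow with sudo systemctl daemon-reload and a service restart.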

Issue 4: Topics Not Persisting After Restart

Symptoms:

Topics created before a restart disappear.

Diagnosis:

Check the data directory:

ls /tmp/kraft-combined-logs/

Cause:

The default log.dirs=/tmp/kraft-combined-logs in server.properties uses the /tmp directory, which may be cleared on reboot.

Solution:

Change the log directory to a persistent path:

sudo mkdir -p /var/lib/kafka/data
sudo chown -R kafka:kafka /var/lib/kafka
sudo nano /opt/kafka/config/kraft/server.properties

Update:

log.dirs=/var/lib/kafka/data

Re-format storage with the new path:

sudo systemctl stop kafka
sudo -u kafka bash << 'EOF'
KAFKA_CLUSTER_ID=$(/opt/kafka/bin/kafka-storage.sh random-uuid)
/opt/kafka/bin/kafka-storage.sh format \
-t $KAFKA_CLUSTER_ID \
-c /opt/kafka/config/kraft/server.properties
EOF
sudo systemctl start kafka

8. Final Notes

Key Takeaways

  1. Kafka 3.7.2 running in KRaft mode — no ZooKeeper required
  2. Amazon Corretto 17 as the Java runtime — AWS-optimized with long-term support
  3. Symlink strategy (/opt/kafka) enables easy version upgrades
  4. Dedicated kafka user for security isolation
  5. The installation is production-ready and AMI-optimized with auto-start enabled

Kafka Use Cases

  • Event Streaming: Real-time data pipelines between services
  • Log Aggregation: Collect and centralize logs from distributed systems
  • Message Queuing: Decouple producers and consumers in microservices
  • Stream Processing: Integrate with Kafka Streams or Apache Flink
  • Activity Tracking: User behavior events for analytics

Recommended Instance Types

Workload                Instance      Reason
Development / Testing   t3.small      Low cost, 1G heap fits
Small production        t3.medium     2 vCPU, good throughput
Medium production       t3.large      2G heap, higher throughput
High throughput         m5.xlarge+    Dedicated compute

For support or questions, please contact the Easycloud team.