XiaoClaw - ESP32-S3 AI Voice Assistant

ESP32-S3 AI Voice Assistant

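Local AI agent firmware for the ESP32-S3 that integrates offline wake word detection with cloud TTS and supports LLM inference through an API (Claude / GPT), tool calling, long-term memory storage, and autonomous task execution.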


32MB Flash

8MB PSRAM

70+ Supported Boards


Features

Two powerful layers working in harmony

Voice I/O Layer

xiaozhi-esp32

Offline wake word detection (ESP-SR)
Streaming ASR + TTS via server
OPUS audio codec
OLED/LCD display with emoji
Battery & power management
Multi-language support
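To make the OPUS item more concrete, here is a small sketch of how a PCM buffer might be sized before each encoder call. The frame durations and sample rates are the ones the Opus codec accepts; the function names are hypothetical, and the 16 kHz / 60 ms figures in the usage note are assumed example values, not settings taken from this project.

```c
#include <stddef.h>

/* Frame durations Opus accepts, in tenths of a millisecond (2.5 ms .. 60 ms). */
static const int k_frame_tenths_ms[] = {25, 50, 100, 200, 400, 600};

/* Returns 1 if Opus supports this sample rate natively, else 0. */
int audio_rate_supported(int sample_rate_hz)
{
    return sample_rate_hz == 8000 || sample_rate_hz == 12000 ||
           sample_rate_hz == 16000 || sample_rate_hz == 24000 ||
           sample_rate_hz == 48000;
}

/* Samples per channel in one frame, or -1 for an unsupported rate or duration. */
int audio_frame_samples(int sample_rate_hz, int frame_tenths_ms)
{
    if (!audio_rate_supported(sample_rate_hz)) return -1;
    size_t n = sizeof k_frame_tenths_ms / sizeof k_frame_tenths_ms[0];
    for (size_t i = 0; i < n; i++) {
        if (k_frame_tenths_ms[i] == frame_tenths_ms) {
            /* rate [samples/s] * duration [0.1 ms] / 10000 = samples */
            return sample_rate_hz * frame_tenths_ms / 10000;
        }
    }
    return -1;
}

/* Bytes of 16-bit mono PCM consumed by one encoder call, or -1 if invalid. */
int pcm16_frame_bytes(int sample_rate_hz, int frame_tenths_ms)
{
    int samples = audio_frame_samples(sample_rate_hz, frame_tenths_ms);
    return samples < 0 ? -1 : samples * 2;
}
```

With the assumed 16 kHz mono input and 60 ms frames, audio_frame_samples(16000, 600) gives 960 samples, so each frame needs a 1920-byte PCM buffer.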

Agent Brain Layer

mimiclaw

LLM API (Claude / GPT)
ReAct tool calling
Long-term memory (SPIFFS)
Session consolidation
Cron scheduler
Web search capability
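As a rough illustration of ReAct tool calling, the sketch below runs a reason, act, observe loop in plain C. It is not mimiclaw's code: every type and function name here is hypothetical, the model is a stub standing in for a real LLM API call, and the single tool returns a fixed value so the example stays deterministic.

```c
#include <stdio.h>
#include <string.h>

#define MAX_STEPS 4
#define BUF_LEN   128

/* One model turn: either a tool request or a final answer. */
typedef struct {
    int  is_final;       /* 1 = answer is ready, 0 = call a tool */
    char tool[16];       /* tool name when is_final == 0 */
    char text[BUF_LEN];  /* tool argument, or the final answer */
} agent_step_t;

/* Hypothetical tool signature: read an argument, write an observation. */
typedef void (*tool_fn_t)(const char *arg, char *out, size_t out_len);
typedef struct { const char *name; tool_fn_t fn; } tool_entry_t;

/* Hypothetical model callback; real firmware would call an LLM API here. */
typedef agent_step_t (*model_fn_t)(const char *question, const char *observation);

static void tool_get_time(const char *arg, char *out, size_t out_len)
{
    (void)arg;
    snprintf(out, out_len, "12:00");  /* fixed value, for illustration only */
}

static const tool_entry_t k_tools[] = { { "get_time", tool_get_time } };

/* Stub model: requests the time once, then answers from the observation. */
agent_step_t stub_model(const char *question, const char *observation)
{
    agent_step_t s;
    memset(&s, 0, sizeof s);
    (void)question;
    if (observation[0] == '\0') {
        snprintf(s.tool, sizeof s.tool, "get_time");
    } else {
        s.is_final = 1;
        snprintf(s.text, sizeof s.text, "It is %s.", observation);
    }
    return s;
}

/* Loops until the model gives a final answer or MAX_STEPS is reached.
 * Returns 0 on success, -1 if no answer was produced. */
int react_run(model_fn_t model, const char *question, char *answer, size_t answer_len)
{
    char observation[BUF_LEN] = "";
    for (int step = 0; step < MAX_STEPS; step++) {
        agent_step_t s = model(question, observation);
        if (s.is_final) {
            snprintf(answer, answer_len, "%s", s.text);
            return 0;
        }
        int found = 0;
        for (size_t i = 0; i < sizeof k_tools / sizeof k_tools[0]; i++) {
            if (strcmp(k_tools[i].name, s.tool) == 0) {
                k_tools[i].fn(s.text, observation, sizeof observation);
                found = 1;
                break;
            }
        }
        if (!found) {
            /* Feed the error back so the model can choose another action. */
            snprintf(observation, sizeof observation, "error: unknown tool %s", s.tool);
        }
    }
    return -1;
}
```

Calling react_run(stub_model, "What time is it?", buf, sizeof buf) takes two model turns: the first requests get_time, the second turns the observation into the final answer. The step limit keeps a misbehaving model from looping forever, which matters on a device with a fixed memory budget.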

Architecture

How the components work together

Voice I/O (xiaozhi)

Wake Word

ASR

TTS

Display

WiFi

Bridge Layer

Agent Brain (mimiclaw)

LLM API

Tools

Memory

Sessions

Cron
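The page does not say how the Bridge Layer moves data between the two sides. One plausible shape, sketched here purely as an assumption, is a pair of small text queues: one carrying ASR transcripts to the agent, one carrying replies back for TTS. All names are hypothetical. The code is portable C with no locking, so on the device it would need an RTOS primitive (for example a FreeRTOS queue) in place of this ring buffer.

```c
#include <stddef.h>
#include <stdio.h>

#define BRIDGE_SLOTS   4
#define BRIDGE_MSG_LEN 96

/* Fixed-size FIFO of text messages; use one instance per direction. */
typedef struct {
    char   msg[BRIDGE_SLOTS][BRIDGE_MSG_LEN];
    size_t head;   /* index of the oldest message */
    size_t count;  /* messages currently queued */
} bridge_queue_t;

void bridge_init(bridge_queue_t *q)
{
    q->head = 0;
    q->count = 0;
}

/* Copies text into the queue. Returns 0 on success, -1 if the queue is full. */
int bridge_push(bridge_queue_t *q, const char *text)
{
    if (q->count == BRIDGE_SLOTS) return -1;
    size_t tail = (q->head + q->count) % BRIDGE_SLOTS;
    snprintf(q->msg[tail], BRIDGE_MSG_LEN, "%s", text);  /* truncates long text */
    q->count++;
    return 0;
}

/* Copies the oldest message into out. Returns 0 on success, -1 if empty. */
int bridge_pop(bridge_queue_t *q, char *out, size_t out_len)
{
    if (q->count == 0) return -1;
    snprintf(out, out_len, "%s", q->msg[q->head]);
    q->head = (q->head + 1) % BRIDGE_SLOTS;
    q->count--;
    return 0;
}
```

In this arrangement the voice side would push each transcript onto a to_agent queue, and the agent side would pop it, run its loop, and push the reply onto a to_voice queue. A full queue returns -1 rather than blocking, so the caller decides whether to drop or retry.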

Supported Hardware

Compatible with 70+ ESP32-S3 boards

📦

ESP32-S3-BOX3

🖥️

M5Stack CoreS3

⚡

AtomS3R

🔧

LiChuang DevBoard

⭕

T-Circle-S3

And many more...

Quick Start

Get up and running in 3 steps

Clone & Configure

Clone the repository and set your target ESP32-S3 board

git clone https://github.com/beancookie/xiaoclaw.git
cd xiaoclaw
idf.py set-target esp32s3

Configure Secrets

Set up your WiFi and API keys via menuconfig

idf.py menuconfig

Build & Flash

Build the firmware and flash to your device

idf.py build
idf.py -p PORT flash monitor

Built on these excellent projects

XiaoClaw combines the best of both worlds

xiaozhi-esp32

Voice interaction framework: audio capture, playback, wake word, display, and network communication


mimiclaw

ESP32 AI agent: LLM inference, tool calling, memory management, and autonomous task execution

