ESP32-S3 AI Voice Assistant

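AI agent firmware that runs locally on the ESP32-S3. It combines offline wake word detection with a cloud TTS service and supports LLM inference via API, tool calling, long-term memory storage, and autonomous task execution.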

32MB Flash
8MB PSRAM
70+ Supported Boards

Features

Two powerful layers working in harmony

Voice I/O Layer

xiaozhi-esp32

  • Offline wake word detection (ESP-SR)
  • Streaming ASR + TTS via server
  • OPUS audio codec
  • OLED/LCD display with emoji
  • Battery & power management
  • Multi-language support
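
One way to picture how the wake word, streaming ASR, and TTS features fit together is as a small state machine. The sketch below is a host-side illustration in Python, not code from xiaozhi-esp32; the state names, event names, and the barge-in transition are all assumptions made for the example.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()       # waiting for the offline wake word
    LISTENING = auto()  # streaming microphone audio to the ASR server
    SPEAKING = auto()   # playing back TTS audio


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.IDLE, "wake_word"): State.LISTENING,
    (State.LISTENING, "reply_ready"): State.SPEAKING,
    (State.SPEAKING, "playback_done"): State.IDLE,
    # Assumed barge-in: the wake word interrupts playback.
    (State.SPEAKING, "wake_word"): State.LISTENING,
}


def step(state: State, event: str) -> State:
    """Return the next state; events with no matching transition are ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state: State = State.IDLE) -> State:
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state
```

Keeping the transitions in a table makes it easy to see which events each state reacts to and to add new ones without touching the control flow.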

Agent Brain Layer

mimiclaw

  • LLM API (Claude / GPT)
  • ReAct tool calling
  • Long-term memory (SPIFFS)
  • Session consolidation
  • Cron scheduler
  • Web search capability
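
ReAct tool calling alternates between model output and tool results until the model produces a final answer. The following is a minimal host-side sketch in Python of that loop, not mimiclaw's implementation. The `Action:` / `Final:` text format, the tool names, and `fake_llm` (a stand-in for a real Claude / GPT API call) are assumptions for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry; names and behaviour are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "get_time": lambda _arg: "12:00",
    "echo": lambda arg: arg,
}


def fake_llm(history: List[str]) -> str:
    """Stand-in for an LLM API call: request the time once, then answer."""
    observations = [h for h in history if h.startswith("Observation:")]
    if not observations:
        return "Action: get_time[]"
    return "Final: It is " + observations[-1].split(": ", 1)[1]


def react(question: str,
          llm: Callable[[List[str]], str] = fake_llm,
          max_steps: int = 5) -> str:
    """Run a reason/act loop until the model answers or the step limit hits."""
    history = ["Question: " + question]
    for _ in range(max_steps):
        reply = llm(history)
        history.append(reply)
        if reply.startswith("Final:"):
            return reply[len("Final:"):].strip()
        if reply.startswith("Action:"):
            # Parse "name[argument]" and run the matching tool.
            call = reply[len("Action:"):].strip()
            name, _, rest = call.partition("[")
            arg = rest.rstrip("]")
            tool = TOOLS.get(name)
            result = tool(arg) if tool else "unknown tool " + name
            # Feed the tool result back to the model as an observation.
            history.append("Observation: " + result)
    return "step limit reached"
```

The step limit matters on a constrained device: it bounds both the number of API calls and the size of the history kept in memory.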

Architecture

How the components work together

Voice I/O (xiaozhi)

  • Wake Word
  • ASR
  • TTS
  • Display
  • WiFi

Bridge Layer

Agent Brain (mimiclaw)

  • LLM API
  • Tools
  • Memory
  • Sessions
  • Cron
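
The bridge layer sits between the voice components and the agent. A rough way to think about it is as a pair of queues: transcripts flow toward the agent, replies flow back for TTS. The Python sketch below is a host-side illustration under that assumption; the `Bridge` class and its method names are hypothetical and do not come from the project's source.

```python
import queue
from typing import Callable, Optional


class Bridge:
    """Hypothetical bridge: passes ASR text to the agent, replies back to TTS."""

    def __init__(self, agent: Callable[[str], str]) -> None:
        self.agent = agent
        self.to_agent: "queue.Queue[str]" = queue.Queue()
        self.to_voice: "queue.Queue[str]" = queue.Queue()

    def on_transcript(self, text: str) -> None:
        """Called by the voice layer once ASR has produced text."""
        self.to_agent.put(text)

    def run_once(self) -> bool:
        """Process one queued transcript; return False if nothing was queued."""
        try:
            text = self.to_agent.get_nowait()
        except queue.Empty:
            return False
        self.to_voice.put(self.agent(text))
        return True

    def next_reply(self) -> Optional[str]:
        """Called by the voice layer to fetch text for TTS, if any is ready."""
        try:
            return self.to_voice.get_nowait()
        except queue.Empty:
            return None
```

For example, with a trivial agent that echoes its input:

```python
bridge = Bridge(lambda text: "You said: " + text)
bridge.on_transcript("hello")
bridge.run_once()
print(bridge.next_reply())
```

Decoupling the two layers through queues means the voice layer never blocks on a slow LLM call; it only checks whether a reply is ready.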

Supported Hardware

Compatible with 70+ ESP32-S3 boards

  • ESP32-S3-BOX3
  • M5Stack CoreS3
  • AtomS3R
  • LiChuang DevBoard
  • T-Circle-S3
  • And more...

Quick Start

Get up and running in 3 steps

1. Clone & Configure

Clone the repository and set the build target to ESP32-S3.

git clone https://github.com/beancookie/xiaoclaw.git
cd xiaoclaw
idf.py set-target esp32s3

2. Configure Secrets

Set your WiFi credentials and API keys via menuconfig.

idf.py menuconfig

3. Build & Flash

Build the firmware and flash it to your device. Replace PORT with your board's serial port.

idf.py build
idf.py -p PORT flash monitor

Built on these excellent projects

XiaoClaw combines the best of both worlds

xiaozhi-esp32

Voice interaction framework: audio capture, playback, wake word, display, and network communication

mimiclaw

ESP32 AI agent: LLM inference, tool calling, memory management, and autonomous task execution