KS5026 Xiao Zhi AI Chatbot Breadboard DIY Kit with 128x64 OLED Screen

Introduction

This is a DIY kit that allows you to quickly build a prototype of the “Xiao Zhi AI Chatbot” on a breadboard using simple hardware and speech recognition technology. It includes key components such as the ESP32-S3-DevKitC-1 development board, MEMS digital microphone (INMP441), digital amplifier (MAX98357A), 128x64 OLED screen, and cavity speaker, which support voice input and playback. There are reserved interfaces for further expansion, enabling basic human-machine interaction functionality.

Features

  • Easy Setup, Quick to Get Started: All components can be plugged into the breadboard without complex soldering skills.

  • Voice Input: Built-in MEMS digital microphone (INMP441) effectively reduces environmental noise interference.

  • Audio Output: The combination of the digital amplifier (MAX98357A) and cavity speaker provides clear speech playback.

  • Expandable: Extra GPIO and I²C interfaces are reserved for easily adding more sensors or functional modules to the robot.

  • Learning-Friendly: Hands-on building and debugging allow users to gain a deeper understanding of the basic principles of AI voice recognition and playback.

Required Hardware

Hardware Name

Specifications/Model

Main Uses

Related Images (Example)

Development Board

ESP32-S3-DevKitC-1 (WROOM N16R8 module)

Main control board, responsible for running firmware, processing voice and network connections

Img

Digital Microphone

INMP441

Audio input collection

Img

Amplifier

MAX98357A

Audio output driver (converts digital signals to analog audio)

Img

Cavity Speaker

8Ω 2~3W or 4Ω 2~3W

Speaker for sound output

Img

Jumper Wires

A box of jumper wires, several Dupont wires

Connection between modules and development board

Img

Breadboards (2 pieces)

400 holes, modular

Convenient for direct connection of various electronic components

Img

LCD Display

128x64 I2C (SSD1306 driver)

Displays WiFi status, dialogue information, and other prompts

Img

Tactile Switch/Button (several)

6×6mm vertical tactile switch

Volume adjustment and other interactive operations

Img

Type-C Data Cable

For firmware flashing

Connects the development board to a PC for firmware uploading and debugging

Img

Hardware

1. ESP32-S3-DevKitC-1 Development Board

Img The ESP32-S3-DevKitC-1 is a high-performance development board based on the Espressif ESP32-S3 series chip, integrating 2.4 GHz Wi-Fi and Bluetooth 5.0/BLE wireless connectivity, and featuring a dual-core Xtensa® LX7 processor with a maximum clock speed of 240 MHz. It supports hardware acceleration for machine learning, allowing efficient inference in AI scenarios such as voice recognition and image processing. Here are its main features and key parameters.

Features

  • High-Performance Processing

    • Xtensa® LX7 dual-core architecture, with a maximum clock speed of 240 MHz

    • Supports machine learning hardware acceleration to improve AI application inference speed

  • Wireless Connectivity

    • Integrated 2.4 GHz 802.11 b/g/n Wi-Fi

    • Supports Bluetooth 5.0 and Bluetooth Low Energy (BLE), facilitating short-range communication in various scenarios

  • Multiple Interfaces

    • Up to 38 GPIO pins

    • Built-in various communication protocols: ADC (12-bit), DAC, PWM, I2C, SPI, UART, etc.

    • Onboard LED indicator, along with Boot, Reset, and other functional buttons

  • Security Features

    • Hardware encryption engine (AES, SHA, RSA, etc.)

    • Supports Secure Boot and Flash Encryption, providing multiple levels of protection from hardware to firmware

  • Low Power Design

    • Standby power as low as approximately 10 μA, optimized power consumption during Wi-Fi operation

    • Suitable for battery-powered projects or projects requiring long battery life

  • Development Ecosystem

    • Supports Arduino IDE for quick onboarding

    • Compatible with ESP-IDF, PlatformIO, and other advanced development frameworks, facilitating custom projects

2. MEMS Digital Microphone (INMP441)

Img

The INMP441 is a MEMS-based digital microphone with built-in amplification, analog-to-digital conversion, and I²S output. Compared to traditional analog microphones, it effectively reduces noise interference, making it suitable for applications in voice recognition and interaction.

Features

  1. I²S Digital Output: Outputs digital audio directly, avoiding interference from analog cables.

  2. Compact Size, Easy Integration: Suitable for projects with space constraints.

  3. Low Power Consumption: Ideal for battery-powered scenarios.

  4. High Sensitivity: Capable of capturing faint sounds, making it well-suited for voice recognition.

  5. Built-in Voltage Regulation and Clock: Reduces external circuit requirements, simplifying soldering.

  6. Soldering Difficulty: It is recommended to use a pre-soldered module to reduce difficulty.

Parameters

Parameter

Value / Range

Description

Operating Voltage

3.3V (typical)

Recommended range 1.8V ~ 3.3V

Output Interface

I²S

Left aligned, single-channel output

Signal-to-Noise Ratio

~61 dB

Higher SNR ensures better sound quality

Sensitivity

-26 dBFS (typical)

Measured under 94 dB SPL, 1kHz input conditions

Frequency Response Range

60 Hz ~ 15 kHz (typical)

Meets most human voice capture needs

Current Consumption

1.1 mA ~ 1.7 mA

Typical operating current

Package Size

3.76 mm × 2.95 mm

Requires fine soldering technique

3. Digital Amplifier (MAX98357A)

Img

The MAX98357A is a highly integrated Class D audio amplifier chip that can directly amplify digital audio via I²S input. It eliminates the need for traditional DACs required in amplifiers, resulting in higher efficiency and smaller size, widely used in portable speakers, smart speakers, and other products.

Features

  1. I²S Digital Input: No additional DAC required, simplifying design.

  2. High Efficiency Class D: Over 90%, suitable for battery-powered scenarios.

  3. Built-in Filtering/PLL: Adapts to various sampling rates for stable and reliable output.

  4. Simplified Peripheral Circuits: Requires only minimal capacitors and resistors to operate.

  5. Protection Mechanisms: Includes overcurrent and overheating protection, making it safer to use.

  6. Drives Various Speakers: Can power 4Ω/8Ω speakers, suitable for low-power audio applications.

Parameters

Parameter

Value / Range

Description

Operating Voltage

2.5V ~ 5.5V

Commonly 3.3V or 5V

Output Power

3W@4Ω / 2W@8Ω

Depends on voltage and heatsinking conditions

Efficiency

Over 90%

Effectively reduces energy loss

Sampling Rate

8kHz ~ 96kHz

Built-in PLL supports various formats

Total Harmonic Distortion + Noise (THD+N)

< 0.03% @1W, 5V

Ensures good sound quality

Protection Features

Overheat / Overcurrent / Short-circuit

Increases safety at use

Note: It is recommended to leave sufficient heat dissipation space and correctly match the speaker impedance, as well as set the gain properly to avoid distortion or chip damage.

4. Cavity Speaker (8Ω 2W)

Img

This type of speaker works in a closed or semi-closed cavity, optimizing low frequencies and concentrating sound energy. It is commonly found in portable speakers and smart voice devices.

Features

  1. Impedance and Power Matching: 8Ω, 2W power, suitable for small amplifiers.

  2. Enhanced Low Frequency: Cavity design helps enhance low-frequency extension.

  3. Compact and Easy to Install: Often equipped with clips or screw holes for integration.

  4. Wide Range of Usage: Suitable for various environmental volume needs.

Parameters

Parameter

Value / Range

Description

Impedance

General specification, matches small amplifiers

Rated Power

2W

Suitable for everyday volume scenarios

Frequency Response

~200Hz ~ 20kHz

Optimized low-frequency performance

Sensitivity

80~90 dB (@1W/1m)

Higher sensitivity for better efficiency

Installation Method

Screws/Clip/Adhesive

Varies by specific model

Note: It is recommended to use a suitable digital amplifier (e.g., MAX98357A) and calibrate the volume to avoid overload that can cause distortion or damage.

5. Jumper Wires

Img

Boxed jumper wires include various types of DuPont wires (male-to-male, male-to-female, female-to-female) organized in small compartments based on length and color. They are suitable for rapid prototyping and connection in breadboard or circuit projects.

Features

  1. Various wire lengths and interface types to accommodate different wiring needs.

  2. Colorful designs make circuit paths easy to distinguish.

  3. Boxed design for convenient storage and portability.

7. 128x64 OLED Display (IIC Interface)

Img

This type of OLED screen often utilizes the SSD1306 driver and communicates via the I²C interface. It offers high contrast, low power consumption, and a small footprint, making it widely used in various microcontroller projects and embedded products. It is recommended to select a newer version of the screen that uses the GND pin as a reference for better stability.

Features

  1. High Contrast: OLED’s self-emissive pixels can display clear text and graphics.

  2. Low Power Consumption: Compared to LCDs of the same size, it consumes less power, making it suitable for battery-powered projects.

  3. SSD1306 Driver: Highly versatile with many open-source libraries available, easy to develop and port.

  4. I²C Communication: Occupies fewer pins with straightforward wiring, ideal for integration into breadboards or small devices.

  5. Compact Size: Suitable for portable or space-constrained project designs.

Parameters

Parameter

Value / Range

Description

Driver Chip

SSD1306

Compatible with various microcontrollers

Communication Interface

I²C

Two lines: SDA (data) + SCL (clock)

Resolution

128×64

Choose based on project needs

Operating Voltage

3.3V ~ 5V depending on module

Usually recommended to supply 3.3V

Power Consumption

μA level standby current, mA during operation

Depends on brightness and refresh rate

Screen Size

0.96 inches

Select size based on requirements

Operating Temperature

-30℃ ~ 70℃

Suitable for most common environments

8. Tactile Switch/Button

Img

The tactile switch (6×6 mm) is generally used in testing, control, and human-machine interaction scenarios. It is compact and triggers easily with a light touch.

Features

  1. Compact Size: Easy to embed in various devices or breadboards.

  2. Tactile Design: Good tactile response with a clear trigger.

  3. Easy to Install: Four-pin design allows easy insertion without the hassle of soldering.

Parameters

  • Dimensions: 6×6 mm (typical)

  • Number of Pins: 4 pins, interconnected in the same direction

  • Rated Current: Approximately 50 mA (varies by model)

  • Operating Temperature: -25°C ~ 85°C (varies slightly by brand)

Tip: To prevent shorting the pins, beginners may want to opt for through-hole buttons for simpler connections, reducing the risk of misoperation.

Wiring Instructions

Pin Connections for ESP32S3 Development Board and Modules

1. Wiring between ESP32S3 Development Board and Microphone

ESP32S3 Development Board

Microphone INMP441 (I2S Interface)

GPIO 4

WS Data Select

GPIO 5

SCK Data Clock

GPIO 6

SD Data Output

3V3

VDD Power Positive 3.3V

GND

GND Ground Short Connect L/R Left Right Channel


2. Wiring between ESP32S3 Development Board and Digital Amplifier

ESP32S3 Development Board

Digital Amplifier MAX98357A

GPIO 7

DIN Digital Signal

GPIO 15

BCLK Bit Clock

GPIO 16

LRC Left Right Clock

3V3 / 3.3V

Vin (or VCC) Power Input Short Connect SD Shutdown Channel

GND

GND Ground Short Connect GAIN Gain and Channel (do not connect on BGA packaged microphone)

Audio+ Connect to Speaker Positive (usually red wire, test with a multimeter if necessary)

Audio- Connect to Speaker Negative


3. Wiring between ESP32S3 Development Board and Display

ESP32S3 Development Board

Display (IIC/I2C Interface, optional)

GPIO 41

SDA Data Line

GPIO 42

SCK Clock Line

3V3 3.3V

VCC Power Positive

GND

GND Ground


4. Wiring between ESP32S3 Development Board and Buttons

The following table shows reference connections for added volume control and Boot/wake buttons.
Please note that the same-direction pins of the four-pin switch are interconnected. If using a breadboard, avoid inserting all four pins in the same row.

ESP32S3 Development Board

Button Function

GPIO 39

Connect to “Volume Decrease” button (other end connects to GND), short press reduces volume, long press mutes

GPIO 40

Connect to “Volume Increase” button (other end connects to GND), short press increases volume, long press for maximum volume

GPIO 0

Can connect to “Wake/Interrupt” button (other end connects to GND), pressing can interrupt/recover conversation

Tip:

  • When soldering or inserting buttons, avoid short-circuiting the pins in the same row; otherwise, it will appear as if the button is constantly pressed.

Wiring Steps Diagram for ESP32S3 Development Board and Each Module

First, a complete picture

Img

The first step is to clip the breadboards together, which is simple to do.

The second step shows that the breadboard has 6 protrusions on top.

Start connecting the ESP32 development board. Align the board starting from the left side A1 and press against the bottom hole.

Img

The third step begins the wiring, paying attention to align with the pins. If unsure, refer to the numbers. Wiring for the round INMP441:

Img

Then insert the INMP441 as shown.

The fourth step is the 0.96-inch OLED screen.

Img

The fifth step shows MAX98357 connections:

Img

After connecting the wires, insert the amplifier: the three orange wires should align with the amplifier’s LRC/BCLK/DIN.

The sixth step shows button connections:

Img

Final completion image.

Img

Now you can proceed to the next step, which is network configuration.

Common Wiring Issues FAQ

  1. After flashing the firmware, the RGB light does not turn on

    • Please check whether the solder joints around the RGB light are properly soldered. If there are any unsoldered places, you can first use wires to connect the corresponding pads and check if the light functions normally after restarting.

  2. How to check for circuit faults?

    • When not powered: You can use a multimeter to measure continuity between wires and GND or 3.3V pins, checking for short or open circuits.

    • When powered: Measure the voltage values between GND and other pins to see if they are within normal ranges (e.g., 5V, 3.3V); if abnormal, further check the corresponding module and connections.

  3. Why should the four-pin button be staggered in the breadboard?

    • Among the four-pin buttons, the two same-direction pins are interconnected; if inserted in the same row on the breadboard, it will cause a short circuit, making the button unable to work properly. It is essential to split the four pins into two rows to ensure the circuit closes only when pressed.

  4. Can I²C (SDA/SCL) and I²S (BCLK/LRCLK/DIN, etc.) pins be shared?

    • It is not recommended. The hardware signal formats, timing, and protocols of I²C and I²S are incompatible, requiring the use of separate corresponding GPIO pins for each.

  5. Why do the volume control buttons have no effect or always seem to be muted?

    • Please ensure that they are connected to the correct GPIO (e.g., 39 and 40) and that the “volume up / volume down” button pins are not reversed. If the hardware is correct, double-check the firmware version and sample code configuration for compatibility.

  6. What to do when contact problems frequently occur when using the breadboard?

    • This may be due to aging breadboard sockets or oxidation of component pins. You can try replacing with a new breadboard, cleaning the component pins, or using shorter jumper wires to reduce points of failure.

  7. How to connect the grounds of sensors, power modules, etc.?

    • The ground of external modules should be connected to the GND of the main control board, ensuring they all share the same ground line to avoid noise or signal stability issues.

Tip: If you encounter problems that are difficult to locate, check if the power supply is stable (e.g., 5V or 3.3V) and ensure that the firmware version and sample code correspond to the actual wiring.

Flashing Firmware (Without IDF Development Environment)

This guide is applicable to the ESP32-S3-WROOM-N16R8 version for firmware flashing, using the Flash Download Tool.

One-click download for the flashing tool

Flashing Tool

  • Operating System: As an example, Windows users should use Flash Download Tool 3.9.7 (other newer versions can also work).

  • Obtain the Tool: Download from Espressif’s official website and extract it to any folder without needing installation.

  • Running Method: Enter the extracted directory and double-click flash_download_tool_3.9.7.exe to start.


2. Downloading Firmware

One-click download for flashing firmware

Flashing Firmware

Click to download, then extract.

Copy the .bin File to the Specified Directory

  • It is suggested to place the extracted merged-binary0.96.bin in the bin directory of the Flash Download Tool for easier subsequent operations.

Other Releases can be checked at the bottom of the project page.


3. Flashing Firmware / Downloading to the Development Board

After extracting and entering the flash_download_tool_3.9.7 directory, double-click to run flash_download_tool_3.9.7.exe. The interface appears as follows:

1) Download Settings

  1. Chip Type (ChipType): Select ESP32-S3

  2. Working Mode (WorkMode): Select Develop

  3. Loading Mode (Download Mode): It is recommended to choose UART (if choosing USB, additional settings are required; not covered here).

Connecting and sRGB Explanation:

  • When the development board’s Type-C interface is facing you, the right port is the UART port, and the left port is the USB port; please do not confuse them.

  • If the onboard sRGB light has not been soldered, there may be warning prompts during tool identification (this does not affect flashing), and this can later be resolved by shorting the solder pads (see section 2 at the end of the document).


2) Load Firmware & SPI Download Settings

  1. Input Firmware Path: Click the ... button in the first blank field to select the merged-binary.bin file.

  1. Check Firmware Option: Check the checkbox before the imported .bin file, and enter 0x0 or 0x00 in the address bar to indicate that it will be flashed to the starting address in storage.

  2. COM Port: In the system’s “Device Manager,” expand the serial port section to find the corresponding COM port number, and select the same port in the tool.
    Img

  3. Speed Settings: The default SPI speed is sufficient; you can choose a higher BAUD rate to speed up the flashing process.

  1. Start Flashing: Click START. The progress bar will begin running until a successful FINISH prompt appears. This entire process typically takes a few minutes to over ten minutes, depending on the firmware size and rate settings.


Flashing Complete

After the flashing is complete, press the RST (Restart) button on the development board (shown in the diagram below) to restart the board, which will then enter Wi-Fi configuration mode. The configuration operations are detailed in subsequent instructions.

Img

How to Configure Device Wi-Fi

1. Wi-Fi Network Configuration

1) Start the Device
  • After flashing the firmware, keep the device powered on, and press the RST button (shown in the diagram below) to restart the device, which will enter configuration mode.
    Img

2) Configuration Status

  • sRGB Color Light Blinking Blue: Indicates that the device is in configuration mode.

  • sRGB Color Light Not On: Refer to the second section of this page for details.

  • If the device is not in configuration mode or needs reconfiguration, press and hold the configuration button (connects to GPIO 1), then press the RST button to reset; release RST first, then release the configuration button to re-enter configuration mode.

  • For firmware version ≥0.2.2, if three attempts to connect to the original Wi-Fi fail, the device will automatically return to configuration mode (you can press RST to restart the device when switching networks).

3) Configuration Steps

  1. Connect to “Xiao Zhi” Wi-Fi
    Use your phone or computer to connect to the Wi-Fi emitted by the device (the name usually resembles Xiaozhi-XXXXXX).

Img

  1. Configure Network
    Click on the found Wi-Fi name Xiaozhi-XXXXXX to connect, which will automatically redirect to the configuration page:

Img

  • Select 2.4G Wi-Fi (if using an iPhone hotspot, you need to enable “maximum compatibility”).

  • Enter the password, and click Connect.

  • If connected successfully, the interface will display “Done” and automatically restart after 3 seconds.

If there is no automatic redirection to the configuration page, you can also manually enter http://192.168.4.1 in the browser’s address bar to access the configuration page.


2. About the RGB Color Light on the Device

  1. Connection and Update Status

    • After power on, if the blue light blinks once, the device is connecting to Wi-Fi; if subsequently the green light blinks, it indicates successful connection and the device can be awakened by voice.

    • If the blue light stays on: the device is performing OTA firmware updates, usually completing in under a minute.

    • If the blue light keeps blinking: the device is in configuration mode.

    • During voice awakening, if the blue light turns on, it indicates that it is connecting to the server.

    • A green light indicates that the device is playing audio.

    • A red light indicates that the device is recording audio.

  2. RGB Light Not Turning On

    • If the light switch has not been soldered, it will not affect Wi-Fi configuration but will prevent you from checking the device status.


3. How to Add a Device

  1. Confirm Device Online

    • Once the device connects to the network, it will announce a 6-digit verification code (which can be repeatedly awakened).

  2. Access Control Panel

Once you’ve changed the language, click console to enter the control panel.

Img

  1. Device Management

    • Create an agent, Img

    • Set your agent’s name.
      Img

    • Click “Add New Device”.

Img
Enter the 6-digit Device ID.

Img

Where to obtain the Device ID: After successful firmware upload and network configuration, the device will automatically announce the six-digit code.

  1. Activation Successful

    • The device will automatically activate and display on the “Device Management” page. Click Configure Role to enter the configuration interface.
      Img

    • Configure the assistant’s name and voice. In the Role Introduction section, you can use AI tools to write a description of the character you want.
      Img

    • Configure AI big models, which allow several optional settings. After determining the settings, save them.
      Img

Restart the Xiao Zhi AI Chatbot to start chatting!