ExpressBuddy - AI-Powered Communication Tool

ExpressBuddy is an innovative AI-powered communication tool designed to help children with autism and speech challenges engage in natural conversations through multimodal interactions. Built with Google's Gemini Live API, ExpressBuddy features real-time voice conversations, emotion detection learning games, interactive avatars with lip-sync capabilities, and intelligent silence detection to maintain engagement.

Features Overview

🗣️ Real-Time Voice Chat with AI Avatar (the panda avatar and other avatar assets belong to mascotbot, not to us, so we can't use them)

Live Conversations: Natural voice conversations powered by Google Gemini Live API
Animated Avatar: Rive-based avatar with real-time lip-sync using ParakeetTDTV2-ASR backend
Memory System: Persistent child profiles that remember preferences and progress across sessions
Silence Detection: Intelligent monitoring with gentle nudges to maintain conversation flow

🎭 Emotion Detective Learning Game

4 Question Types: Comprehensive emotion learning with varied interaction modes
Camera-Based Emotion Mirroring: Real-time facial expression analysis using face-api.js
Progress Tracking: Supabase-powered analytics to monitor learning progress
Interactive Learning: Engaging activities that help children recognize and express emotions

🤖 Avatar Integration

Lip-Sync Technology: Real-time viseme generation for natural speech animation
ParakeetTDTV2-ASR Backend: FastAPI-powered audio processing for accurate mouth movements
Multiple Avatar Options: Various character designs to match child preferences
Subtitle Support: Text synchronization for enhanced comprehension

Getting Started

Prerequisites

Node.js (v14 or higher)
npm or yarn
Google Gemini API key
Supabase account (for progress tracking)
ParakeetTDTV2-ASR backend (for lip-sync functionality)

Installation

Clone the repository

git clone https://github.com/your-username/ExpressBuddy.git

cd ExpressBuddy

Install dependencies

npm install

Environment Setup

Copy .env.example to .env and fill in your credentials:

# Google Gemini API
REACT_APP_GEMINI_API_KEY=your_gemini_api_key_here

# Supabase Configuration
REACT_APP_SUPABASE_URL=your_supabase_url
REACT_APP_SUPABASE_ANON_KEY=your_supabase_anon_key

# Kinde Authentication
REACT_APP_KINDE_DOMAIN=your_kinde_domain
REACT_APP_KINDE_CLIENT_ID=your_kinde_client_id
REACT_APP_KINDE_REDIRECT_URI=http://localhost:3000
REACT_APP_KINDE_LOGOUT_URI=http://localhost:3000

# ParakeetTDTV2-ASR Backend
REACT_APP_PARAKEET_ASR_URL=ws://localhost:8000/stream-audio
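
A missing variable in this file tends to surface later as a confusing runtime failure, so it can help to check them once at startup. The following is a minimal sketch, not part of the ExpressBuddy codebase: `missingEnvKeys` and `REQUIRED_ENV_KEYS` are hypothetical names, and treating every listed variable as required is an assumption.

```typescript
// Hypothetical list: variables from .env.example that this sketch treats as required.
const REQUIRED_ENV_KEYS = [
  "REACT_APP_GEMINI_API_KEY",
  "REACT_APP_SUPABASE_URL",
  "REACT_APP_SUPABASE_ANON_KEY",
  "REACT_APP_KINDE_DOMAIN",
  "REACT_APP_KINDE_CLIENT_ID",
  "REACT_APP_PARAKEET_ASR_URL",
] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingEnvKeys(env: Env): string[] {
  return REQUIRED_ENV_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}
```

The environment is passed in as a plain object so the check can be exercised without a browser build.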

Launch ExpressBuddy

npm start

The application will open at http://localhost:3000

Application Routes

Main Interface Routes

/ - Landing page with authentication
/chat - Main conversation interface with avatar
/emotion-detective - Emotion learning game hub
/profile - User profile and progress tracking

Emotion Detective Game Routes

/emotion-detective/question-type-1 - Basic emotion identification
/emotion-detective/question-type-2 - Emotion expression matching
/emotion-detective/question-type-3 - Scenario-based emotion recognition
/emotion-detective/question-type-4 - Advanced emotion mirroring with camera
/emotion-detective/emotion-mirroring - Real-time facial expression practice

Development Routes

/demo-tts - TTS functionality demonstration
/test-tts - TTS integration testing
/test-visemes - Viseme generation testing

Core Features Deep Dive

Memory System

ExpressBuddy implements a sophisticated memory system that creates persistent child profiles:

// Memory system automatically stores:
// - Child's name and preferences
// - Conversation history highlights
// - Emotion learning progress
// - Favorite topics and interests

// Usage in components:
const { memory, updateMemory } = useMemory();
const childProfile = memory.getChildProfile();

Key Features:

Persistent storage using localStorage
Automatic context building for conversations
Progress tracking across sessions
Personalized interaction patterns
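
The persistence and context-building ideas listed above could be sketched roughly as follows. This is an illustration under stated assumptions rather than the project's actual memory code: the `ChildProfile` shape, the `STORAGE_KEY` value, and the helper names are all hypothetical. The store is abstracted behind a small interface that `window.localStorage` satisfies in the browser.

```typescript
// Minimal key-value interface; window.localStorage satisfies it in the browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical profile shape, loosely based on the fields described above.
interface ChildProfile {
  name: string;
  favoriteTopics: string[];
  emotionsPracticed: string[];
}

const STORAGE_KEY = "expressbuddy.childProfile"; // assumed key name

function loadProfile(store: KeyValueStore): ChildProfile | null {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as ChildProfile;
  } catch {
    return null; // corrupted entry: treat as if no profile were stored
  }
}

function saveProfile(store: KeyValueStore, profile: ChildProfile): void {
  store.setItem(STORAGE_KEY, JSON.stringify(profile));
}

// Turns a stored profile into a short context string for the conversation prompt.
function buildContext(profile: ChildProfile): string {
  const topics = profile.favoriteTopics.join(", ") || "unknown";
  return `The child's name is ${profile.name}. Favorite topics: ${topics}.`;
}

// In-memory stand-in for localStorage, useful in tests or outside the browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();

  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }

  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}
```

Keeping the store behind an interface means the same functions work against `localStorage` in the app and against `MemoryStore` in unit tests.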

Silence Detection System

The silence detection system monitors conversation flow and provides gentle engagement nudges:

// Configure silence detection
const silenceConfig = {
  enabled: true,
  silenceThreshold: 3000, // 3 seconds
  nudgeMessages: [
    "I'm here when you're ready to chat!",
    "Take your time, I'm listening.",
    "What's on your mind?"
  ]
};

Features:

Configurable silence thresholds
Multiple nudge message types
Analytics for engagement patterns
Visual indicators for conversation state
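
One way the threshold and nudge messages might interact is sketched below. `SilenceTracker` is a hypothetical class, not the project's hook: it assumes the caller reports voice activity through `noteSpeech` and polls with the current time in milliseconds, and it rotates through the configured messages, which is an assumed policy.

```typescript
interface SilenceConfig {
  enabled: boolean;
  silenceThreshold: number; // milliseconds of silence before a nudge
  nudgeMessages: string[];
}

// Hypothetical tracker: call noteSpeech() when voice activity is seen and
// poll(now) periodically; poll returns a nudge once the threshold has passed.
class SilenceTracker {
  private config: SilenceConfig;
  private lastSpeechAt: number;
  private nudgeCount = 0;

  constructor(config: SilenceConfig, startTime: number) {
    this.config = config;
    this.lastSpeechAt = startTime;
  }

  noteSpeech(now: number): void {
    this.lastSpeechAt = now;
  }

  poll(now: number): string | null {
    const { enabled, silenceThreshold, nudgeMessages } = this.config;
    if (!enabled || nudgeMessages.length === 0) return null;
    if (now - this.lastSpeechAt < silenceThreshold) return null;

    // Rotate through the messages and restart the silence window.
    const message = nudgeMessages[this.nudgeCount % nudgeMessages.length] ?? null;
    this.nudgeCount += 1;
    this.lastSpeechAt = now;
    return message;
  }
}
```

Passing the time in explicitly, instead of reading a clock inside the class, keeps the logic deterministic and easy to test.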

Emotion Detective Learning Game

A comprehensive emotion learning system with four distinct question types:

Question Type 1: Basic Emotion Identification

Visual emotion cards with multiple choice
Audio pronunciation of emotion names
Progress tracking and scoring

Question Type 2: Expression Matching

Match facial expressions to emotion words
Interactive drag-and-drop interface
Immediate feedback and explanations

Question Type 3: Scenario-Based Recognition

Real-world situation analysis
Context-based emotion understanding
Story-driven learning experiences

Question Type 4: Camera-Based Emotion Mirroring

Real-time facial expression detection using face-api.js
Live feedback on emotion expression accuracy
Practice mode for emotion expression

// Emotion detection integration
import { EmotionDetector } from '../utils/emotionDetection';

const detector = new EmotionDetector();

detector.startDetection(videoElement, {
  onEmotionDetected: (emotions) => {
    // Handle detected emotions
    updateEmotionFeedback(emotions);
  }
});
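
For the mirroring exercise, the detected scores have to be turned into a pass or fail signal for the child. The sketch below shows one plausible rule, assuming the callback receives per-expression probabilities keyed by the seven labels of face-api.js's expression model. `scoreMirrorAttempt` and the 0.6 threshold are hypothetical and not taken from the project.

```typescript
// Expression labels reported by face-api.js's expression model.
type Expression =
  | "neutral" | "happy" | "sad" | "angry"
  | "fearful" | "disgusted" | "surprised";

type ExpressionScores = Record<Expression, number>;

interface MirrorResult {
  detected: Expression; // top-scoring expression
  confidence: number;   // its probability
  matched: boolean;     // whether the attempt counts as a match
}

// Hypothetical scoring rule: the attempt matches when the target emotion is
// the top-scoring expression and clears a minimum confidence.
function scoreMirrorAttempt(
  scores: ExpressionScores,
  target: Expression,
  minConfidence = 0.6, // assumed threshold
): MirrorResult {
  let detected: Expression = "neutral";
  let confidence = -1;
  for (const label of Object.keys(scores) as Expression[]) {
    if (scores[label] > confidence) {
      detected = label;
      confidence = scores[label];
    }
  }
  return {
    detected,
    confidence,
    matched: detected === target && confidence >= minConfidence,
  };
}
```

A lower threshold is more forgiving of subtle expressions, while a higher one reduces false positives; the right value would need tuning with real sessions.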

Avatar Integration with Lip-Sync

ExpressBuddy features advanced avatar integration with real-time lip-sync capabilities:

ParakeetTDTV2-ASR Backend Integration

The lip-sync system uses a FastAPI backend for audio-to-viseme conversion:

// Viseme service integration
class VisemeTranscriptionService {
  private websocket: WebSocket;

  async sendAudioChunk(audioData: Uint8Array): Promise<void> {
    // Send audio to ParakeetTDTV2-ASR backend
    this.websocket.send(audioData);
  }

  onVisemeReceived(callback: (visemes: VisemeData[]) => void): void {
    // Handle real-time viseme data
    this.websocket.onmessage = (event) => {
      const visemes = JSON.parse(event.data);
      callback(visemes);
    };
  }
}
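
Once viseme messages arrive, the avatar needs to know which mouth shape applies at the current audio position. The sketch below illustrates that lookup under a loud assumption: the real `VisemeData` type is not shown in this README, so the `VisemeEvent` shape (`visemeId`, `offsetMs`) and both helper functions are hypothetical.

```typescript
// Assumed message shape; the project's real VisemeData type may differ.
interface VisemeEvent {
  visemeId: number; // index of the mouth shape
  offsetMs: number; // start time relative to the start of the audio
}

function isVisemeEvent(item: unknown): item is VisemeEvent {
  if (typeof item !== "object" || item === null) return false;
  const candidate = item as Record<string, unknown>;
  return (
    typeof candidate["visemeId"] === "number" &&
    typeof candidate["offsetMs"] === "number"
  );
}

// Parses a backend message into a timeline sorted by offset.
// Malformed payloads and malformed entries are dropped.
function parseVisemes(payload: string): VisemeEvent[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(payload);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(isVisemeEvent).sort((a, b) => a.offsetMs - b.offsetMs);
}

// Returns the viseme that should be showing at the given playback position,
// or null before the first event starts.
function activeViseme(
  timeline: VisemeEvent[],
  playbackMs: number,
): VisemeEvent | null {
  let current: VisemeEvent | null = null;
  for (const event of timeline) {
    if (event.offsetMs > playbackMs) break;
    current = event;
  }
  return current;
}
```

Validating each entry before use guards the animation loop against a partially malformed message from the backend.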

Rive Avatar System

The avatar system uses Rive animations with real-time viseme control:

// Avatar component with lip-sync
const RealtimeExpressBuddyAvatar = () => {
  const { currentVisemes, currentSubtitles } = useLiveAPIContext();

  useEffect(() => {
    // Update avatar mouth movements based on visemes
    if (riveInstance && currentVisemes.length > 0) {
      visemeController.playVisemes(currentVisemes);
    }
  }, [currentVisemes]);

  return (
    <RiveCanvas
      src="/avatars/realistic_female_v1_3.riv"
      onLoad={handleRiveLoad}
    />
  );
};


Google Gemini Live API Integration

ExpressBuddy leverages Google's Gemini Live API for natural conversation:

// Live API configuration for ExpressBuddy
const liveAPIConfig = {
  model: "models/gemini-2.0-flash-exp",
  systemInstruction: {
    parts: [{
      text: `You are ExpressBuddy, a helpful AI companion designed to support children with autism and speech challenges. Be patient, encouraging, and adapt your communication style to each child's needs.`
    }]
  },
  generationConfig: {
    // Live API sessions accept a single response modality
    responseModalities: ["AUDIO"],
    speechConfig: {
      voiceConfig: { prebuiltVoiceConfig: { voiceName: "Aoede" } }
    }
  }
};

Architecture & Development

ExpressBuddy is built with a modern React/TypeScript stack, integrating multiple AI and multimedia technologies:

Tech Stack

Frontend:

React 18 with TypeScript
Tailwind CSS for styling
shadcn/ui component library
Create React App with CRACO for build tooling

AI & ML Services:

Google Gemini Live API (multimodal conversations)
face-api.js (emotion detection)
ParakeetTDTV2-ASR (audio-to-viseme conversion)

Backend Services:

Supabase (database, authentication, real-time subscriptions)
Kinde (authentication provider)
FastAPI backend for lip-sync processing

Animation & Media:

Rive (avatar animations)
WebRTC (camera/microphone access)
Web Audio API (audio processing)

Project Structure

src/
├── components/
│   ├── avatar/                 # Avatar and lip-sync components
│   ├── emotion-detective/      # Learning game components
│   ├── ui/                     # shadcn/ui components
│   └── layout/                 # Layout and navigation
├── contexts/                   # React contexts (LiveAPI, Memory)
├── hooks/                      # Custom hooks (useLiveAPI, useSilenceDetection)
├── lib/                        # Utility libraries
├── services/                   # API services and data access
├── types/                      # TypeScript type definitions
└── utils/                      # Helper functions

Key Components

LiveAPIProvider Context

Central state management for Gemini Live API integration:

// Provides real-time conversation state
const LiveAPIProvider = ({ children }) => {
  const liveAPI = useLiveAPI(options);
  return (
    <LiveAPIContext.Provider value={liveAPI}>
      {children}
    </LiveAPIContext.Provider>
  );
};

Memory System Hooks

Persistent storage and child profile management:

const useMemory = () => {
  const [memory, setMemory] = useState(loadFromStorage());

  const updateMemory = (updates) => {
    const newMemory = { ...memory, ...updates };
    setMemory(newMemory);
    saveToStorage(newMemory);
  };

  return { memory, updateMemory };
};

Silence Detection Hook

Engagement monitoring and conversation flow management:

const useSilenceDetection = (config) => {
  const [state, setState] = useState('listening');
  const [analytics, setAnalytics] = useState({});

  // Monitors volume and triggers nudges when appropriate
  return { state, config, updateConfig, triggerNudge, analytics };
};

Available Scripts

In the project directory, you can run:

`npm start`

Runs the app in the development mode.
Open http://localhost:3000 to view it in the browser.

The page will reload if you make edits.
You will also see any lint errors in the console.

`npm run build`

Builds the app for production to the build folder.
It correctly bundles React in production mode and optimizes the build for the best performance.

The build is minified and the filenames include the hashes.
Your app is ready to be deployed!

See the section about deployment for more information.

This is an experiment showcasing the Live API, not an official Google product. We’ll do our best to support and maintain this experiment but your mileage may vary. We encourage open sourcing projects as a way of learning from each other. Please respect our and other creators' rights, including copyright and trademark rights when present, when sharing these works and creating derivative work. If you want more info on Google's policy, you can find that here.


