- Add webrtcvad dependency for real-time voice activity detection
- Create audio/fade.py with fade-in/fade-out utility
- Add VAD voice activation to client recording (sends audio only during speech)
- Apply 200ms fade-out to TTS output to avoid abrupt audio cuts
- Fix tts.py indentation error in except block
# WebSocket server
fastapi
uvicorn[standard]
websockets
webrtcvad

# Speech-to-Text
faster-whisper
soundfile

# LLM
transformers
torch
accelerate
bitsandbytes

# TTS
torchaudio

# Audio processing
numpy
scipy

# Utilities
python-dotenv
pydantic
pydantic-settings