Learn how to build a natural, real-time voice experience that is ready for production use.
Key concepts demonstrated:
  • Speech-to-text (STT) - for understanding speech inputs
  • LLM - for generating the agent text response
  • Text-to-speech (TTS) - for generating agent speech audio
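To make the flow between these three stages concrete, here is a minimal sketch of a single conversational turn. The type names and `runVoiceTurn` are hypothetical and are not the Inworld Runtime API; the stub stages stand in for real services so the sketch runs on its own.

```typescript
// Hypothetical stage signatures for illustration; NOT the Inworld Runtime API.
type SpeechToText = (audio: Uint8Array) => Promise<string>;
type LanguageModel = (systemPrompt: string, userText: string) => Promise<string>;
type TextToSpeech = (text: string) => Promise<Uint8Array>;

interface VoiceTurn {
  transcript: string; // what the user said (STT output)
  reply: string;      // the agent's text response (LLM output)
  audio: Uint8Array;  // the agent's spoken response (TTS output)
}

// Run one turn: audio in -> transcript -> text reply -> audio out.
async function runVoiceTurn(
  audioIn: Uint8Array,
  systemPrompt: string,
  stt: SpeechToText,
  llm: LanguageModel,
  tts: TextToSpeech,
): Promise<VoiceTurn> {
  const transcript = await stt(audioIn);
  const reply = await llm(systemPrompt, transcript);
  const audio = await tts(reply);
  return { transcript, reply, audio };
}

// Stub stages so the sketch runs without any external service.
const fakeStt: SpeechToText = async () => "hello";
const fakeLlm: LanguageModel = async (_prompt, text) => `You said: ${text}`;
// Placeholder audio: one zero byte per character of the reply.
const fakeTts: TextToSpeech = async (text) => new Uint8Array(text.length);
```

A production pipeline would more likely stream partial results between stages rather than await each one in full; the sequential version above only shows the order of the data flow.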
Architecture
  • Backend: Inworld Runtime + Express.js
  • Frontend: Vite + React
  • Communication: WebSocket

Run the Template

Start the Server

  1. Clone the Voice Agent GitHub repo:
    git clone https://github.com/inworld-ai/voice-agent-node
    cd voice-agent-node
    
  2. Navigate to the server directory:
    cd server
    
  3. Copy the .env-sample file to .env:
    cp .env-sample .env
    
  4. Configure your .env file with required API keys:
    .env
    # Required, Inworld Runtime Base64 API key
    INWORLD_API_KEY=<your_api_key_here>
    
    # Required, get your Assembly.AI API key from https://www.assemblyai.com/
    ASSEMBLY_AI_API_KEY=<your_assemblyai_api_key_here>
    
    The AssemblyAI API key is used for speech-to-text; get one at https://www.assemblyai.com/.
  5. Install dependencies:
    npm install
    
  6. Start the server:
    npm start
    
    The server will start on port 4000.
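If the server fails on startup, a missing or placeholder API key is a likely cause. The sketch below shows one way to check the contents of a .env file for the two required keys named above. `parseEnv` and `missingKeys` are hypothetical helpers written for this example and are not part of the template.

```typescript
// Parse KEY=VALUE lines, skipping blank lines and # comments.
function parseEnv(text: string): Map<string, string> {
  const env = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    env.set(line.slice(0, eq).trim(), line.slice(eq + 1).trim());
  }
  return env;
}

// The two keys the server's .env marks as required.
const REQUIRED_KEYS = ["INWORLD_API_KEY", "ASSEMBLY_AI_API_KEY"];

// Return the required keys that are absent, empty, or still a <placeholder>.
function missingKeys(
  env: Map<string, string>,
  required: string[] = REQUIRED_KEYS,
): string[] {
  return required.filter((key) => {
    const value = env.get(key) ?? "";
    return value === "" || /^<.*>$/.test(value);
  });
}
```

Passing the file contents as a string keeps the check independent of how the file is read, so it can be reused in a startup script or a test.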

Start the Client

  1. Open a new terminal window.
  2. Navigate to the client directory (from the repository root):
    cd client
    
  3. (Optional) Create a .env file to customize client behavior:
    .env
    # Optional: Enable latency reporting in the UI
    VITE_ENABLE_LATENCY_REPORTING=true
    
    # Optional: Server port (default: 4000)
    VITE_APP_PORT=4000
    
  4. Install dependencies:
    npm install
    
  5. Start the client:
    npm start
    
    The client will start on port 3000 (or the next available port if 3000 is in use) and should automatically open in your default browser.
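The two optional client variables can be read roughly as follows. This is a sketch under assumptions: `clientConfig` is a hypothetical helper, and the `ws://localhost` host with no path is a guess for a local setup, so the template's actual connection URL may differ. Only the variable names and the default port of 4000 come from the steps above.

```typescript
interface ClientConfig {
  serverUrl: string;
  latencyReporting: boolean;
}

// Build client settings from Vite-style env values (strings or undefined).
function clientConfig(env: Record<string, string | undefined>): ClientConfig {
  // Fall back to the documented default port when the variable is unset.
  const port = Number(env["VITE_APP_PORT"] ?? "4000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid VITE_APP_PORT: ${env["VITE_APP_PORT"]}`);
  }
  return {
    // Assumption: server runs on localhost over plain ws:// with no path.
    serverUrl: `ws://localhost:${port}`,
    // Env values are strings, so compare against "true" explicitly.
    latencyReporting: env["VITE_ENABLE_LATENCY_REPORTING"] === "true",
  };
}
```

In a Vite app the argument would typically be `import.meta.env`; taking it as a parameter keeps the function easy to test.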

Chat with Your Agent

  1. Configure the agent:
    • Enter the agent system prompt
    • Click “Create Agent”
  2. Start chatting:
    • Voice input: Click the microphone icon to unmute yourself, speak, then click again to mute
    • Text input: Type in the input field and press Enter to send
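  3. Monitor performance:
    • With VITE_ENABLE_LATENCY_REPORTING=true set in the client .env file, latency reporting is shown in the UI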
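As an illustration of what per-turn latency can mean for an STT, LLM, and TTS pipeline, the sketch below splits one turn into stage durations. The timestamp fields and `measureTurn` are hypothetical; the metrics the template's UI actually reports may be defined differently.

```typescript
// Hypothetical timestamps (in milliseconds) captured during one turn.
interface TurnTimestamps {
  userStoppedSpeaking: number;
  transcriptReady: number;
  replyTextReady: number;
  firstAudioReady: number;
}

interface TurnLatency {
  sttMs: number;   // end of speech -> transcript available
  llmMs: number;   // transcript -> reply text available
  ttsMs: number;   // reply text -> first audio available
  totalMs: number; // end of speech -> first audio available
}

// Derive per-stage and end-to-end latency from the captured timestamps.
function measureTurn(t: TurnTimestamps): TurnLatency {
  return {
    sttMs: t.transcriptReady - t.userStoppedSpeaking,
    llmMs: t.replyTextReady - t.transcriptReady,
    ttsMs: t.firstAudioReady - t.replyTextReady,
    totalMs: t.firstAudioReady - t.userStoppedSpeaking,
  };
}
```

Breaking the total into stages makes it easier to see which component dominates the delay a user perceives between finishing a sentence and hearing the reply.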

Next steps