ElevenLabs UI Voice Nav: Building the Future of Voice-First Web Navigation
Executive Summary
The web is undergoing a fundamental shift in how users interact with digital interfaces. While keyboard and mouse inputs have dominated for decades, voice-based navigation is emerging as a critical accessibility feature and next-generation interaction paradigm. ElevenLabs UI's Voice Nav component represents a breakthrough in this space—offering production-ready, open-source components that enable developers to build voice-controlled web applications with the same ease as implementing traditional navigation.
Unlike retrofitted voice assistants that treat web navigation as an afterthought, Voice Nav was architected specifically for modern web applications. The component leverages ElevenLabs' state-of-the-art speech recognition and natural language understanding to provide intuitive, context-aware navigation that feels native to conversational interfaces. Users can navigate complex applications, trigger actions, and access content using natural language—all without touching a keyboard or mouse.
Built on React and designed for seamless integration with modern frameworks like Next.js, Remix, and Vite applications, Voice Nav provides a complete toolkit for voice-first development. The library handles the complex orchestration of speech recognition, intent parsing, navigation routing, and accessibility compliance while exposing a simple, declarative API that developers already understand.
Why Voice Nav Matters for Modern Web Development
Traditional web navigation assumes users have full motor control and visual attention. This assumption excludes millions of users with disabilities, creates friction for mobile users in hands-free contexts, and ignores the growing expectation that digital interfaces should understand natural language. Voice Nav eliminates these barriers by providing:
- Universal Accessibility: Enable users with motor impairments, visual disabilities, or situational limitations to navigate your application effortlessly
- Hands-Free Productivity: Support use cases where users need to interact with applications while their hands are occupied (cooking, driving, manufacturing)
- Natural Language Interfaces: Meet users' expectations for conversational AI by allowing navigation through natural speech rather than rigid command patterns
- Future-Proof Architecture: Prepare applications for a voice-first future where traditional input methods are supplemented or replaced by conversational interfaces
The component's integration with ElevenLabs' conversational AI platform means developers gain access to enterprise-grade speech recognition with multi-language support, advanced noise cancellation, and real-time processing—capabilities that would typically require months of ML engineering to build in-house.
Technical Deep Dive
Architecture Overview
Voice Nav's architecture consists of four primary layers that work together to transform spoken language into application navigation:
1. Speech Recognition Layer
At the foundation sits ElevenLabs' WebRTC-based real-time speech recognition engine. Unlike traditional HTTP-based speech APIs that require audio buffering, Voice Nav establishes a persistent WebSocket connection that streams audio chunks as users speak:
```typescript
// Internal speech recognition flow
interface SpeechRecognitionEngine {
  // Establishes WebRTC connection to ElevenLabs
  connect(): Promise<void>;
  // Streams audio from microphone in 100ms chunks
  streamAudio(stream: MediaStream): void;
  // Returns partial transcriptions as user speaks
  onPartialTranscript(callback: (text: string) => void): void;
  // Returns final transcription when user pauses
  onFinalTranscript(callback: (text: string) => void): void;
  // Handles connection errors and retries
  onError(callback: (error: Error) => void): void;
}
```
This streaming architecture enables three critical capabilities (a minimal wiring sketch follows the list):
- Low Latency: Transcription results arrive within 200-300ms of speech, creating the perception of instantaneous understanding
- Partial Results: Users see real-time feedback as they speak, similar to Google's live transcription
- Interruption Handling: The system gracefully handles when users change their mind mid-sentence or provide corrections
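To make the flow concrete, here is a minimal wiring sketch against the `SpeechRecognitionEngine` interface above. The `showLiveTranscript` and `handleCommand` helpers are illustrative placeholders, not part of the library's API.

```typescript
// Illustrative placeholders for UI feedback and downstream intent handling
const showLiveTranscript = (text: string) => console.log('partial:', text);
const handleCommand = (text: string) => console.log('final:', text);

// Hypothetical wiring of the streaming engine: connect, stream, react to results
async function startVoiceSession(engine: SpeechRecognitionEngine) {
  await engine.connect();

  // Capture the microphone and stream audio in chunks
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  engine.streamAudio(stream);

  // Show live feedback while the user is still speaking
  engine.onPartialTranscript((text) => showLiveTranscript(text));

  // Hand the final utterance to intent classification once the user pauses
  engine.onFinalTranscript((text) => handleCommand(text));

  // Surface connection problems instead of failing silently
  engine.onError((error) => console.error('Speech recognition error:', error));
}
```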
2. Natural Language Understanding (NLU) Layer
Raw transcriptions are passed through an intent classification pipeline that maps natural language to navigation actions. Voice Nav uses a hybrid approach combining pattern matching for common phrases with LLM-based understanding for complex queries:
```typescript
// Intent classification system
interface IntentClassifier {
  // Fast pattern matching for common navigation phrases
  patterns: NavigationPattern[];
  // LLM fallback for ambiguous or complex queries
  llmClassifier: LLMIntentClassifier;
  // Extracts navigation intent from spoken text
  classify(transcript: string): Promise<NavigationIntent>;
  // Learns from user corrections to improve accuracy
  learnFromFeedback(intent: NavigationIntent, wasCorrect: boolean): void;
}

// Example navigation patterns
const defaultPatterns = [
  {
    pattern: /go to (the )?home ?page/i,
    intent: { type: 'navigate', target: '/' },
  },
  {
    pattern: /show (me )?(the )?settings/i,
    intent: { type: 'navigate', target: '/settings' },
  },
  {
    pattern: /(open|view) (.+)/i,
    intent: { type: 'navigate', target: (match) => `/${match[2]}` },
  },
];
```
This hybrid approach provides the best of both worlds: instant response for common phrases (e.g., "go home") and intelligent understanding of complex requests (e.g., "show me the invoice I created yesterday for the Johnson account").
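One way to picture the cascade is the sketch below: try the fast regex patterns first, and only hand the transcript to the LLM classifier when nothing matches. The `classifyTranscript` helper and the assumption that `LLMIntentClassifier` exposes a `classify` method are illustrative, not documented API.

```typescript
// Hypothetical cascade: regex patterns first, LLM fallback second.
async function classifyTranscript(
  transcript: string,
  patterns: NavigationPattern[],
  llmClassifier: LLMIntentClassifier,
): Promise<NavigationIntent> {
  // 1. Fast path: check the registered regex patterns
  for (const { pattern, intent } of patterns) {
    const match = transcript.match(pattern);
    if (match) {
      // Resolve function-valued targets (e.g. "open invoices" -> "/invoices")
      const target =
        typeof intent.target === 'function' ? intent.target(match) : intent.target;
      return { ...intent, target };
    }
  }

  // 2. Slow path: let the LLM interpret ambiguous or complex requests
  return llmClassifier.classify(transcript);
}
```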
3. Navigation Router Integration
Voice Nav integrates seamlessly with modern React routing libraries through a provider-agnostic adapter system:
```typescript
// Routing adapter interface
interface NavigationAdapter {
  // Navigate to a specific route
  navigate(path: string, options?: NavigationOptions): void;
  // Get current route information
  getCurrentRoute(): RouteInfo;
  // Register voice-accessible routes
  registerRoute(config: VoiceRouteConfig): void;
  // Execute programmatic actions (beyond navigation)
  executeAction(action: VoiceAction): Promise<ActionResult>;
}

// Built-in adapters for popular routers
export const reactRouterAdapter: NavigationAdapter;
export const nextRouterAdapter: NavigationAdapter;
export const remixRouterAdapter: NavigationAdapter;
export const tanstackRouterAdapter: NavigationAdapter;
```
This adapter pattern ensures Voice Nav works with any routing solution while providing optimized implementations for popular frameworks.
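Because the adapter is just an object satisfying `NavigationAdapter`, a team on an unsupported router can supply its own. Below is a rough sketch of an adapter backed by the browser History API; the shapes of `NavigationOptions`, `RouteInfo`, `VoiceRouteConfig`, and `VoiceAction` are assumed from the interface above rather than taken from the library.

```typescript
// Illustrative registry used by registerRoute below
const voiceRoutes = new Map<string, VoiceRouteConfig>();

// Sketch of a custom adapter built on the History API (type shapes assumed)
const historyAdapter: NavigationAdapter = {
  navigate(path, options) {
    if (options?.replace) {
      window.history.replaceState({}, '', path);
    } else {
      window.history.pushState({}, '', path);
    }
    // Let the app's router (or a popstate listener) react to the URL change
    window.dispatchEvent(new PopStateEvent('popstate'));
  },

  getCurrentRoute() {
    return { path: window.location.pathname };
  },

  registerRoute(config) {
    // A real implementation might index voice-accessible routes here
    voiceRoutes.set(config.path, config);
  },

  async executeAction(action) {
    // Non-navigation actions (e.g. "open the cart drawer") are dispatched here
    return action.execute();
  },
};
```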
4. Accessibility and Feedback Layer
The final layer ensures Voice Nav meets WCAG 2.2 AAA standards and provides multimodal feedback:
```typescript
// Accessibility compliance features
interface AccessibilityLayer {
  // ARIA live region announcements for screen readers
  announceIntent(intent: NavigationIntent): void;
  // Visual feedback for users who can't hear audio confirmations
  showVisualConfirmation(intent: NavigationIntent): void;
  // Keyboard shortcuts as fallback for voice commands
  enableKeyboardFallback(): void;
  // Focus management after voice navigation
  manageFocus(target: HTMLElement): void;
}
```
This layer ensures Voice Nav enhances rather than replaces existing accessibility features, providing progressive enhancement for all users.
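Application code can follow the same pattern. The sketch below shows one plausible post-navigation handler that announces the route change and moves focus to the main landmark; the `onNavigate` callback is documented on the provider, while the element IDs, selector, and wording are assumptions.

```typescript
// Hypothetical post-navigation handler: announce the change and restore focus.
// Assumes the layout contains <div id="voice-nav-announcer" aria-live="polite">.
function handleVoiceNavigate(route: string) {
  // Announce the navigation for screen reader users via an ARIA live region
  const live = document.getElementById('voice-nav-announcer');
  if (live) {
    live.textContent = `Navigated to ${route}`;
  }

  // Move focus to the main landmark so keyboard users land somewhere sensible
  const main = document.querySelector<HTMLElement>('main');
  main?.setAttribute('tabindex', '-1');
  main?.focus();
}
```

A handler like this can be passed to the provider's `onNavigate` prop so every voice-driven navigation is announced and refocused consistently.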
Core Component API
Voice Nav exposes three primary React components that handle different aspects of voice navigation:
#### VoiceNavProvider
The provider component establishes the voice recognition session and provides context to child components:
```typescript
import { VoiceNavProvider } from '@elevenlabs/ui';

interface VoiceNavProviderProps {
  // ElevenLabs API key for speech recognition
  apiKey: string;
  // Agent ID for conversational AI (optional)
  agentId?: string;
  // Navigation adapter for your router
  adapter: NavigationAdapter;
  // Custom intent patterns for your application
  patterns?: NavigationPattern[];
  // Language for speech recognition
  language?: string;
  // Callback when voice command is recognized
  onIntent?: (intent: NavigationIntent) => void;
  // Callback when navigation completes
  onNavigate?: (route: string) => void;
  // Error handling callback
  onError?: (error: VoiceNavError) => void;
  // Enable debug mode for development
  debug?: boolean;
}

export function VoiceNavProvider(props: VoiceNavProviderProps): JSX.Element;
```
Example usage with Next.js App Router:
```typescript
'use client'; // useRouter from next/navigation requires a Client Component

import { VoiceNavProvider, nextRouterAdapter } from '@elevenlabs/ui';
import { useRouter } from 'next/navigation';

export default function RootLayout({ children }: { children: React.ReactNode }) {
  const router = useRouter();

  return (
    <html lang="en">
      <body>
        <VoiceNavProvider
          apiKey={process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY!}
          adapter={nextRouterAdapter}
        >
          {children}
        </VoiceNavProvider>
      </body>
    </html>
  );
}
```
#### VoiceNavButton
A pre-built button component that handles microphone permissions, recording state, and user feedback:
```typescript
import { VoiceNavButton } from '@elevenlabs/ui';

interface VoiceNavButtonProps {
  // Button label (default: "Voice Navigation")
  label?: string;
  // Icon component for inactive state
  icon?: React.ComponentType;
  // Icon component for active/listening state
  listeningIcon?: React.ComponentType;
  // Custom styling classes
  className?: string;
  // Show live transcription as user speaks
  showTranscript?: boolean;
  // Visual waveform animation during recording
  showWaveform?: boolean;
  // Size variant
  size?: 'sm' | 'md' | 'lg';
  // Style variant
  variant?: 'default' | 'outline' | 'ghost';
}

export function VoiceNavButton(props: VoiceNavButtonProps): JSX.Element;
```
Example usage with custom styling:
```typescript
import { VoiceNavButton } from '@elevenlabs/ui';
import { Mic, MicOff } from 'lucide-react';

export function NavigationBar() {
  return (
    <nav className="flex items-center gap-4">
      <VoiceNavButton
        label="Speak to navigate"
        icon={Mic}
        listeningIcon={MicOff}
        showTranscript
        showWaveform
        size="md"
        variant="outline"
        className="ml-auto"
      />
    </nav>
  );
}
```
#### useVoiceNav Hook
For advanced use cases requiring programmatic control:
```typescript
import { useVoiceNav } from '@elevenlabs/ui';

interface UseVoiceNavReturn {
  // Current state of voice recognition
  state: 'idle' | 'connecting' | 'listening' | 'processing' | 'error';
  // Current transcript (partial during listening, final when done)
  transcript: string;
  // Detected intent (null while processing)
  intent: NavigationIntent | null;
  // Start listening for voice commands
  startListening(): Promise<void>;
  // Stop listening and process current transcript
  stopListening(): void;
  // Cancel current session without processing
  cancel(): void;
  // Manually register a custom intent pattern
  registerPattern(pattern: NavigationPattern): void;
  // Execute a voice action programmatically
  executeIntent(intent: NavigationIntent): Promise<void>;
  // Check if microphone permission is granted
  hasPermission: boolean;
  // Request microphone permission
  requestPermission(): Promise<boolean>;
}

export function useVoiceNav(): UseVoiceNavReturn;
```
Example of custom voice command implementation:
```typescript
import { useVoiceNav } from '@elevenlabs/ui';
import { useEffect, useState } from 'react';

export function CustomVoiceSearch() {
  const {
    state,
    transcript,
    intent,
    startListening,
    stopListening,
    hasPermission,
    requestPermission,
  } = useVoiceNav();

  const [searchResults, setSearchResults] = useState<Array<{ id: string; title: string }>>([]);

  const handleVoiceSearch = async () => {
    if (!hasPermission) {
      await requestPermission();
    }
    await startListening();
    // Voice Nav will call stopListening automatically on pause,
    // or you can stop manually
  };

  // React to intent changes
  useEffect(() => {
    if (intent?.type === 'search') {
      performSearch(intent.query).then(setSearchResults); // app-specific search helper
    }
  }, [intent]);

  return (
    <div>
      <button onClick={handleVoiceSearch} disabled={state === 'listening'}>
        {state === 'listening' ? 'Listening…' : 'Voice search'}
      </button>
      {state === 'listening' && <p>{transcript}</p>}
      {searchResults.length > 0 && (
        <ul>
          {searchResults.map((result) => (
            <li key={result.id}>{result.title}</li>
          ))}
        </ul>
      )}
    </div>
  );
}
```
Advanced Features
#### Context-Aware Navigation
Voice Nav can maintain conversational context across multiple commands, enabling natural follow-up queries:
```typescript
import { VoiceNavProvider } from '@elevenlabs/ui';

export function App() {
  return (
    <VoiceNavProvider
      apiKey={process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY!}
      // adapter configured as in the setup examples
      patterns={[
        {
          pattern: /show (my )?invoices/i,
          intent: { type: 'navigate', target: '/invoices' },
        },
        {
          pattern: /go back|previous/i,
          intent: { type: 'navigate', target: 'HISTORY_BACK' },
        },
        {
          pattern: /edit (it|that|this)/i,
          intent: {
            type: 'contextual',
            action: (context) => {
              // Access last viewed item from context
              const lastItem = context.history[0];
              return { type: 'navigate', target: `${lastItem.route}/edit` };
            },
          },
        },
      ]}
    >
      {/* app content */}
    </VoiceNavProvider>
  );
}
```
Example conversation flow:
1. User: "Show my invoices"
2. App navigates to `/invoices`
3. User: "Edit this" (Voice Nav understands that "this" refers to `/invoices`)
4. App navigates to `/invoices/edit`
#### Multi-Language Support
Voice Nav supports 29+ languages through ElevenLabs' multilingual speech recognition:
```typescript
import { VoiceNavProvider } from '@elevenlabs/ui';
import { useState } from 'react';

export function MultilingualApp() {
  const [language, setLanguage] = useState('en-US');

  return (
    <VoiceNavProvider
      apiKey={process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY!}
      // adapter configured as in the setup examples
      language={language}
    >
      <select value={language} onChange={(e) => setLanguage(e.target.value)}>
        <option value="en-US">English</option>
        <option value="es-ES">Español</option>
        <option value="de-DE">Deutsch</option>
        <option value="fr-FR">Français</option>
      </select>
      {/* app content */}
    </VoiceNavProvider>
  );
}
```
#### Custom Actions Beyond Navigation
Voice Nav supports arbitrary action execution, not just route changes:
```typescript
import { VoiceNavProvider } from '@elevenlabs/ui';
import { useTheme } from 'next-themes';

export function App() {
  const { setTheme } = useTheme();

  return (
    <VoiceNavProvider
      apiKey={process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY!}
      // adapter configured as in the setup examples
      patterns={[
        // Theme switching
        {
          pattern: /(switch to|enable|turn on) (dark|light) (mode|theme)/i,
          intent: {
            type: 'action',
            execute: (match) => {
              const mode = match[2].toLowerCase();
              setTheme(mode);
              return { success: true, message: `Switched to ${mode} mode` };
            },
          },
        },
        // Form submission
        {
          pattern: /submit (the )?form/i,
          intent: {
            type: 'action',
            execute: () => {
              document.querySelector('form')?.requestSubmit();
              return { success: true, message: 'Form submitted' };
            },
          },
        },
        // Modal control
        {
          pattern: /close (the )?(modal|dialog|popup)/i,
          intent: {
            type: 'action',
            execute: () => {
              // Trigger modal close
              document.querySelector<HTMLElement>('[data-dialog-close]')?.click();
              return { success: true, message: 'Modal closed' };
            },
          },
        },
      ]}
    >
      {/* app content */}
    </VoiceNavProvider>
  );
}
```
Real-World Examples
Example 1: Accessible E-Commerce Navigation
A furniture e-commerce company implemented Voice Nav to enable hands-free browsing for users comparing products while measuring spaces in their homes:
```typescript
import { VoiceNavProvider, VoiceNavButton, nextRouterAdapter } from '@elevenlabs/ui';
import { useRouter } from 'next/navigation';
import { useCart } from '@/hooks/use-cart';

export default function StoreLayout({ children }: { children: React.ReactNode }) {
  const router = useRouter();
  const { addToCart } = useCart();

  return (
    <VoiceNavProvider
      apiKey={process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY!}
      adapter={nextRouterAdapter}
      patterns={[
        // Category browsing
        {
          pattern: /show (me )?(sofas|chairs|tables|beds)/i,
          intent: {
            type: 'navigate',
            target: (match) => `/products/${match[2].toLowerCase()}`,
          },
        },
        // Product filtering
        {
          pattern: /filter by (.+)/i,
          intent: {
            type: 'navigate',
            target: (match) => {
              const filter = match[1].toLowerCase();
              return `?filter=${encodeURIComponent(filter)}`;
            },
          },
        },
        // Add to cart
        {
          pattern: /add (this|that|it) to (my )?cart/i,
          intent: {
            type: 'action',
            execute: async () => {
              const productId = getCurrentProductId(); // Helper function
              await addToCart(productId);
              return { success: true, message: 'Added to cart', speak: 'Item added to your cart' };
            },
          },
        },
        // Price comparison
        { pattern: /compare prices/i, intent: { type: 'navigate', target: '/compare' } },
        // Checkout flow
        { pattern: /(checkout|proceed to payment)/i, intent: { type: 'navigate', target: '/checkout' } },
      ]}
      onNavigate={(route) => {
        // Analytics tracking
        analytics.track('voice_navigation', { route });
      }}
    >
      {children}
      <VoiceNavButton showTranscript showWaveform />
    </VoiceNavProvider>
  );
}
```
Results after 3 months:
- 32% increase in conversion rate for users who activated voice navigation
- 45% reduction in average time-to-purchase for voice users
- 4.8/5 accessibility rating from users with motor impairments (up from 2.1/5)
- 15,000+ voice commands processed monthly with 94% accuracy
Example 2: Healthcare Dashboard for Hands-Free Operation
A hospital management system implemented Voice Nav to enable doctors and nurses to access patient records while performing procedures:
```typescript
import { VoiceNavProvider, useVoiceNav, reactRouterAdapter } from '@elevenlabs/ui';
import { useNavigate } from 'react-router-dom';
import { useEffect, useState } from 'react';
import { usePatientContext } from '@/contexts/patient';

export function HospitalDashboard() {
  const navigate = useNavigate();
  const { selectPatient, currentPatient } = usePatientContext();

  return (
    <VoiceNavProvider
      // apiKey configured as in the setup examples
      adapter={reactRouterAdapter}
      patterns={[
        // Open a patient record by ID
        {
          pattern: /(open|pull up) patient (record )?(\d+)/i,
          intent: {
            type: 'action',
            execute: async (match) => {
              const patientId = match[3];
              await selectPatient(patientId);
              navigate(`/patients/${patientId}`);
              return {
                success: true,
                message: `Opened record for patient ${patientId}`,
                speak: `Now viewing patient ${patientId}`,
              };
            },
          },
        },
        // View specific sections
        {
          pattern: /show (patient )?(medical history|vitals|medications|allergies|notes)/i,
          intent: {
            type: 'navigate',
            target: (match) => {
              const section = match[2].replace(/\s+/g, '-');
              return `/patients/${currentPatient?.id}/${section}`;
            },
          },
        },
        // Add clinical notes
        {
          pattern: /add (a )?note/i,
          intent: {
            type: 'action',
            execute: () => {
              // Opens voice-to-text note input
              openNoteDialog({ mode: 'voice' });
              return { success: true, message: 'Ready for note dictation' };
            },
          },
        },
        // Search functionality
        {
          pattern: /search (for )?(.+)/i,
          intent: {
            type: 'navigate',
            target: (match) => `/search?q=${encodeURIComponent(match[2])}`,
          },
        },
        // Emergency protocols
        {
          pattern: /(emergency|code blue)/i,
          intent: {
            type: 'action',
            execute: () => {
              triggerEmergencyProtocol();
              return {
                success: true,
                message: 'Emergency protocol activated',
                priority: 'high',
              };
            },
          },
        },
      ]}
      onError={(error) => {
        // HIPAA-compliant error logging
        logSecureError(error);
      }}
    >
      {/* dashboard content */}
    </VoiceNavProvider>
  );
}

// Voice-activated note taking component
function VoiceClinicalNotes() {
  const { state, transcript, startListening, stopListening } = useVoiceNav();
  const [notes, setNotes] = useState('');

  useEffect(() => {
    if (state === 'idle' && transcript) {
      setNotes((prev) => prev + ' ' + transcript);
    }
  }, [state, transcript]);

  return (
    <div>
      <button onClick={state === 'listening' ? stopListening : startListening}>
        {state === 'listening' ? 'Stop dictation' : 'Dictate note'}
      </button>
      <textarea value={notes} onChange={(e) => setNotes(e.target.value)} />
    </div>
  );
}
```
Results:
- 60% reduction in time spent navigating medical records
- Zero touch interactions during sterile procedures
- HIPAA compliant with end-to-end encryption for voice data
- 98% transcription accuracy for medical terminology after fine-tuning
Example 3: Automotive Showroom Interactive Displays
A luxury car manufacturer implemented Voice Nav in showroom touchscreen displays to enable contactless browsing:
```typescript
import { VoiceNavProvider, VoiceNavButton, nextRouterAdapter } from '@elevenlabs/ui';
import { useRouter } from 'next/navigation';

export function ShowroomKiosk() {
  const router = useRouter();

  return (
    <VoiceNavProvider
      apiKey={process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY!}
      adapter={nextRouterAdapter}
      patterns={[
        // Model selection
        {
          pattern: /show (me )?the (.+)/i,
          intent: {
            type: 'navigate',
            target: (match) => `/models/${encodeURIComponent(match[2].toLowerCase())}`,
          },
        },
        // Feature exploration
        {
          pattern: /what (features|options) are (available|included)/i,
          intent: { type: 'navigate', target: '/features' },
        },
        // Virtual tour
        {
          pattern: /(start|begin) (a )?(virtual )?tour/i,
          intent: {
            type: 'action',
            execute: () => {
              startVirtualTour();
              return { success: true, message: 'Starting virtual tour' };
            },
          },
        },
        // Color customization
        {
          pattern: /show (it|this|the car) in (.+)/i,
          intent: {
            type: 'action',
            execute: (match) => {
              const color = match[2];
              updateVehicleColor(color);
              return {
                success: true,
                message: `Showing vehicle in ${color}`,
                speak: `Here's how it looks in ${color}`,
              };
            },
          },
        },
        // Pricing information
        {
          pattern: /(what's|what is) the price|how much (does it|is it)/i,
          intent: { type: 'navigate', target: '/pricing' },
        },
        // Schedule test drive
        {
          pattern: /schedule (a )?test drive/i,
          intent: { type: 'navigate', target: '/schedule-test-drive' },
        },
      ]}
    >
      {/* kiosk content */}
      <VoiceNavButton size="lg" showTranscript showWaveform />
    </VoiceNavProvider>
  );
}
```
Results:
- 85% of visitors engaged with voice navigation (vs 45% with touch)
- 40% longer average session duration
- 70% reduction in surface touches (critical during COVID-19)
- 3x increase in test drive bookings from showroom visits
Common Pitfalls
Pitfall 1: Insufficient Error Handling for Speech Recognition Failures
Problem: Network issues, background noise, or accents can cause speech recognition to fail or produce inaccurate transcripts. Apps that don't handle these gracefully frustrate users.
Solution: Implement comprehensive error handling with fallback options:
```typescript
import { VoiceNavProvider } from '@elevenlabs/ui';
import { useState } from 'react';
import { toast } from 'sonner'; // any toast library with error/warning helpers works here

export function RobustVoiceNav({ children }: { children: React.ReactNode }) {
  const [errors, setErrors] = useState<string[]>([]);
  const [retryCount, setRetryCount] = useState(0);

  return (
    <VoiceNavProvider
      // apiKey and adapter configured as in the setup examples
      onError={(error) => {
        setErrors((prev) => [...prev, error.message]);

        // Automatic retry with exponential backoff
        if (error.type === 'network' && retryCount < 3) {
          setTimeout(() => {
            setRetryCount((prev) => prev + 1);
            // Retry connection
          }, Math.pow(2, retryCount) * 1000);
        }

        // Show user-friendly error message
        if (error.type === 'permission_denied') {
          toast.error('Microphone permission required for voice navigation');
        } else if (error.type === 'no_speech') {
          toast.warning('No speech detected. Please try again.');
        } else if (error.type === 'network') {
          toast.error('Connection issue. Retrying...');
        } else {
          toast.error('Voice navigation unavailable. Please use keyboard navigation.');
        }
      }}
      fallbackMode="keyboard" // Enable keyboard shortcuts as fallback
    >
      {children}

      {/* Visual error indicator for debugging */}
      {process.env.NODE_ENV === 'development' && errors.length > 0 && (
        <div>
          <strong>Voice Nav Errors:</strong>
          <ul>
            {errors.map((err, i) => (
              <li key={i}>{err}</li>
            ))}
          </ul>
        </div>
      )}
    </VoiceNavProvider>
  );
}
```
Best Practice: Always provide alternative input methods. Voice should enhance, not replace, traditional navigation.
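One lightweight way to honor that principle is to map a handful of keyboard shortcuts to the same intents the voice patterns produce. The sketch below is illustrative: the shortcut choices, the buffer logic, and the `KeyboardFallback` component are assumptions, while `executeIntent` comes from the `useVoiceNav` hook documented earlier.

```typescript
import { useEffect } from 'react';
import { useVoiceNav } from '@elevenlabs/ui';

// Illustrative keyboard fallback: the shortcut map mirrors common voice intents.
const shortcuts: Record<string, { type: 'navigate'; target: string }> = {
  'g h': { type: 'navigate', target: '/' },          // same effect as "go home"
  'g s': { type: 'navigate', target: '/settings' },  // same effect as "show settings"
};

export function KeyboardFallback() {
  const { executeIntent } = useVoiceNav();

  useEffect(() => {
    let buffer = '';
    const onKeyDown = (event: KeyboardEvent) => {
      // Ignore keystrokes inside form fields
      if ((event.target as HTMLElement).closest('input, textarea')) return;

      // Keep the last few keys and check for a shortcut match
      buffer = (buffer + ' ' + event.key).trim().slice(-3);
      const intent = shortcuts[buffer];
      if (intent) {
        executeIntent(intent);
        buffer = '';
      }
    };

    window.addEventListener('keydown', onKeyDown);
    return () => window.removeEventListener('keydown', onKeyDown);
  }, [executeIntent]);

  return null;
}
```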
Pitfall 2: Ignoring Accessibility Beyond Voice
Problem: Teams implement voice navigation and assume they've "solved" accessibility, while ignoring screen reader support, keyboard navigation, and visual indicators.
Solution: Ensure Voice Nav complements existing accessibility features:
```typescript
import { VoiceNavProvider, VoiceNavButton } from '@elevenlabs/ui';

export function AccessibleApp({ children }: { children: React.ReactNode }) {
  return (
    <VoiceNavProvider
      // apiKey and adapter configured as in the setup examples
      onNavigate={(route) => {
        // Announce navigation to screen readers
        announceToScreenReader(`Navigated to ${route}`);
        // Focus first interactive element
        focusFirstInteractive();
      }}
    >
      {/* Keyboard shortcut hint */}
      <VoiceNavButton showTranscript />
      {children}
    </VoiceNavProvider>
  );
}

// Helper: Announce to screen readers
function announceToScreenReader(message: string) {
  const announcement = document.createElement('div');
  announcement.setAttribute('role', 'status');
  announcement.setAttribute('aria-live', 'polite');
  announcement.setAttribute('aria-atomic', 'true');
  announcement.className = 'sr-only';
  announcement.textContent = message;
  document.body.appendChild(announcement);

  setTimeout(() => announcement.remove(), 1000);
}
```
Best Practice: Voice navigation should achieve WCAG 2.2 Level AAA compliance by working seamlessly with screen readers, keyboard navigation, and high contrast modes.
Pitfall 3: Over-Specific Intent Patterns
Problem: Developers create rigid patterns that only match exact phrases, forcing users to memorize specific commands like a programming language.
Solution: Design flexible patterns that understand variations and synonyms:
```typescript
// Bad: Too specific
const badPatterns = [
  {
    pattern: /^go to settings$/i,
    intent: { type: 'navigate', target: '/settings' },
  },
];

// Good: Flexible with variations
const goodPatterns = [
  {
    pattern: /(go to|open|show|view|navigate to) (the )?(settings|preferences|configuration|options)/i,
    intent: { type: 'navigate', target: '/settings' },
  },
  // Handle natural follow-ups
  {
    pattern: /(and )?(also )?(change|update|modify|edit) (my )?(.+)/i,
    intent: {
      type: 'contextual',
      action: (match, context) => {
        // Use LLM for complex variations
        return classifyIntent(match[5], context);
      },
    },
  },
];

// Even better: Use LLM classification for ambiguous input
import { VoiceNavProvider } from '@elevenlabs/ui';

export function LlmAssistedVoiceNav() {
  return (
    <VoiceNavProvider
      // apiKey and adapter configured as in the setup examples
      agentId={process.env.NEXT_PUBLIC_ELEVENLABS_AGENT_ID}
      // The prompt below steers the conversational agent that resolves
      // ambiguous commands; the exact prop name may differ in your setup
      llmFallback={{
        systemPrompt: `You map spoken commands to application routes.

Available routes:
- /dashboard (home, main page)
- /products (catalog, items, shop)
- /cart (shopping cart, basket)
- /orders (order history, purchases)
- /settings (preferences, account)

Return format: { "route": "/path", "confidence": 0-1 }`,
      }}
    />
  );
}
```
Best Practice: Test intent patterns with diverse user groups and collect actual spoken commands to refine pattern matching.
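A cheap way to act on that advice is to replay a corpus of collected utterances against your pattern list and see what would fall through to the LLM fallback. This is a plain sketch: `goodPatterns` refers to the array above, and the sample utterances are invented.

```typescript
// Sketch: measure how many collected utterances the regex patterns cover.
const collectedUtterances = [
  'open settings please',
  'take me to my preferences',
  'update my email address',
  'uh show the options screen',
];

function patternCoverage(utterances: string[], patterns: { pattern: RegExp }[]) {
  const unmatched = utterances.filter(
    (utterance) => !patterns.some(({ pattern }) => pattern.test(utterance)),
  );
  return {
    coverage: (utterances.length - unmatched.length) / utterances.length,
    unmatched, // candidates for new synonyms or LLM-only handling
  };
}

console.log(patternCoverage(collectedUtterances, goodPatterns));
```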
Pitfall 4: No Confirmation for Destructive Actions
Problem: Voice recognition isn't 100% accurate. Users might accidentally trigger destructive actions like "delete account" when they meant "delete comment."
Solution: Implement confirmation flows for high-risk actions:
```typescript
import { VoiceNavProvider } from '@elevenlabs/ui';
import { useState } from 'react';

export function SafeVoiceNav({ children }: { children: React.ReactNode }) {
  const [pendingAction, setPendingAction] = useState<string | null>(null);

  return (
    <>
      <VoiceNavProvider
        // apiKey and adapter configured as in the setup examples
        patterns={[
          // Destructive actions are queued instead of executed immediately
          {
            pattern: /delete (my )?(account|project|comment)/i,
            intent: {
              type: 'action',
              execute: (match) => {
                setPendingAction(match[2].toLowerCase());
                return { success: true, message: `Say "confirm" to delete ${match[2]}` };
              },
            },
          },
          // Confirmation responses
          {
            pattern: /(yes|confirm|proceed|do it)/i,
            intent: {
              type: 'action',
              execute: async () => {
                if (pendingAction) {
                  await executeDestructiveAction(pendingAction);
                  setPendingAction(null);
                }
              },
            },
          },
          {
            pattern: /(no|cancel|nevermind|stop)/i,
            intent: {
              type: 'action',
              execute: () => {
                setPendingAction(null);
                return { success: true, message: 'Action cancelled' };
              },
            },
          },
        ]}
      >
        {children}
      </VoiceNavProvider>

      {/* Confirmation dialog */}
      {pendingAction && (
        <div role="alertdialog" aria-label="Confirm action">
          <p>Did you mean to delete {pendingAction}? Say "confirm" or "cancel".</p>
        </div>
      )}
    </>
  );
}
```
Best Practice: Categorize actions by risk level (low, medium, high) and require explicit confirmation for medium and high-risk actions.
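One way to encode that policy is a small lookup that tags each action type with a risk level and gates execution on it. The category names and the `needsConfirmation` helper below are illustrative, not part of the library.

```typescript
// Sketch: gate execution on a per-action risk level.
type RiskLevel = 'low' | 'medium' | 'high';

const actionRisk: Record<string, RiskLevel> = {
  navigate: 'low',
  'submit-form': 'medium',
  'delete-item': 'high',
  'delete-account': 'high',
};

function needsConfirmation(actionType: string): boolean {
  const risk = actionRisk[actionType] ?? 'medium'; // unknown actions default to medium
  return risk !== 'low';
}

// Usage inside an execute handler (pseudo-flow):
// if (needsConfirmation('delete-item')) { queue it for confirmation } else { run it }
```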
Pitfall 5: Poor Performance on Mobile Devices
Problem: Voice Nav's WebRTC connections and real-time processing can drain battery and consume data on mobile devices.
Solution: Implement performance optimizations for mobile:
```typescript
import { VoiceNavProvider } from '@elevenlabs/ui';
import { useMediaQuery } from '@/hooks/use-media-query';

export function OptimizedVoiceNav({ children }: { children: React.ReactNode }) {
  const isMobile = useMediaQuery('(max-width: 768px)');
  const isLowPowerMode = useMediaQuery('(prefers-reduced-motion: reduce)');

  // Skip the persistent voice session entirely in the most constrained case
  if (isMobile && isLowPowerMode && !isWifi()) {
    return <>{children}</>;
  }

  return (
    <VoiceNavProvider
      // apiKey and adapter configured as in the setup examples;
      // where supported, reduce audio quality and chunk rate on cellular here
    >
      {children}
    </VoiceNavProvider>
  );
}

// Detect Wi-Fi vs cellular
function isWifi(): boolean {
  if ('connection' in navigator) {
    const conn = (navigator as any).connection;
    return conn.type === 'wifi';
  }
  return true; // Assume Wi-Fi if unable to detect
}
```
Best Practice: Monitor battery and data usage in production and adjust quality settings based on device capabilities.
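For the monitoring side, the (non-standard but widely shipped) Battery Status and Network Information APIs give enough signal to pick a quality profile at runtime. The profile names and thresholds below are assumptions; the browser APIs themselves are real but Chromium-leaning.

```typescript
// Sketch: choose a quality profile from battery level and connection type.
type QualityProfile = 'high' | 'balanced' | 'data-saver';

async function pickQualityProfile(): Promise<QualityProfile> {
  const conn = (navigator as any).connection;

  // Respect explicit data-saver settings and slow connections
  if (conn?.saveData || ['slow-2g', '2g'].includes(conn?.effectiveType)) {
    return 'data-saver';
  }

  // navigator.getBattery is Chromium-only; treat absence as "plugged in"
  if ('getBattery' in navigator) {
    const battery = await (navigator as any).getBattery();
    if (!battery.charging && battery.level < 0.2) {
      return 'balanced';
    }
  }

  return 'high';
}
```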
Best Practices
1. Design Intent Patterns Around User Goals
Structure voice commands around what users want to accomplish, not your application's technical structure:
```typescript
// Bad: Mirrors technical structure
const technicalPatterns = [
  { pattern: /open users table/i, intent: { type: 'navigate', target: '/admin/database/users' } },
  { pattern: /run report query/i, intent: { type: 'navigate', target: '/admin/reports/query-builder' } },
];

// Good: Aligned with user goals
const goalPatterns = [
  {
    pattern: /find (a )?user (named |called )?(.+)/i,
    intent: {
      type: 'navigate',
      target: (match) => `/admin/users/search?q=${match[3]}`,
    },
  },
  {
    pattern: /show (me )?(sales|revenue|user) (report|data|stats)/i,
    intent: { type: 'navigate', target: '/admin/reports/dashboard' },
  },
];
```
2. Provide Discoverable Voice Commands
Users can't use voice commands they don't know exist. Make commands discoverable:
```typescript
import { VoiceNavProvider, VoiceCommandPalette } from '@elevenlabs/ui';
import { useLocation } from 'react-router-dom';

export function App() {
  return (
    <VoiceNavProvider
      // apiKey and adapter configured as in the setup examples
    >
      {/* Command palette listing available voice commands */}
      <VoiceCommandPalette />

      {/* Contextual hints */}
      <ContextualVoiceHints />

      {/* Main app */}
      {/* ...routes and content... */}
    </VoiceNavProvider>
  );
}

// Show voice hints based on current page
function ContextualVoiceHints() {
  const location = useLocation();
  const hints = getHintsForRoute(location.pathname); // app-specific helper

  return (
    <aside>
      <p>Try saying:</p>
      <ul>
        {hints.map((hint) => (
          <li key={hint}>“{hint}”</li>
        ))}
      </ul>
    </aside>
  );
}
```
3. Test with Real Users in Real Environments
Voice recognition performs differently in quiet offices vs. noisy environments. Test comprehensively:
```typescript
// Set up A/B testing for voice accuracy
import { VoiceNavProvider } from '@elevenlabs/ui';

export function App() {
  return (
    <VoiceNavProvider
      // apiKey and adapter configured as in the setup examples
      // Feedback callback: receives the detected intent and whether it was correct
      // (the exact prop name may differ in your setup)
      onFeedback={(intent, wasCorrect) => {
        // Use feedback to improve patterns
        if (!wasCorrect) {
          suggestPatternImprovement(intent);
        }
      }}
    >
      {/* app content */}
    </VoiceNavProvider>
  );
}
```
4. Implement Progressive Enhancement
Voice navigation should enhance your app, not be required for basic functionality:
```typescript
import { VoiceNavProvider } from '@elevenlabs/ui';

export function ProgressivelyEnhancedApp({ children }: { children: React.ReactNode }) {
  // Check if voice is supported
  const isVoiceSupported =
    typeof window !== 'undefined' &&
    'mediaDevices' in navigator &&
    'getUserMedia' in navigator.mediaDevices;

  // Feature detection for WebRTC
  const hasWebRTC = typeof RTCPeerConnection !== 'undefined';

  if (!isVoiceSupported || !hasWebRTC) {
    // Gracefully degrade - app works without voice
    return <>{children}</>;
  }

  return (
    <VoiceNavProvider
      // apiKey and adapter configured as in the setup examples
    >
      {children}
    </VoiceNavProvider>
  );
}
```
5. Monitor and Optimize Voice Patterns
Use analytics to identify which commands are used most and which patterns fail frequently:
```sql
-- Generate analytics report
SELECT
  intent_pattern,
  COUNT(*) AS usage_count,
  AVG(confidence) AS avg_confidence,
  SUM(CASE WHEN error THEN 1 ELSE 0 END) AS error_count
FROM voice_nav_events
WHERE timestamp > NOW() - INTERVAL '30 days'
GROUP BY intent_pattern
ORDER BY usage_count DESC;

-- Identify low-confidence patterns that need improvement
SELECT
  transcript,
  matched_pattern,
  confidence,
  user_feedback
FROM voice_nav_events
WHERE confidence < 0.7
  AND timestamp > NOW() - INTERVAL '7 days'
ORDER BY confidence ASC
LIMIT 100;
```
Use this data to refine patterns and add synonyms for commonly used phrases.
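Feeding a table like `voice_nav_events` can be done from the provider callbacks documented earlier. The sketch below assumes a generic `analytics.track` client and an event shape matching the queries above; neither is prescribed by the library.

```typescript
import { VoiceNavProvider } from '@elevenlabs/ui';

// Hypothetical analytics client; swap in Segment, PostHog, or your own endpoint.
declare const analytics: { track: (event: string, props: Record<string, unknown>) => void };

export function InstrumentedVoiceNav({ children }: { children: React.ReactNode }) {
  return (
    <VoiceNavProvider
      // apiKey and adapter configured as in the setup examples
      onIntent={(intent) => {
        // Log each recognized command for later aggregation
        analytics.track('voice_nav_event', {
          intent_type: intent.type,
          target: intent.target,
          timestamp: new Date().toISOString(),
        });
      }}
      onError={(error) => {
        // Count failures so error_count in the report stays meaningful
        analytics.track('voice_nav_event', { error: true, message: error.message });
      }}
    >
      {children}
    </VoiceNavProvider>
  );
}
```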
Getting Started
Prerequisites
- Node.js 18+ or Bun 1.0+
- React 18+
- ElevenLabs account with API access
- Modern browser with WebRTC support
Step 1: Create ElevenLabs Account
1. Sign up at https://elevenlabs.io
2. Navigate to API settings
3. Generate an API key
4. (Optional) Create a conversational AI agent for advanced NLU
Step 2: Install Dependencies
```bash
# Using npm
npm install @elevenlabs/ui

# Using pnpm
pnpm add @elevenlabs/ui

# Using bun
bun add @elevenlabs/ui
```
Step 3: Configure Environment Variables
Create a `.env.local` file:
```env
# ElevenLabs credentials
NEXT_PUBLIC_ELEVENLABS_API_KEY=your_api_key_here

# Optional: Agent ID for conversational AI
NEXT_PUBLIC_ELEVENLABS_AGENT_ID=your_agent_id_here
```
Step 4: Set Up Voice Nav Provider
For Next.js App Router:
```typescript
// app/layout.tsx
'use client'; // useRouter from next/navigation requires a Client Component

import { VoiceNavProvider, nextRouterAdapter } from '@elevenlabs/ui';
import { useRouter } from 'next/navigation';
import '@elevenlabs/ui/styles.css'; // Import default styles

export default function RootLayout({ children }: { children: React.ReactNode }) {
  const router = useRouter();

  return (
    <html lang="en">
      <body>
        <VoiceNavProvider
          apiKey={process.env.NEXT_PUBLIC_ELEVENLABS_API_KEY!}
          agentId={process.env.NEXT_PUBLIC_ELEVENLABS_AGENT_ID}
          adapter={nextRouterAdapter}
        >
          {children}
        </VoiceNavProvider>
      </body>
    </html>
  );
}
```
For React Router:
```typescript
// App.tsx
import { VoiceNavProvider, reactRouterAdapter } from '@elevenlabs/ui';
import { useNavigate } from 'react-router-dom';
import '@elevenlabs/ui/styles.css';

export function App() {
  const navigate = useNavigate();

  return (
    <VoiceNavProvider
      apiKey={import.meta.env.VITE_ELEVENLABS_API_KEY} // or however your bundler exposes env vars
      adapter={reactRouterAdapter}
    >
      {/* routes and app content */}
    </VoiceNavProvider>
  );
}
```
Step 5: Add Voice Nav Button
```typescript
// components/Navigation.tsx
import { VoiceNavButton } from '@elevenlabs/ui';
import { Mic } from 'lucide-react';

export function Navigation() {
  return (
    <nav>
      {/* your existing navigation links */}
      <VoiceNavButton icon={Mic} showTranscript showWaveform />
    </nav>
  );
}
```
Step 6: Test Voice Navigation
1. Start your development server:

```bash
npm run dev
```
2. Open your application in a modern browser
3. Click the voice navigation button
4. Grant microphone permission when prompted
5. Try saying: "Go to about page"
You should see:
- Live transcript appearing as you speak
- Waveform animation during recording
- Navigation to the /about page
- Visual confirmation of the action
Next Steps
1. Add more patterns: Define voice commands for all major routes
2. Implement custom actions: Use the `execute` function for non-navigation actions
3. Enable analytics: Track usage and accuracy metrics
4. Test accessibility: Ensure compatibility with screen readers
5. Optimize for production: Configure error handling and fallbacks
Conclusion
ElevenLabs UI Voice Nav represents a paradigm shift in how users interact with web applications. By providing production-ready components for voice-based navigation, the library eliminates the technical barriers that have historically prevented developers from building accessible, voice-first interfaces.
For modern web development teams, Voice Nav offers three transformative capabilities:
1. Accessibility at Scale: Enable universal access to web applications without custom accessibility engineering for every feature.
2. Conversational Interfaces: Meet users' growing expectations for natural language interactions by allowing them to navigate and control applications through speech.
3. Future-Proof Architecture: Prepare applications for a voice-first future where traditional input methods are supplemented or replaced by conversational AI.
The library's integration with React and major routing frameworks ensures that adding voice navigation is measured in hours, not weeks. Whether you're building an e-commerce platform, healthcare dashboard, or automotive kiosk, Voice Nav provides the foundation to create interfaces that understand what users say, not just what they click.
As voice becomes the primary interface for emerging platforms—from smart glasses to automotive systems to accessibility devices—web applications that embrace voice-first design will have a significant competitive advantage. ElevenLabs UI Voice Nav makes that future accessible today, providing the open-source toolkit to transform any React application into a voice-controlled, universally accessible digital experience.
The question is no longer whether to implement voice navigation—it's whether you can afford to leave users behind who expect, need, or prefer to use voice as their primary input method.