
Chat Streaming

Enable real-time streaming responses for a more dynamic and engaging chat experience, similar to ChatGPT.

What is Streaming?

Streaming displays bot responses word-by-word or chunk-by-chunk as they're generated, rather than waiting for the complete response. This creates a more interactive, real-time feel.

Why Use Streaming?

✓ Benefits
  • Perceived faster responses
  • More engaging user experience
  • Users can start reading immediately
  • Modern, ChatGPT-like feel
  • Better for long responses
⚠ Considerations
  • Requires SSE or WebSocket support
  • More complex backend setup
  • May use more bandwidth
  • Not suitable for all use cases

Enable Streaming

Basic Configuration

Enable streaming in your widget configuration:
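A minimal sketch of what that might look like. The option names (`streaming.enabled`, `transport`, `chunkDelayMs`) and the endpoint URL are illustrative, not the confirmed ChatWidgetPro API:

```javascript
// Hypothetical widget configuration -- option names are illustrative.
const widgetConfig = {
  endpoint: "https://api.example.com/chat", // your backend chat endpoint
  streaming: {
    enabled: true,      // turn on word-by-word rendering
    transport: "sse",   // "sse" or "websocket"
    chunkDelayMs: 50,   // display pacing between chunks
  },
};
```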

Streaming Methods

Server-Sent Events (SSE)

One-way communication from server to client. Simpler to implement, works over HTTP.

Best For:

  • Most chat applications
  • Standard HTTP infrastructure
  • Simpler backend requirements
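As a sketch of the client side, an SSE body can be consumed from any async-iterable byte source, for example `(await fetch(url)).body` in browsers and recent Node. The framing below follows the standard blank-line-delimited SSE format:

```javascript
// Minimal SSE client sketch: read a streamed response body and yield each
// "data:" payload. `source` is any async iterable of Uint8Array chunks.
async function* sseEvents(source) {
  const decoder = new TextDecoder();
  let buffer = "";
  for await (const chunk of source) {
    buffer += decoder.decode(chunk, { stream: true });
    let idx;
    // Events are separated by a blank line ("\n\n").
    while ((idx = buffer.indexOf("\n\n")) !== -1) {
      const raw = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 2);
      for (const line of raw.split("\n")) {
        // Sketch handles the common "data: " prefix (space after the colon).
        if (line.startsWith("data: ")) yield line.slice(6);
      }
    }
  }
}
```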
WebSocket

Full-duplex communication. More complex but supports bidirectional streaming.

Best For:

  • Real-time collaborative features
  • Live updates from server
  • Complex bidirectional flows
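A hedged sketch of WebSocket message framing; the `user_message`/`chunk` message types are illustrative, not a fixed ChatWidgetPro protocol:

```javascript
// Illustrative JSON framing for a bidirectional chat stream.
function encodeClientMessage(text) {
  return JSON.stringify({ type: "user_message", text });
}
function decodeServerMessage(raw) {
  // Expected shapes (illustrative): {type:"chunk", text} while streaming,
  // {type:"done"} when the response is complete.
  return JSON.parse(raw);
}

// Browser wiring sketch:
//   const ws = new WebSocket("wss://api.example.com/chat");
//   ws.onmessage = (ev) => {
//     const msg = decodeServerMessage(ev.data);
//     if (msg.type === "chunk") appendToBubble(msg.text); // your render fn
//   };
//   ws.send(encodeClientMessage("Hello"));
```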

Backend Implementation

SSE Response Format

Your backend should send data in SSE format:
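For example, a small server-side helper might look like the following; the JSON payload shape and the `[DONE]` sentinel are common conventions, not requirements:

```javascript
// Wrap a text chunk in SSE wire format: a "data:" line followed by a blank
// line. The response should be sent with "Content-Type: text/event-stream".
function sseChunk(text) {
  return `data: ${JSON.stringify({ text })}\n\n`;
}
function sseDone() {
  return "data: [DONE]\n\n"; // conventional end-of-stream sentinel
}

// A stream for "Hello world", word by word, looks like:
//   data: {"text":"Hello "}
//
//   data: {"text":"world"}
//
//   data: [DONE]
```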

OpenAI Integration

Example with OpenAI's streaming API:
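A sketch of bridging the official `openai` Node SDK to an SSE response; the model name, payload shape, and handler signature are illustrative:

```javascript
// Assumes the official "openai" SDK (npm install openai):
//   import OpenAI from "openai";
//   const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
// `res` is a Node http.ServerResponse.
async function streamChatToSSE(client, res, messages) {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  const stream = await client.chat.completions.create({
    model: "gpt-4o-mini", // illustrative model name
    messages,
    stream: true,
  });
  // Each streamed part carries an incremental delta; forward non-empty text.
  for await (const part of stream) {
    const text = part.choices[0]?.delta?.content ?? "";
    if (text) res.write(`data: ${JSON.stringify({ text })}\n\n`);
  }
  res.write("data: [DONE]\n\n");
  res.end();
}
```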

n8n Streaming

Configure n8n workflows to support streaming:

Use n8n's "Respond to Webhook" node with streaming enabled. Set response mode to "Stream" and configure chunk size for optimal performance.

Streaming Behavior

Display Speed

Control how fast streamed content appears:
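One way to pace display speed is a small client-side typewriter queue; the `render` callback and the 50ms default are illustrative:

```javascript
// Queue incoming chunks and reveal one every `delayMs`.
// `render` receives the full accumulated text on each step.
function createTypewriter(render, delayMs = 50) {
  const queue = [];
  let shown = "";
  let timer = null;
  function tick() {
    if (queue.length === 0) { timer = null; return; }
    shown += queue.shift();
    render(shown);
    timer = setTimeout(tick, delayMs);
  }
  return {
    push(chunk) {
      queue.push(chunk);
      if (!timer) tick(); // start draining immediately if idle
    },
  };
}
```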

Balance Speed and Readability

Text that appears too fast is hard to follow, while text that appears too slow feels sluggish. A delay of 40-60ms per chunk works well for most cases.

Chunk Size

Determine how much text is sent in each chunk:

  • Character-by-character: Smoothest but more network overhead
  • Word-by-word: Good balance (recommended)
  • Sentence-by-sentence: Reduced overhead, still feels real-time
  • Paragraph-by-paragraph: Best for very long responses
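The recommended word-by-word granularity can be produced with a one-line splitter; whitespace stays attached to the preceding word so the client can simply concatenate chunks:

```javascript
// Split text into word-sized chunks for streaming.
// Each match is a run of non-space characters plus any trailing whitespace.
function wordChunks(text) {
  return text.match(/\S+\s*/g) ?? [];
}
```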

Error Handling

Handle streaming interruptions gracefully:
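One hedged approach: keep whatever text has already streamed, retry with backoff, and only surface an error if all attempts fail. `startStream` here is a caller-supplied function, not a ChatWidgetPro API:

```javascript
// Retry an interrupted stream with linear backoff. `startStream` should
// resolve when the stream ends cleanly and reject on interruption.
async function streamWithRetry(startStream, { retries = 2, backoffMs = 500 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await startStream(attempt);
    } catch (err) {
      if (attempt === retries) {
        throw new Error(`Stream failed after ${retries + 1} attempts: ${err.message}`);
      }
      // Wait a bit longer after each failed attempt.
      await new Promise((r) => setTimeout(r, backoffMs * (attempt + 1)));
    }
  }
}
```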

User Controls

Stop Streaming

Allow users to cancel ongoing streams:

A Stop button appears in the chat window while a response is streaming.
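A stop control can be sketched with `AbortController`; the URL, body shape, and `onChunk` callback are illustrative:

```javascript
// Aborting cancels the underlying fetch; text already rendered stays put.
function startStoppableStream(url, body, onChunk) {
  const controller = new AbortController();
  const done = fetch(url, {
    method: "POST",
    body: JSON.stringify(body),
    signal: controller.signal,
  })
    .then(async (res) => {
      const reader = res.body.getReader();
      for (;;) {
        const { value, done: finished } = await reader.read();
        if (finished) break;
        onChunk(value); // raw Uint8Array chunk; decode/parse as needed
      }
    })
    .catch((err) => {
      // A user-initiated stop surfaces as AbortError; swallow it.
      if (err.name !== "AbortError") throw err;
    });
  // Wire a stop button to the controller, e.g.:
  //   stopButton.onclick = () => controller.abort();
  return { done, stop: () => controller.abort() };
}
```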

Regenerate Response

Let users request a new response:
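A minimal sketch: remember the last user message and re-send it through the caller's own `sendMessage` function (a hypothetical name, not a ChatWidgetPro API):

```javascript
// Track the last prompt so a Regenerate button can re-submit it.
function createRegenerator(sendMessage) {
  let lastUserMessage = null;
  return {
    send(text) {
      lastUserMessage = text;
      return sendMessage(text);
    },
    regenerate() {
      if (lastUserMessage === null) throw new Error("Nothing to regenerate");
      return sendMessage(lastUserMessage); // starts a fresh stream
    },
  };
}
```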

Performance Optimization

Best practices for smooth streaming:

  • Debounce Updates: Don't update UI more than 60 times per second
  • Batch Small Chunks: Group very small chunks to reduce reflows
  • Use RequestAnimationFrame: Sync updates with browser paint cycles
  • Virtual Scrolling: For very long responses, only render visible content
  • Compression: Enable gzip/brotli for SSE connections
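The first three tips can be combined in one small batcher: buffer incoming chunks and flush them at most once per animation frame, with a `setTimeout` fallback outside the browser:

```javascript
// Coalesce many tiny chunks into one render per frame to limit reflows.
// `render` receives the concatenated chunks since the last flush.
function createFrameBatcher(render) {
  const schedule = typeof requestAnimationFrame === "function"
    ? (fn) => requestAnimationFrame(fn)
    : (fn) => setTimeout(fn, 16); // ~60fps fallback for non-browser runtimes
  let pending = "";
  let scheduled = false;
  return function push(chunk) {
    pending += chunk;
    if (!scheduled) {
      scheduled = true;
      schedule(() => {
        scheduled = false;
        render(pending); // one update for the whole batch
        pending = "";
      });
    }
  };
}
```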

Next: Voice Input

Learn how to enable voice-to-text input for hands-free chat interaction.

Voice Input Setup →