Chrome Extension

CloudListen

Screen Recording with Real-time Dual Speaker Subtitles

Supports 100+ Languages

English · Chinese · Spanish · French · German · Japanese · Korean · Arabic · and 92+ more

Features

Real-time Transcription, Built for Clarity

🎤

Dual Audio Source Capture

Capture system audio and microphone input simultaneously. CloudListen distinguishes "System" and "Me" as two separate speakers for clear conversation records.

100+ Languages Supported

Powered by Whisper AI, CloudListen transcribes speech in over 100 languages including English, Mandarin, Spanish, French, German, Japanese, Arabic, and more — with automatic language detection.

🔊

Offline Local Transcription

Run whisper-server.py locally on your machine. All transcription happens on-device — no audio data leaves your machine, ensuring complete privacy.

🖥

Chrome Side Panel Display

Real-time subtitles appear directly in Chrome's side panel. Non-intrusive, always accessible, and synchronized with your recording.

🔒

Deepgram & AssemblyAI

Choose between Deepgram Nova-2 or AssemblyAI Universal for cloud transcription. Both offer real-time streaming with automatic connection handling.

📋

Session Management

Auto-saves up to 5 recording sessions. Select multiple sessions and export them as Markdown files for later review.

Multilingual

Transcribe in Any Language

Whisper AI powers CloudListen's local transcription engine. According to OpenAI, the large-v3 Whisper model was trained on about 5 million hours of weakly labeled and pseudo-labeled multilingual audio, and Whisper supports automatic language detection for 100+ languages.

English Chinese (Mandarin / Cantonese) Spanish French German Japanese Korean Arabic Portuguese Russian Hindi Italian Dutch Polish Turkish Vietnamese Thai Indonesian Malay + 82 more

Architecture

Step 1: Screen Capture (getDisplayMedia). Capture dual audio tracks.
Step 2: AudioWorklet. Split audio in real time and convert to 16-bit PCM.
Step 3: Whisper / Deepgram. 100+ language transcription model.
Step 4: Live Subtitles. Side panel with dual-speaker display.
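The 16-bit PCM conversion in Step 2 can be illustrated with a short Python sketch. The extension performs this step inside an AudioWorklet in the browser, so the code below is only an analogy: the function name `float_to_pcm16` and the clamping and scaling choices are assumptions for illustration, not CloudListen's actual code.

```python
import struct
from typing import Iterable


def float_to_pcm16(samples: Iterable[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to little-endian signed 16-bit PCM."""
    out = bytearray()
    for s in samples:
        # Clamp so out-of-range input cannot overflow the 16-bit range.
        s = max(-1.0, min(1.0, s))
        # Scale negatives by 32768 and positives by 32767 to use the full range.
        value = int(s * 32768) if s < 0 else int(s * 32767)
        out += struct.pack("<h", value)
    return bytes(out)


pcm = float_to_pcm16([0.0, 1.0, -1.0, 0.5])
print(len(pcm))  # 4 samples at 2 bytes each
```

Each sample becomes two bytes, which is the layout streaming speech APIs commonly expect for raw audio.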

FAQ

Frequently Asked Questions

How many languages does CloudListen support?

CloudListen supports real-time transcription in over 100 languages via Whisper AI. This includes major world languages such as English, Chinese (Mandarin and Cantonese), Spanish, French, German, Japanese, Korean, Arabic, Portuguese, Russian, and Hindi. CloudListen automatically detects the spoken language.

Can I use CloudListen offline?

Yes. When you run whisper-server.py locally, all transcription happens on your machine. No audio data is sent to external servers. This makes CloudListen suitable for privacy-sensitive environments.

How does dual-speaker transcription work?

CloudListen captures two audio sources simultaneously: your system audio (e.g., a video call, lecture, or presentation) and your microphone input. Each source is transcribed separately and displayed with distinct speaker labels ("System" and "Me") in the Chrome side panel.
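One way to picture the dual-speaker display is as two independently transcribed streams merged by timestamp. The following is a minimal sketch under that assumption; the `Segment` type, its fields, and `merge_transcripts` are hypothetical names, and only the "System" and "Me" labels come from CloudListen itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float  # seconds from the start of the recording (assumed unit)
    text: str


def merge_transcripts(system: List[Segment], mic: List[Segment]) -> List[str]:
    """Label each segment by source and interleave both streams by start time."""
    labeled = [("System", s) for s in system] + [("Me", s) for s in mic]
    # list.sort is stable, so segments with equal start times keep System first.
    labeled.sort(key=lambda pair: pair[1].start)
    return [f"[{speaker}] {seg.text}" for speaker, seg in labeled]


lines = merge_transcripts(
    system=[Segment(0.0, "Welcome to the call."), Segment(4.2, "Any questions?")],
    mic=[Segment(2.5, "Thanks for having me.")],
)
for line in lines:
    print(line)
```

Because each source is transcribed on its own, the speaker label comes from which stream a segment arrived on, so no speaker-recognition model is needed in this sketch.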

What transcription services does CloudListen support?

CloudListen supports Deepgram Nova-2 and AssemblyAI Universal for cloud transcription. Both provide real-time streaming with automatic connection handling. For fully offline operation, use local Whisper AI via whisper-server.py.

Setup Guide

How to Use

1

Choose Transcription Mode

Click the CloudListen extension icon. Choose one of two modes:

  • Cloud API: Deepgram Nova-2 or AssemblyAI. Requires an API key and an internet connection.
  • Local Whisper: Fully offline. No API key needed; all audio is processed locally.

2

Start Whisper Server (for Local Whisper mode)

For fully offline transcription, download whisper-server.py and requirements.txt, then install the dependencies and start the server:

pip install faster-whisper flask
python3 whisper-server.py

The server starts at http://localhost:8180, and the extension detects it automatically in Local Whisper mode. All audio stays on your machine, so nothing is sent to external servers.
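Before switching to Local Whisper mode, you can confirm that something is listening on port 8180. The sketch below only checks that the TCP port accepts connections; it makes no assumption about the HTTP routes that whisper-server.py exposes, and `is_server_listening` is a hypothetical helper name.

```python
import socket


def is_server_listening(host: str = "localhost", port: int = 8180,
                        timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("Whisper server reachable:", is_server_listening())
```

If this prints False, check that python3 whisper-server.py is still running and that no other program is using the port.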

3

Start Recording

Click "Start Recording" in the Chrome side panel. Enable audio sharing in the tab picker. System audio and microphone input are captured simultaneously and displayed in real time as "System" and "Me" subtitles.

4

Export Transcripts

Click "Select to Export" in the side panel. Choose sessions and export as Markdown files for later review.
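To give a sense of what an exported session might look like, here is a small Python sketch that renders one session as Markdown. The `Session` structure, its field names, and the output layout are assumptions for illustration; the actual export format is not documented here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Session:
    title: str
    # Each entry is (speaker label, text); labels follow the "System"/"Me" scheme.
    entries: List[Tuple[str, str]] = field(default_factory=list)


def session_to_markdown(session: Session) -> str:
    """Render a session as a Markdown heading followed by one bullet per entry."""
    lines = [f"# {session.title}", ""]
    for speaker, text in session.entries:
        lines.append(f"- **{speaker}:** {text}")
    return "\n".join(lines) + "\n"


demo = Session("Weekly sync", [("System", "Let's get started."), ("Me", "Sounds good.")])
print(session_to_markdown(demo))
```

Keeping one bullet per utterance makes the exported file easy to skim and to search by speaker label.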