Frequently Asked Questions

154 answers covering everything from supported video platforms and transcription accuracy to subtitle formats, translation, SEO content, pricing and privacy.

Getting started

ScribeVids is an AI platform that turns video links and uploads into accurate transcripts, multi-language subtitles (SRT, VTT, ASS), translations in 65+ languages, and SEO-ready titles, descriptions, blog posts and social copy. It is built for creators, marketers, podcasters, agencies and education teams.

No. ScribeVids runs in any modern browser. It can also be installed as a Progressive Web App on Android and desktop, and there is a native iOS app on the App Store.

You can transcribe a short video as a guest, but creating a free account unlocks longer videos, saves your history, and is required for subtitle exports and SEO content generation.

Paste any YouTube, TikTok, Instagram, Vimeo, X or LinkedIn URL on the homepage, or upload a file directly. ScribeVids fetches the audio, transcribes it with Groq Whisper-large-v3-turbo, and shows the result in under a minute for typical videos.

Yes. The free tier covers short videos and basic exports so you can try the full workflow without entering payment details.

Most videos under 30 minutes finish in 30-90 seconds because the primary engine is Groq Whisper, which runs roughly 10-20× faster than standard Whisper deployments.

Yes. ScribeVids supports Sign in with Google, Apple, LinkedIn and Twitter, plus traditional email and password accounts.

They are listed under My Videos in the navigation. You can search, filter by date or status, re-export, edit or delete any transcript from there.

Supported platforms & file types

YouTube (videos, Shorts, premieres, live replays), TikTok, Instagram (Reels, posts, stories), Vimeo, X (Twitter), and LinkedIn. New sources are added regularly.

Yes. ScribeVids accepts mp4, mov, webm, mkv, avi and most common video containers, plus mp3, m4a, wav and other audio formats. The upload limit is 2 GB on paid plans and 200 MB for guest users.

Public and unlisted videos work directly. Private videos and members-only videos cannot be fetched without authentication — download the file from YouTube Studio and upload it instead.

Yes. Public Reels work directly. For private accounts you need to provide your own Instagram cookies — see the Instagram cookies help article for the exact steps.

Yes. Upload the .mp3 or .m4a file directly, or paste any direct audio URL. The same SEO content tools that work for video work for podcast episodes.

Yes. Export the recording from Zoom or Meet as an .mp4 or .m4a file and upload it. ScribeVids will transcribe and add speaker labels if multiple voices are present.

Live streams cannot be transcribed in real time, but the saved replay (VOD) of a YouTube live, Twitch stream or X Space can be transcribed once it is public.

Yes. Once the Space replay is published, paste the X URL into ScribeVids the same way you would a tweet. Live, in-progress Spaces are not supported.

Public Vimeo videos work directly. Password-protected or domain-restricted Vimeo videos need to be downloaded first and uploaded.

Yes. ScribeVids fetches the audio track regardless of watermark, so the transcription works whether the source has a watermark or not.

Free accounts can transcribe videos up to 30 minutes, Pro up to 4 hours, and Enterprise up to 8 hours. Longer files can be split or processed in chunks.

Yes. Short-form videos transcribe in seconds, and ScribeVids automatically detects vertical/portrait aspect ratio when generating burned-in subtitles for them.

Accuracy & quality

On clear audio in supported languages, ScribeVids reaches 95%+ word accuracy. Accuracy depends on audio quality, accent, background noise and the use of technical jargon or proper names.
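
A common way to put a number on word accuracy is word error rate (WER): the word-level edit distance between a trusted reference transcript and the machine transcript, divided by the reference length. The sketch below assumes accuracy is read as 1 minus WER; the source does not say how the 95% figure is measured, so treat this as an illustration for checking your own files rather than the platform's method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown box jumps over lazy dog"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}, word accuracy: {1 - wer:.1%}")
```

Here one substitution and one dropped word against nine reference words give a WER of 2/9.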

Groq Whisper-large-v3-turbo is the primary engine. OpenAI Whisper is the automatic fallback if Groq is unavailable. Both are state-of-the-art speech-recognition models.

Groq runs Whisper on custom Language Processing Units (LPUs) optimized for inference. The model architecture is the same but it executes 10-20× faster than on standard GPUs.

Yes. Every word has a precise start and end time, which powers accurate subtitles, click-to-seek transcripts and segment-level translation.

Yes. Speaker diarization labels each segment with Speaker 1, Speaker 2 and so on. You can rename speakers in the transcript editor and the labels propagate to all exports.

Whisper models are trained on very large multilingual audio corpora (680,000 hours for the original release, more for large-v3) and handle most major regional accents well, including US, UK, Australian, Indian, Caribbean and African English. Strong dialects may need manual correction.

Common technical terms are recognized. Niche jargon, brand names and unusual proper nouns sometimes need correction. Use the transcript editor to fix them once and they propagate to all subtitle and translation outputs.

Yes. The built-in transcript editor lets you fix typos, remove filler words, polish grammar with AI assistance, label speakers and re-time segments. Edits flow through to subtitles and translations.

Word accuracy can drop to 70-85% on poor audio. For best results, reduce background noise before upload — even a quick pass through a free noise reducer noticeably improves output.

By default yes, because Whisper is verbatim. The transcript editor has a one-click filler-word remover, and the burned-in subtitle pipeline can skip them automatically.

Yes. Capitalization and punctuation are added automatically based on phrasing and pauses. AI grammar polish is also available for one-click cleanup.

Yes. The transcription quality analyzer flags low-confidence segments so you know where to focus when reviewing.

Subtitles & captions

SRT, VTT and ASS. SRT is the universal format for YouTube and most editors, VTT is the web standard for HTML5 video, and ASS supports advanced styling and positioning.
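
The practical difference between SRT and VTT is small: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a WEBVTT header line and uses a period. A minimal sketch that writes both from the same list of cues follows; the function names and the (start, end, text) tuple layout are illustrative assumptions, not the ScribeVids export code.

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Render seconds as HH:MM:SS<sep>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(cues) -> str:
    """cues: iterable of (start_seconds, end_seconds, text)."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues) -> str:
    blocks = ["WEBVTT\n"]  # required header, followed by a blank line
    for start, end, text in cues:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


cues = [
    (0.0, 2.5, "Welcome back to the channel."),
    (2.5, 5.0, "Today we cover subtitles."),
]
print(to_srt(cues))
print(to_vtt(cues))
```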

Yes. The auto-burn feature renders hard-coded captions into the video file with custom font, color, position, outline and background, and gives you the finished MP4 to download.
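
For readers who want to reproduce a basic burn-in on their own machine, the open-source FFmpeg `subtitles` filter does something comparable. The sketch below only builds the command line; it assumes FFmpeg is installed with libass support, and the file names and style values are placeholders. It is not the ScribeVids auto-burn pipeline.

```python
import shlex


def build_burn_command(video_in, srt_file, video_out, font_size=24, outline=2):
    """Build an FFmpeg command that renders an SRT file into the video frames."""
    # force_style takes ASS style overrides as comma-separated Key=Value pairs.
    style = f"FontSize={font_size},Outline={outline}"
    video_filter = f"subtitles={srt_file}:force_style='{style}'"
    # The video must be re-encoded to draw the text; the audio is copied as-is.
    return ["ffmpeg", "-i", video_in, "-vf", video_filter, "-c:a", "copy", video_out]


command = build_burn_command("talk.mp4", "talk.srt", "talk_captioned.mp4")
print(shlex.join(command))
# Run it with subprocess.run(command, check=True) once the paths are real.
```

File names containing colons, quotes or backslashes need extra escaping inside the filter string, so simple names are the safest starting point.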

In YouTube Studio go to Subtitles → choose your video → Add → Upload file → select With timing → upload the .srt file. The captions appear in YouTube within a few minutes.

Use burned-in (hard-coded) for short-form social where most people watch on mute. Use soft (.srt or .vtt) for long-form on YouTube and Vimeo so viewers can toggle them and search engines can index the text.

Yes. The auto-burn editor controls font family, size, color, outline, background, position and per-line alignment. You can save styles as presets to reuse across videos.

Yes. Word-level timestamps from Whisper are converted into per-line subtitle cues with appropriate reading-speed pacing.

Yes. The transcript editor lets you split, merge and re-time any subtitle cue, then re-export.

Subtitles are a transcription of the spoken dialogue. Closed captions add non-speech audio cues like [music playing] or [door slams] for deaf and hard-of-hearing viewers. ScribeVids exports both styles.

A common standard is 32-42 characters per line, max two lines on screen, and 1-7 seconds per cue. ScribeVids defaults to these values and you can adjust them per project.
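
To make these limits concrete, here is a minimal sketch of how word-level timestamps could be grouped into cues that respect a 42-character line, two lines per cue and a 7-second ceiling. The greedy strategy and the function names are assumptions for illustration; the source does not describe the actual segmentation algorithm.

```python
import textwrap


def finish_cue(cue_words, max_line_chars):
    """Turn a run of (text, start, end) words into one (start, end, text) cue."""
    text = " ".join(w[0] for w in cue_words)
    lines = textwrap.wrap(text, max_line_chars)
    return (cue_words[0][1], cue_words[-1][2], "\n".join(lines))


def build_cues(words, max_line_chars=42, max_lines=2, max_seconds=7.0):
    """Greedily pack words into cues until a line or duration limit is hit."""
    cues, current = [], []
    for word in words:
        candidate = current + [word]
        text = " ".join(w[0] for w in candidate)
        too_long = len(textwrap.wrap(text, max_line_chars)) > max_lines
        too_slow = candidate[-1][2] - candidate[0][1] > max_seconds
        if current and (too_long or too_slow):
            cues.append(finish_cue(current, max_line_chars))
            current = [word]  # the word that overflowed starts the next cue
        else:
            current = candidate
    if current:
        cues.append(finish_cue(current, max_line_chars))
    return cues


sentence = ("Timestamps for every word let a subtitle tool decide where each "
            "cue should start and end without guessing from audio")
# Fake word timings: one word every 0.4 s, each lasting 0.35 s.
words = [(w, i * 0.4, i * 0.4 + 0.35) for i, w in enumerate(sentence.split())]
cues = build_cues(words)
for start, end, text in cues:
    print(f"{start:.2f} -> {end:.2f}\n{text}\n")
```

A production segmenter would also prefer breaks at punctuation and pauses; this version only enforces the numeric limits.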

Yes. Generate translations for two languages and combine them into a single dual-language SRT/VTT in the export step — useful for language-learning channels.
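
Because translated segments keep the original start and end times, a dual-language file is conceptually a cue-by-cue merge. The sketch below shows that idea on in-memory cues; the (start, end, text) layout and the function name are illustrative assumptions, not the export step's real implementation.

```python
def merge_dual_language(primary, secondary):
    """Stack two translations of the same cues, primary language on top."""
    if len(primary) != len(secondary):
        raise ValueError("both languages must have the same number of cues")
    merged = []
    for (start, end, text_a), (start_b, end_b, text_b) in zip(primary, secondary):
        if (start, end) != (start_b, end_b):
            raise ValueError(f"cue timing mismatch at {start}")
        merged.append((start, end, f"{text_a}\n{text_b}"))
    return merged


english = [(0.0, 2.0, "Good morning."), (2.0, 4.5, "Let's get started.")]
spanish = [(0.0, 2.0, "Buenos días."), (2.0, 4.5, "Empecemos.")]
dual = merge_dual_language(english, spanish)
for start, end, text in dual:
    print(start, end, text.replace("\n", " / "))
```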

Yes. Use the transcript editor to select a range, then export only that range as a clip-ready SRT or VTT.

Yes. The auto-burn pipeline can produce karaoke-style word-by-word highlights and pop-on captions tuned for vertical short-form video.

Export the SRT from ScribeVids, then in Premiere use File → Import → choose the .srt; in Final Cut Pro use File → Import → Captions → choose the .srt. The subtitles appear as an editable caption track.

This usually happens when the source video and the audio track have a small offset. Use the transcript editor to nudge the entire subtitle file forward or back by a fixed offset, then re-export.
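
If you prefer to apply a fixed offset to an exported file yourself, the operation is simple arithmetic on every timing line. A minimal sketch, assuming a standard SRT file and a hypothetical helper name:

```python
import re

TIMESTAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(srt_text: str, offset_ms: int) -> str:
    """Move every cue by offset_ms (negative values make subtitles earlier)."""
    def shift(match):
        h, m, s, ms = (int(group) for group in match.groups())
        total = (h * 3600 + m * 60 + s) * 1000 + ms + offset_ms
        total = max(total, 0)  # clamp so no timestamp goes below zero
        h, rest = divmod(total, 3_600_000)
        m, rest = divmod(rest, 60_000)
        s, ms = divmod(rest, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    # Only touch timing lines, so timestamps quoted in dialogue stay intact.
    lines = [
        TIMESTAMP.sub(shift, line) if "-->" in line else line
        for line in srt_text.splitlines()
    ]
    return "\n".join(lines)


sample = "1\n00:00:01,000 --> 00:00:03,500\nHello there\n"
print(shift_srt(sample, 250))
```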

Translation

65+ languages including Spanish, French, German, Portuguese (BR and PT), Italian, Dutch, Polish, Russian, Ukrainian, Arabic, Hebrew, Turkish, Hindi, Japanese, Korean, Chinese (Simplified and Traditional), Vietnamese, Thai, Indonesian, Filipino and many more.

Translations are generated by GPT-4o using the original transcript and timestamps. Quality is comparable to a strong human translator for general content, with some loss of nuance for highly idiomatic or culturally specific material.

Yes. Pro accounts can batch 3 languages per video and Enterprise accounts can batch 50, all generated in parallel from a single transcription. Outputs include SRT, VTT and TXT for every language.

Yes. Each translated segment keeps the original start and end time so the subtitles stay in sync with the video.

Yes. The transcript itself can be exported in any target language as plain text, Word or JSON.

Yes. Arabic, Hebrew, Persian and Urdu transcripts and subtitles are generated with correct RTL flagging. Players that respect SRT/VTT direction tags will render them right-to-left automatically.

Yes. Whisper auto-detects the source language and ScribeVids will translate from any of its 99 supported languages into any of the 65+ target languages.

GPT-4o translates idioms in context rather than word-for-word, but cultural references sometimes need a human pass. Domain-specific terms can be locked using a custom glossary.

Yes. Add a glossary of "do not translate" terms in your account settings and ScribeVids will preserve them across all target languages.
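
One common way to keep glossary terms untouched is to swap them for placeholder tokens before translation and restore them afterwards. The sketch below illustrates that general technique only; how ScribeVids enforces its glossary is not documented here, and `fake_translate` is a stand-in for a real translation call.

```python
import re


def protect_terms(text, glossary):
    """Replace glossary terms with tokens a translator should leave alone."""
    placeholders = {}
    # Longest terms first, so "Auto Burn Pro" wins over "Auto Burn".
    for index, term in enumerate(sorted(glossary, key=len, reverse=True)):
        token = f"[[{index}]]"
        pattern = re.compile(rf"(?<!\w){re.escape(term)}(?!\w)")
        text, count = pattern.subn(token, text)
        if count:
            placeholders[token] = term
    return text, placeholders


def restore_terms(text, placeholders):
    for token, term in placeholders.items():
        text = text.replace(token, term)
    return text


def fake_translate(text):
    """Stand-in for a real translation call; it just upper-cases the text."""
    return text.upper()


source = "Open ScribeVids and enable Auto Burn."
masked, mapping = protect_terms(source, ["ScribeVids", "Auto Burn"])
result = restore_terms(fake_translate(masked), mapping)
print(masked)
print(result)
```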

Translation length follows transcription length: up to 4 hours on Pro and 8 hours on Enterprise. Each minute of source audio takes a few seconds to translate.

Yes, both. Choose Chinese (Simplified) for Mainland audiences and Chinese (Traditional) for Hong Kong, Taiwan and Macau audiences. Both keep the original timestamps.

Yes. Each translated language opens in the same transcript editor, so you can polish phrasing or fix specialized terms before re-exporting.

SEO content & repurposing

Optimized titles, meta descriptions, tag/keyword lists, full blog posts (1,500-2,000 words), social media captions for X, LinkedIn and Instagram, and email newsletter copy — all derived from the transcript.

Yes. The content repurposer turns a single transcript into a blog post, newsletter, LinkedIn article, Twitter thread, podcast script, or ebook chapter, each tuned for that platform.

Yes. AI chapter generation analyzes the transcript and produces titled timestamps you can paste into a YouTube description for automatic chapter markers.
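
YouTube only turns description timestamps into chapters when the list starts at 0:00, contains at least three entries in ascending order, and every chapter lasts at least 10 seconds. A small sketch that checks those rules and formats the lines, assuming chapters arrive as hypothetical (seconds, title) pairs:

```python
def format_chapters(chapters):
    """chapters: list of (start_seconds, title). Returns description text."""
    chapters = sorted(chapters)
    if len(chapters) < 3:
        raise ValueError("YouTube needs at least three chapters")
    if chapters[0][0] != 0:
        raise ValueError("the first chapter must start at 0:00")
    for (start, _), (next_start, _) in zip(chapters, chapters[1:]):
        if next_start - start < 10:
            raise ValueError("each chapter must be at least 10 seconds long")
    lines = []
    for seconds, title in chapters:
        hours, rest = divmod(int(seconds), 3600)
        minutes, secs = divmod(rest, 60)
        stamp = f"{hours}:{minutes:02d}:{secs:02d}" if hours else f"{minutes}:{secs:02d}"
        lines.append(f"{stamp} {title}")
    return "\n".join(lines)


description_block = format_chapters([(0, "Intro"), (75, "Setup"), (3725, "Q&A")])
print(description_block)
```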

Yes. The keyword research tool surfaces the dominant topics, entities and terms from your transcript so you can target them in titles, descriptions and tags.

Yes. The summarizer produces three formats: an executive summary, a bullet-point list of key takeaways, and an action-items list — perfect for newsletters and LinkedIn posts.

Yes. The SEO metadata generator writes a keyword-rich title, full description, hashtag set and tags optimized for YouTube discovery, plus chapter markers.

Yes. Hashtag suggestions are platform-aware and limited to the optimal count for each network (around 30 for Instagram, 4-6 for TikTok, 1-2 for LinkedIn).

Yes. The repurposer outputs a sequenced thread with hooks, body tweets and a CTA, sized for the 280-character limit.
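
Sizing text for the 280-character limit is mostly a word-wrapping problem with room reserved for a position marker. The sketch below shows the mechanical part only (no hooks or CTA), and the function name is illustrative. X weights some characters such as URLs and emoji differently from Python's `len`, so treat the count as an approximation.

```python
import textwrap


def split_into_thread(text, limit=280):
    """Split text on word boundaries into numbered posts within the limit."""
    suffix_room = len(" (99/99)")  # reserve space for the longest marker
    chunks = textwrap.wrap(text, width=limit - suffix_room)
    if len(chunks) > 99:
        raise ValueError("text is too long for a two-digit thread counter")
    total = len(chunks)
    return [f"{chunk} ({i}/{total})" for i, chunk in enumerate(chunks, start=1)]


transcript_summary = "Clear captions help viewers follow along on mute. " * 20
thread = split_into_thread(transcript_summary)
for post in thread:
    print(len(post), post[:60] + "...")
```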

Yes. The repurposer outputs structured show notes with episode summary, key timestamps, guest bios and links — ready to paste into your podcast host.

Use the keyword research and SEO metadata tools to write a keyword-targeted title and description, upload the SRT for an indexable transcript, and add the AI-generated chapters. The full workflow is in the YouTube transcription guide.

Yes. When the transcript is published as text on a page (or as a VTT/SRT track), Google can index it and may surface it in regular and video search results.

Yes. The blog-post repurposer generates a long-form, structured article with H2/H3 headings, internal subheading anchors and natural keyword usage.

Yes. The video insights dashboard tracks views, engagement, sentiment and virality potential per video and across batches.

Yes. The video research hub compares your transcripts against competitor videos and trending topics to surface unanswered questions and missing keywords.

Bulk processing & exports

Yes. Pro accounts can queue up to 50 videos per batch and Enterprise up to 100. Each gets the full transcript, subtitle and SEO bundle generated in parallel.

Plain text, SRT, VTT, ASS, CSV, JSON, Excel (.xlsx), Word (.docx), and a single ZIP that bundles every output for one or many videos.

Yes. The export manager can package every transcript, subtitle file and SEO bundle for an entire batch into one ZIP download.

Yes. Paste a list of URLs or upload a one-column CSV in the bulk processor and ScribeVids queues every row for processing.
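
A one-column CSV is simply one URL per row. If your links live in a spreadsheet or script, a sketch like the following can produce a clean file; it assumes no header row is needed, which the source does not specify, and the example URLs are placeholders.

```python
import csv
import io
from urllib.parse import urlparse


def urls_to_csv(urls):
    """Return one-column CSV text with whitespace trimmed and duplicates dropped."""
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for url in urls:
        url = url.strip()
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not a valid web URL: {url!r}")
        if url not in seen:
            seen.add(url)
            writer.writerow([url])
    return buffer.getvalue()


csv_text = urls_to_csv([
    "https://video.example/watch/1",
    " https://video.example/watch/1 ",
    "https://video.example/watch/2",
])
print(csv_text)
```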

Yes. Pro and Enterprise accounts can submit a batch and let it process unattended. Email notifications fire when each video and the full batch complete.

Free covers a small monthly quota, Pro covers heavy daily usage with generous monthly limits, and Enterprise is essentially unmetered for fair-use volumes.

Yes. The Word export includes formatted speaker labels, timestamps and paragraph breaks ready for editing or sharing.

Yes. JSON exports include full word-level timestamps, speaker labels, segments, and metadata so you can feed them into your own tools or pipelines.
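
As an example of feeding an export into your own tooling, the sketch below totals speaking time per speaker. The field names (`segments`, `speaker`, `start`, `end`, `text`) are hypothetical; check a real export for the actual schema before relying on them.

```python
import json
from collections import defaultdict

# Hypothetical export shape, used here for illustration only.
sample_export = """
{
  "segments": [
    {"speaker": "Speaker 1", "start": 0.0, "end": 4.0, "text": "Welcome to the show."},
    {"speaker": "Speaker 2", "start": 4.0, "end": 6.5, "text": "Thanks for having me."},
    {"speaker": "Speaker 1", "start": 6.5, "end": 10.0, "text": "Let's dive in."}
  ]
}
"""


def speaking_time(export_json: str) -> dict:
    """Sum segment durations (in seconds) for each speaker label."""
    data = json.loads(export_json)
    totals = defaultdict(float)
    for segment in data["segments"]:
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)


totals = speaking_time(sample_export)
for speaker, seconds in totals.items():
    print(f"{speaker}: {seconds:.1f} s")
```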

Pricing & plans

Free for short videos and basic features. Pro is $19/month and includes bulk processing, multi-language subtitles, SEO content and priority queue. Enterprise is $99/month and unlocks 50-language batches, team seats, larger uploads and priority support.

Yes. Annual subscriptions are billed at a reduced effective monthly rate. Pricing details are on the pricing page.

Yes. Subscriptions can be cancelled at any time from your account dashboard. Access continues until the end of the billing period.

Refund eligibility depends on usage and is reviewed on a per-case basis. Contact support within 14 days of purchase to request a refund.

The free tier serves as an open-ended trial of the core features. Pro features can be tested with promo codes — contact support if you are evaluating ScribeVids for a team.

Yes. Verified students, educators and registered non-profits get 50% off any plan. Contact support with proof of status to apply the discount.

All major credit and debit cards, plus Apple Pay, Google Pay and Stripe-supported regional payment methods. Enterprise contracts can also be paid by invoice.

Yes. Subscriptions auto-renew at the end of each billing period unless you cancel beforehand. You can cancel at any time and keep access until the period ends.

No, ScribeVids is subscription-only because the underlying AI model and platform costs are recurring. Annual billing offers the largest savings.

Yes. Plan changes are pro-rated. Upgrades take effect immediately and unlock new features instantly; downgrades apply at the end of the current billing period.

Privacy, security & data

Yes. Every transcript, subtitle and uploaded file stays inside your account. ScribeVids does not share or resell your data, and does not use it to train any model.

Transcripts and metadata are stored in a managed PostgreSQL database hosted in the United States. Uploaded video files are processed for audio extraction then deleted within minutes.

Uploaded video files are deleted immediately after audio extraction. Only the audio (used for transcription) and the resulting transcript are retained.

Yes. EU users can request data export or deletion at any time from their account. Cookies are first-party and analytics use a privacy-focused, cookieless tool.

No. Your transcripts are not used to train any AI model, whether ScribeVids' own systems or those of a third-party provider.

Yes. Account deletion from the dashboard removes your user record, all transcripts, all uploaded artifacts and all subscription history.

Yes. All traffic is served over HTTPS with TLS 1.2+ and HSTS. Database connections and storage are encrypted at rest.

Yes. Passwords are hashed using a modern, salted hashing algorithm. Plaintext passwords are never stored or logged.
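
For readers unfamiliar with the term, salted hashing means each password is combined with a unique random value before being run through a deliberately slow one-way function, so identical passwords never produce identical stored records. The sketch below uses scrypt from Python's standard library purely as an illustration; the source does not name the algorithm ScribeVids uses, and the parameters are common example values.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters; real systems tune these to their hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); only these are stored, never the password."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, record: tuple[bytes, bytes]) -> bool:
    salt, expected = record
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)


record = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", record))
print(verify_password("wrong guess", record))
```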

Yes. ScribeVids supports COPPA (under-13 protections), CCPA (California consumer rights), DMCA takedown processes and other major content compliance frameworks.

Yes. Enterprise customers can sign a DPA covering GDPR Article 28 obligations. Contact support to request the current template.

Troubleshooting

The most common causes are: the video is private or members-only, the video has age restrictions, or YouTube is rate-limiting requests. Try waiting a minute and retrying, or download the file and upload it directly.

Whisper sometimes misspells unusual proper names. Use the transcript editor to correct them once — fixes propagate to all subtitle exports and translations.

Yes, but accuracy will drop. For best results, ensure the audio has clear voice and minimal background music or noise. Re-recording or noise-reducing the file before upload usually helps more than any post-processing.

Email support through the contact page. Pro and Enterprise users get priority response times. Bug reports with the job ID help us debug fastest.

Very large files can take a few minutes to upload before processing starts. If a job is stuck for more than 30 minutes, the watchdog auto-marks it failed and you can retry. Contact support with the job ID if it keeps happening.

This usually means the transcription completed but the audio had no detectable speech. Try a different segment of the video or contact support with the job ID.

TikTok occasionally rate-limits scrapers. Wait a few minutes and try again. If it persists, download the TikTok video to your device and upload the file directly.

Default line length targets 32-42 characters per line. Adjust the max characters per line or the characters-per-second reading speed in the auto-burn settings, then re-export.

Whisper auto-detects the language. For mixed-language audio or short clips, manually set the source language in the job settings before submitting.

ScribeVids enforces per-user rate limits to keep the service fast for everyone. If you hit a limit, wait a few minutes or upgrade for higher quotas.

Check your spam folder and add no-reply@scribevids.com to your contacts. Corporate mail filters sometimes block incoming emails; contact support if the issue persists.

Stripe declines come from your card issuer. Try a different card, contact your bank, or use Apple Pay / Google Pay instead. ScribeVids never sees your card details directly.

Account & team management

Enterprise accounts can invite team members from the Team page in account settings. Each invited user gets their own login and shared access to project workspaces.

Owner, Admin, Editor and Viewer. Owners and Admins can manage billing and members; Editors can create and edit transcripts; Viewers can only read and export.

Yes. The Collaboration Hub lets teams group videos into projects, assign tasks, leave comments and approve translations and subtitle edits in a review workflow.

In Account Settings, change the email and confirm via the verification link sent to the new address. Existing transcripts and subscription remain attached to your account.

On the login page click Forgot password. A reset link is emailed within a minute. The link expires after 30 minutes for security.

Yes — most users keep one account per email address. For agencies managing many client brands, Enterprise accounts support per-project isolation inside a single login.

Two-factor authentication is available in Account Settings → Security. Use any TOTP authenticator app like 1Password, Authy or Google Authenticator.

Yes. Account Settings → Privacy → Request data export produces a ZIP of every transcript, subtitle, translation and subscription record on file. This export supports GDPR data-access and portability requests.

API, integrations & automation

Yes — a REST API is available for Enterprise plans, covering job submission, status polling, transcript retrieval, translation and export. API keys are managed in Account Settings.
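
Status polling usually follows a submit, wait, retrieve pattern with a growing delay between checks. The sketch below shows only the waiting loop and takes the status check as a function you supply, because the real endpoints, status names and authentication are not documented here; `"completed"` and `"failed"` are assumed values.

```python
import time


def wait_for_job(fetch_status, timeout=600.0, first_delay=2.0, max_delay=30.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() with exponential backoff until the job finishes."""
    deadline = clock() + timeout
    delay = first_delay
    while True:
        status = fetch_status()
        if status in ("completed", "failed"):  # assumed terminal states
            return status
        if clock() + delay > deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(delay)
        delay = min(delay * 2, max_delay)  # back off, up to a ceiling


# Simulated job that finishes on the third check; no network involved.
fake_statuses = iter(["queued", "processing", "completed"])
final_status = wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(final_status)
```

In real use, `fetch_status` would wrap an authenticated HTTP request to the job-status endpoint from the API documentation.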

Yes. The REST API works with any automation platform that supports webhooks and HTTP requests, including Zapier, Make (Integromat) and n8n.

Yes. Configure a webhook URL in Account Settings → API and ScribeVids will POST a JSON payload when each transcription, translation or batch completes.

Yes. Use the API plus a Zapier or Make recipe to forward completed transcripts and SRT files to WordPress, Webflow, Ghost, YouTube or anywhere else.

Not natively — but the webhook payload works directly with any Slack incoming webhook URL, so you can announce completed jobs in seconds.
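
If you want to control the wording of the Slack announcement, a small relay can reshape the completion payload into Slack's incoming-webhook format, which is a JSON body with a `text` field. In this sketch the payload keys (`event`, `job_id`, `title`) are hypothetical placeholders, and the network call is defined but not executed.

```python
import json
import urllib.request


def to_slack_message(payload: dict) -> dict:
    """Build a Slack incoming-webhook body from a job-completion payload."""
    title = payload.get("title", "Untitled video")    # hypothetical field
    event = payload.get("event", "job.completed")     # hypothetical field
    job_id = payload.get("job_id", "unknown")         # hypothetical field
    return {"text": f'ScribeVids: {event} for "{title}" (job {job_id})'}


def post_to_slack(webhook_url: str, message: dict) -> int:
    """POST the message to a Slack incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


example_payload = {"event": "transcription.completed", "job_id": "abc123", "title": "Demo"}
message = to_slack_message(example_payload)
print(json.dumps(message))
# post_to_slack("<your Slack webhook URL>", message)  # uncomment with a real URL
```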

Self-hosting is not currently offered. Enterprise customers can request dedicated processing capacity in our managed cloud instead.

Yes. Default API rate limits depend on your plan tier. Enterprise customers can negotiate higher limits or burst quotas for high-volume workflows.

Yes — all payments are processed through Stripe. ScribeVids never sees or stores card numbers.

Mobile & PWA

Yes. ScribeVids is on the iOS App Store as a native app, and is installable as a Progressive Web App on Android and any modern desktop browser.

Open scribevids.com in Chrome, tap the menu, then Add to Home screen. The app installs an icon that launches in standalone mode like any native app.

In Chrome, Edge or Brave, click the Install icon in the address bar (or the menu → Install ScribeVids). The app gets its own window, icon and dock shortcut.

Yes. Allow notifications when prompted and you will get push alerts when jobs complete and when team members assign you a task.

On phones the workflow is intentionally focused on transcription, subtitles and translation — the most-used mobile actions. SEO, bulk processing and analytics tools live in the desktop and tablet view for screen-real-estate reasons.

Yes. The mobile app has a native audio recorder for voice notes and interviews — recordings transcribe in seconds.

You can browse cached transcripts offline, but new transcriptions and translations need an internet connection because they run on cloud AI models.

Tap Share on any transcript to use the native iOS or Android share sheet — send by email, message, or any installed app.

Use cases & workflows

Yes. The standard YouTuber workflow is: transcribe your video, upload the SRT for indexable closed captions, paste the AI-generated chapters into the description, and publish the full SEO blog post on your site for backlinks. The YouTube guide walks through every step.

Yes. Upload the episode audio, get a clean transcript, generate show notes, pull out 5-10 highlight quotes for social media, and translate the transcript for international audiences — all in one workflow.

Yes. Bulk processing, team workspaces, white-label exports and per-client project isolation make it well-suited to agencies managing many client video libraries.

Yes. The Learning Management module turns any video lecture into a course module with auto-generated quizzes, summaries and certificates. Subtitles also make courses accessible.

Yes. Speaker diarization labels each interviewee, the transcript editor makes pull-quote selection fast, and word-level timestamps make it easy to verify quotes against the source recording.

Yes. Captioning every video with accurate, time-synced subtitles and providing a downloadable transcript is the foundation of WCAG 2.1 AA video accessibility.

Yes. Bulk processing transcribes dozens of interviews in parallel, and JSON exports plug straight into qualitative analysis tools like NVivo, Atlas.ti or Dedoose.

Yes. Upload the recording, get a transcript, generate chapter timestamps for the replay, write a recap blog post, and create translated subtitles for global attendees.

Yes. Upload sales call recordings (with consent) and get fully-searchable transcripts. The summary and action-items outputs work especially well for CRM follow-ups.

Yes. Use the highlight summarizer to identify the strongest 30-60 second moments, then export burned-in vertical subtitles for those time ranges.

Comparisons

ScribeVids is built around video, multilingual subtitles and SEO content — not just meeting transcription. Otter.ai is stronger for live meeting capture; ScribeVids is stronger for video URLs, multi-language subtitle batches and creator workflows. Full comparison is in the help center.

Rev offers human transcription at a higher price and with slower turnaround (24+ hours). ScribeVids uses AI for near-real-time results at a fraction of the cost, with a transcript editor to close the small accuracy gap.

Descript is a video and podcast editor with transcription built in. ScribeVids focuses on transcription, multilingual subtitles, translation and SEO content — not video editing. The two tools complement each other.

Local Whisper is free but requires a powerful machine, manual setup, and gives you only the raw transcript. ScribeVids adds the 10-20× faster Groq engine, multilingual subtitles, translation in 65+ languages, SEO content generation, team collaboration and exports — all in a browser.

YouTube auto-captions are free but have known accuracy gaps, lack speaker labels, and only output to YouTube. ScribeVids is more accurate, supports 65+ language translations, exports SRT/VTT/ASS, and works for any platform — not just YouTube.

Both are AI-first transcription tools. ScribeVids is faster (Groq Whisper) and includes multilingual subtitle batching and SEO content out of the box. Sonix has a deeper editor but a higher per-minute cost.

Trint is enterprise-priced with strong newsroom features. ScribeVids covers similar transcription and translation needs at a much lower price point and adds creator-focused SEO and subtitle tools.

Free tools usually stop at the transcript. ScribeVids covers the full pipeline — accurate transcription, multi-language subtitles, translation, SEO content, bulk processing and exports — so you do not have to stitch together five tools.

Still have questions?

Contact support or jump straight in and try ScribeVids free.