How to Transcribe Audio to Text (Video Transcription Tutorial!)

– Are you looking to
transcribe your videos or your podcasts so you can easily repurpose your content? Well in this video,
we’re gonna take a look at how to transcribe audio to text and the best options to
do it, both free and paid, so that you’re covered, no
the description box below. So let’s jump into it. Transcribing videos or podcasts is a great way to easily repurpose your existing content
or simplify the process of creating video or blog descriptions. Now there’s a lot of options out there when you are looking to transcribe speech to text in any format ranging from free transcription software to paid automated services and manual transcriptions as well and there can also be big differences in the accuracy of the output
that you’ll receive back as well as the cost of each option. Depending on your project and your budget, each solution does have its place. I use a mix for different projects and depending on the
level of accuracy required and how fast we actually
need the transcription returned to us. So in this video, we’ll
run through our favorite options for transcribing
videos on every budget and any project, no matter how accurate you need your results. While we’re running through our picks, let me know in the comments if you get your audio or video transcribed and what your favorite
transcription services are and why. There’s new ones popping up all the time so your suggestion might help
the rest of the community and I’d definitely be interested in seeing what else is out there, as well. So we’re gonna start off
with the paid services first and then move into the free options. Now it is important to note across all transcribing options and services, that transcribing works best
the clearer the audio is. So ideally you’re providing audio or video with minimal background
noise and no music, so just talking videos, ideally
to get the best results. Now obviously, that doesn’t cover off every transcribing solution,
but wherever possible, try to have your audio, the spoken parts, as clear as possible. So with the paid options, in most cases, it is the case of the more you pay or the more expensive
the service offering is, the higher the accuracy of the results. So looking at the paid options, they really fall into two categories. Manual and automated. Now in this automated category, there’s also two subcategories. So you can actually
automate your transcribing through websites or you can
actually use computer software. Obviously the websites are platforms where you just upload your documents to a website and that will process everything for you and
you will be sent back your transcription. Software, on the other hand, is something that you’ll need to pay for and download and install on your computer and run your transcribing
through that, that way. So we’ll start out, then,
looking at the websites and with the websites
you’ve got things like Temi or Spext and how these websites work is that once you’ve created an account, you can send through
either a link to your video or you can upload your entire video or you can upload an
MP3 or an audio version of your video for the
website or web platform to transcribe your audio from there. So generally, again, with
these software solutions, the turnaround time is
normally really, really fast as there’s no human element in here, you’re not gonna be waiting for someone to sit there, download, and play through your audio or your video
file to type it all out. It is all automated, so in most cases with these services, your transcribing will actually start immediately. So they’re really, really fast to get the file back to you at the end. But a big thing to note with these software platforms or services, is that they really need clear audio to be able to get better
results or good results. If you’ve got things
like background music, or there’s a heap of background noise, if the people talking have strong accents or there’s a lot of wind,
or anything like that that’s really gonna start
to drown out your audio, the spoken parts, in your
video, then it’s gonna make it really hard for the software to decipher what you’re actually saying. But on the plus side,
besides being really fast, these software platforms
are normally really cheap. So you’ve got Temi that starts
at around 10 cents per minute and you’ve got Spext that starts
around 25 cents per minute and there’s a heap of other
solutions out there as well, but these two would be
my top recommendations of where you should start out. So for me, I’m a big fan of Temi and I’ll use it on projects where accuracy isn’t too important, but where it makes it handy, especially on long-form editing projects like documentaries or corporate work where it’s handy to have
a full transcription of the videos that you’re working on, the videos that you’re
editing and cutting down to be able to quickly find things. Where did this person say this? Okay that’s here, find
that in the timeline really, really quickly. So for those sorts of things
where it doesn’t matter if there is a few typos
or even a few sentences that are out of whack in there somewhere, then that’s where Temi is perfect for those type of projects, but I wouldn’t use it if I really needed a high level of accuracy
on the transcription or if the videos I had already had background music or a
heap of people speaking in there, it’s just
gonna be too complicated and the results aren’t gonna be fantastic. The next option you’ve
got for automatically transcribing your videos
is to use software that you can install on your computer. Now really this software
is just an interface between some of these web platforms, but it will give you a
great amount of control. In a lot of cases, it will also give you a higher level of accuracy, as well. Now once again, there are a few applications out there for doing this, but the standout and the
best one that I’ve seen in a long time is a plugin
for Adobe Premiere Pro so, sorry but this is only
for Adobe Premiere Pro users, but it’s called
Transcriptive and it’s from a company called Digital Anarchy. It sells for $299, which
may seem expensive, but stick with me here
because this will give you up to 95% accuracy, it
can transcribe 60 minutes of video in around 10
minutes which is insane, and because it’s built into Adobe Premiere, it means that the whole
exporting and uploading and everything is automated
through Adobe Premiere. And it means once you get
your transcribed video back, it’s already inside of Adobe Premiere that you can use as either
markers in your video, or reference points where it makes it so much easier to find
segments in your video because it can be linked
directly with time code to your video file, so
it’s an amazing tool. So how it works is it connects through to either IBM’s A.I. engine called Watson, or a platform called Speechmatics. Now out of the two of those,
Watson is less accurate but it actually gives
you 1,000 minutes free transcribing per month and after that, if you’re doing a heap, then
it’s only two cents per minute for any additional transcribing. The other option, Speechmatics is the much better solution. Now this is where you get
up to that 95% accuracy and in all of our tests it actually works on noisy video files as well or video files with a
heap of background noise and background music and
I was really blown away with the accuracy on this. And Speechmatics only works out at around seven cents per minute to transcribe your video files. So when you’re using this
Transcriptive plugin, you can use it to easily create captions, subtitles, word documents, it works with multiple presenters as well, and it is really, really fast. We did a test on both
Watson and on Speechmatics with one of our YouTube videos, just to see how it would go so it had the background music
and everything in there just as one of these complete videos does, and the results were
almost chalk and cheese. For any quiet parts where
there was no background music, Watson did okay, but across the board, Speechmatics was far more accurate and it was actually much faster to do the transcribing as well. It actually returned the 24 minute, 22 or 24 minute video back in under a couple of minutes, so that’s just crazy. So obviously at that price point of $299 for this plugin and having it only work in Adobe Premier, it’s a great solution if you’re using Adobe
Premier and doing a heap of long-form editing projects, one’s where, literally,
with a click of a button, you can transcribe your entire sequence or project that you’re working
on in a matter of minutes. So it’s an amazing tool for transcribing your edits or your dailies for things like documentaries, short-films,
long-form video projects or long-form sales videos where it’s much easier to work off a paper edit with clients, or your team, than trying to find things manually inside of your editing project. But obviously, if you’re not using Adobe Premiere Pro, or
you don’t wanna spend the $299 upfront on your transcribing, then this won’t be the
best solution for you. So those are the paid
options that we recommend you check out and there’s a
couple of free options as well. Now, obviously as I said in the beginning, the free ones definitely don’t have the accuracy that you will
have in the paid solutions, but they still might be enough
to do exactly what you want. So first off, on the free side, is YouTube’s Auto Transcribe function. Now how this works is when you upload your video to YouTube,
it’s not always instant, can take up to 12 hours, and in some cases I’ve heard it taking longer. YouTube will automatically
transcribe your videos. Once that’s actually done, you can log in and you can download that text that it’s actually
transcribed from your video and copy and paste it into a word document or anywhere else that you
wanna use it from there. So that’s one way where
you actually don’t even need to do anything. The other way is to use something like Google Voice or Siri to do
the transcribing for you. Now we did do a video
quite a long time ago. It was one of the first videos
on this channel I think. Talking about transcribing your videos using Google Voice and Google Docs. If you haven’t seen it
yet, I’ll put a link up in the cards, so all you need to do is open some sort of text app, a notes app or some sort of word processing app on your smartphone, and
instead of typing in on the keyboard, you’ll
press the microphone button, which is for voice input. So all you need to do then is press that microphone button, hold your phone up to a computer or something
that is playing through the video that you wanna transcribe and the transcribing will start. Now obviously here, this is again, really dependent on the amount of background noise in the area that you’re going to be recording through, but also in the video, itself. If there is a heap of music in it, then this really isn’t
going to work too well, but if the music is pretty quiet, or there’s no music at all, and minimal background noise, then you can actually get
some pretty decent results. So that’s using your phone, but in most cases you can actually get better results using your
desktop computer, as well. Now this is something that
we’ve found works best in Google Chrome and all you need to do is open up Google Drive,
create a new Google Document, select Tools at the top and
then go down to voice typing. Then all you need to do
is hit that microphone and then hit play on a
video that is also open on your computer and the
transcribing will start. Now depending on how
your computer is set up and how your microphones are set up, in some cases, you
might get better results playing your video file through your smartphone and holding that up near your microphone on your computer and transcribing that way. One of the biggest things we’ve seen with comments on our previous video talking through this, was that if you have any sort of accent or
if your Google account isn’t set to English as
your primary language, then you may not get good
results with this process or in some cases, you may not have access to the feature at all. So again, you need to be using Chrome on Mac or PC and you need to have your primary language
setting set to English in your Google account. So those are our top recommendations for automated solutions when it comes to transcribing. So now we’re gonna look
at the manual options. These are the ones that
are done by a human. The biggest benefit here is
the higher level of accuracy. But the downside is that generally these will take longer to do. So for this, your first option is to head over to Fiverr or Upwork.com and list a job and
there’ll be heaps of people on there that are willing to transcribe your video for not a lot of money, but really in our experience, we found some really, really good ones, and really cheap ones, and in other cases, the communication has been terrible and the end result hasn’t
been that great either. So they definitely can be hit and miss. Now there are lots of websites dedicated to this as well, where your transcribing will be done by a human and with some sort of quality control in there as well and our pick is currently Rev.com. So once again, how it works is that you’ll create an account with Rev.com. You’ll either link to your video, you’ll upload your
video, or you will upload an audio file that you wanna transcribe. You’ll submit that to Rev.com as a job and within 12 hours, you’ll have your transcribed audio or video back to you. So the big, big pros with this is that it is done by a
human, so we’ve already said the accuracy is much, much better. There is quality control in here. There is a review process,
if you’re not happy you can go back and get
them to make changes. The range or the options that you have on outputs, you can
request different versions. Some with time codes,
some as closed captions, some as subtitles, the
options that you have to receive your transcribed
files back is huge. Even if you want a word document that’s laid out in a particular way, then they would do that as well. Because you actually
get to provide a brief with your project or task telling them exactly what you want. So the cost for Rev.com is $1 per minute. So it is a little bit more expensive than the other automated options, but given the fact that there is that quality control, that
you are actually getting your audio or your video transcribed by a real person that
can understand English and different accents is huge. And your end product is, in most cases, going to be a lot better. And also, for any video or audio that has multiple speakers
or multiple presenters, I’ll hands down go with Rev.com because of that much better accuracy. The moment you’re throwing
in different speakers and different accents and things, it does make a huge difference, even though some of
those automated services did say they supported multiple people, multiple speakers, the output definitely isn’t as good when you’ve got them in your video or audio. Now we do actually have
a full walk-through on how we use Rev.com
for our YouTube videos and I will put a link
up in the cards as well. So as you can see there’s
quite a few options out there to get your
videos or your audio files transcribed either
automatically, manually, or done through your phone. So for us, we really just use a mix of those solutions
depending on what we need, the accuracy that we need and how fast we need that transcription back. So for something really, really quick, where if the accuracy isn’t 100%, doesn’t really matter, we can make those minor changes, I would either use my smartphone, or use Google Docs to do the transcribing,
or I would use Temi for 10 cents per minute. For really large video editing projects where the accuracy
doesn’t need to be 100%, but we’re gonna be constantly changing and updating those documents to be able to create new
ones really, really fast off the edits as we’re going, I would hands down use
the Transcriptive plugin for Adobe Premiere Pro and I would use Speechmatics inside of that. And for anything where we 100% need the accuracy, for anything
that’s going out public, instead of just internal editing or working with teams, then it hands down goes to Rev.com. So while it does take a little bit longer than some of these really
fast online solutions, I’m happy to wait because
it’s not that long, but the level of accuracy, and also the level of formats and control that you have
over the transcribing process is really second to none
and you can really get almost anything you want created by them once they’ve actually
done the transcribing. So whether it is a subtitle, captions, word document, formatted
exactly how you want it, then, yeah, it’s Rev.com. And even for every video
on this YouTube channel all our subtitles or
captions are actually done through Rev.com as well. Now transcribing your YouTube content can also help you boost
your YouTube rankings. We put together a video
which is linked on screen that explains how it works and why you should be doing it and exactly how we create subtitles for our YouTube content
to help you get started. So make sure you check it out and I’ll see you soon.
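
If you want to compare what a given video would cost across the options above, here is a rough Python sketch. It only uses the per-minute prices quoted in this video, which may well have changed since. The function and variable names are made up for illustration, and it leaves out the one-off $299 for the Transcriptive plugin that you need before you can use Watson or Speechmatics this way.

```python
# Rates in US cents per minute, as quoted in the video (they may be out of date).
RATES_CENTS_PER_MINUTE = {
    "Temi": 10,
    "Spext": 25,
    "Speechmatics": 7,
    "Rev.com": 100,
}

# Watson, as described in the video: 1,000 free minutes per month,
# then 2 cents per minute for anything beyond that.
WATSON_FREE_MINUTES = 1000
WATSON_CENTS_PER_MINUTE = 2


def watson_cost_cents(minutes_this_month):
    """Cost in cents for Watson, given total minutes transcribed this month."""
    billable = max(0, minutes_this_month - WATSON_FREE_MINUTES)
    return billable * WATSON_CENTS_PER_MINUTE


def estimate_costs_cents(minutes):
    """Return a dict of service name -> estimated cost in cents."""
    costs = {name: rate * minutes for name, rate in RATES_CENTS_PER_MINUTE.items()}
    # Assumes these are the only Watson minutes used this month.
    costs["Watson"] = watson_cost_cents(minutes)
    return costs


if __name__ == "__main__":
    # Print the options for a 60 minute video, cheapest first.
    for name, cents in sorted(estimate_costs_cents(60).items(), key=lambda kv: kv[1]):
        print(f"{name:<14} ${cents / 100:.2f}")
```

Working in whole cents keeps the arithmetic exact. Price is only one part of the decision, so weigh it against the accuracy and turnaround points covered earlier.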

