Friday, May 30, 2014

Watch Out for Fake Captioning!

What is #fakecaptioning? Fake captioning is when a producer does not edit the automatic captions generated by YouTube but instead simply downloads the automatic caption transcript and uploads it again unchanged. YouTube cannot tell the difference between an edited autocaption transcript and an unedited one. If the captions are seen as "edited," YouTube automatically indexes them, applies a CC icon to the video's listing, AND uses the captions in search engine results.

So if a producer is engaging in #fakecaptioning, they are cheating. They are getting a search engine advantage that they do not deserve.  Plus, they are fooling the people who really need the captions - deaf and hard of hearing people - into thinking that they are truly captioning their videos when they are not!

The popular YouTube news channel The Young Turks is guilty of engaging in #fakecaptioning. Below is a sample of The Young Turks' video listing page, showing the CC icon on some of their videos (they aren't even captioning consistently!).


Caption Action 2 has teamed up with Subtitle YouTube (a group of volunteers who caption popular videos when the producers won't) to bring you this blog post. Below are two videos. The first video is the video from The Young Turks channel on YouTube, titled "The Heart-Warming Science Of Gay Dad Brains." The second video is the same video captioned by Subtitle YouTube. Click on the image to view the video (we are not able to embed it). It should be obvious from the very start of the video that The Young Turks is using the automatic captioning without any editing and passing it off as real captions - #FAKECAPTIONING!

Just a few of the indicators that it is actually an unedited autocaption transcript, and these are from the first 30 seconds of the video:
  • "game" fathers instead of gay
  • repeats the word "patterns" in the same caption ("patterns brain patterns")
  • no punctuation at all - cannot tell where one sentence ends and another begins. ("straight men and straight women this is really a fascinating study because") There is no break between "women" and "this."
  • the inclusion of a name in place of a word. ("what the researchers did Anna")
  • no correction for grammar. ("brains are new straight fathers")
  • short, stand-alone phrases hanging on the screen. ("the brains" appears all by itself)
  • incomprehensible statements ("love new gave others")
...and that's in just the first 30 seconds. Now watch the video from Amara below to see the difference it makes when the autocaptions are edited.


http://www.amara.org/en/videos/Kw7SQ7neM7Fu/info/the-heart-warming-science-of-gay-dad-brains/?tab=video
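Two of the indicators listed above, the complete lack of punctuation and words repeated close together, can be checked roughly by a script. Below is a minimal Python sketch of such a check. It is only an illustration: the function name, the two-word repeat window, and the ten-word minimum are our own assumptions, it is not how YouTube decides anything, and it cannot catch the other problems (wrong words, bad grammar, nonsense phrases) that only a human reader will notice.

```python
import re
import string


def autocaption_indicators(caption_text, min_words=10):
    """Count rough signs that a transcript is an unedited autocaption.

    Hypothetical helper for illustration only. It checks two indicators:
    missing sentence punctuation and a word repeated within the next
    two words (as in "patterns brain patterns").
    """
    # Normalize each word: drop surrounding punctuation, lowercase it.
    words = [w.strip(string.punctuation).lower() for w in caption_text.split()]
    words = [w for w in words if w]

    # Count sentence-ending punctuation marks in the raw text.
    punctuation_marks = len(re.findall(r"[.!?]", caption_text))

    # Count words that show up again within the next two words.
    nearby_repeats = sum(
        1 for i, word in enumerate(words) if word in words[i + 1:i + 3]
    )

    return {
        "word_count": len(words),
        "punctuation_marks": punctuation_marks,
        "nearby_repeats": nearby_repeats,
        # Assumption: a passage of min_words or more with no sentence
        # punctuation at all is suspicious.
        "looks_unedited": len(words) >= min_words and punctuation_marks == 0,
    }


if __name__ == "__main__":
    sample = (
        "straight men and straight women this is really a fascinating "
        "study because patterns brain patterns"
    )
    print(autocaption_indicators(sample))
```

A result with "looks_unedited" set to True is only a hint that a transcript deserves a closer look by a person, not proof of #fakecaptioning.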

If this upsets you - and we do not want #fakecaptioning to spread - let The Young Turks know how you feel, either by commenting on any posting on their Facebook page or by tweeting at their Twitter account! Use the hashtag #fakecaptioning in your tweet or Facebook posting.

Why should you contact @TheYoungTurks or post a comment on their Facebook page? Here are some good reasons, and the reasons are different depending on who you are:

  • If you are deaf/hard of hearing: You need the damn captions to be edited in order to be able to understand the video. 
  • If you are a professional captioning service provider: One of your key marketing tools for YouTube video captioning is the improved search engine results that come from edited closed captions. What The Young Turks are doing gets them the improved search engine results, so they don't "need" your services.
  • If you are a volunteer captioner or subtitler: You know how important quality captions are, and you care about that. You know that your "customers" appreciate your efforts.
  • If you are a web TV producer who properly captions your web TV show (either by yourself or by paying for the service): You don't like to see competing web TV programs getting an unfair advantage by faking the captioning. You don't like to see another web TV producer cheating viewers.

Special note: Subtitle YouTube wrote that they did not "caption this episode using (the) normal style (with the Amara handguide). (SubtitleYouTube) mimicked the automated captions format, with their odd line breaks and character counts so that (Caption Action 2) could take screen shots clearly (to) illustrate the differences between the bogus captions and correctly done captions."

Join Caption Action 2 on Facebook!

1 comment:

  1. Thanks for bringing the issue to our attention. I'm deaf and yes, I have discovered that automatic captioning on YouTube can be difficult to follow. It's my understanding that Google incorporated the automatic captioning feature to help combat the problem of uploaders who did not bother to make captions available for their videos. The automatic captions are entirely generated by a speech recognition engine and are, of course, nowhere near as accurate as an edited version done by human hand.

    Regarding "fake captioning", I think we should be asking ourselves if it is acceptable to the deaf and hard of hearing communities to allow a machine to generate captions if the uploader didn't bother to upload a transcript with their video.

    I understand the frustration with non-edited captions, but I see this as a perfect opportunity for Google to develop their own speech recognition engine, which *should* improve over time. In an ideal world, the machine will accurately transcribe everything, but we're not there yet - maybe not for another 10 years or so.
