In the age of surveillance capitalism, a market has emerged for messaging platforms offering end-to-end encryption, such as WhatsApp, Wickr and Signal. Meanwhile, the apps used to automatically transcribe conversations have largely flown under the radar.
The AI-powered tools have saved journalists, researchers and investigators an enormous amount of time, sparing them from the burden of taking interview notes, but at what cost?
“We recommend avoiding transcription [apps] altogether if your audio, in the wrong hands, could put people at risk,” the Freedom of the Press Foundation’s report into the apps’ security concluded in January.
The report assessed the five most popular automated transcription apps (Descript, Otter, Rev, Temi and Trint) on the risks they pose to journalists and their sources.
The apps were ranked on their reliance on third parties, their encryption, their transparency and whether they offer two-factor authentication.
No end-to-end encryption
While apps like WhatsApp, Wickr and Signal offer end-to-end encryption, the five most popular automated transcription apps go no further than using AES-256 to protect data stored with Amazon Web Services.
This helps protect the audio from attackers outside the organisation, but because the data is not end-to-end encrypted, the Freedom of the Press Foundation’s report notes, the companies themselves have “the technical ability to access the audio you've uploaded.”
“To provide these services the companies also have the technical ability to read the transcripts the company created and stored.”
Reliance on third parties
The apps differ in how much they rely on third parties for processing, automating and storing data.
Otter told the Freedom of the Press Foundation: “We do not sell or share your data with third parties, nor access your data without your explicit permission.” Its white paper suggests it uses third parties, such as AWS, only to store data and not to process audio or transcripts.
Descript uses Google Cloud Speech-to-Text, and relies on Rev for automatic and human transcription. Google says it deletes data from its servers when the transcription process is finished.
Rev does not rely on third-party companies to automate transcriptions, but does rely on 60,000 freelance manual transcriptionists. Rev said its transcriptionists sign confidentiality agreements and are blocked from downloading customer audio.
Temi does not use human transcriptionists or third-party software to transcribe recordings, but Temi is operated by Rev.
“Files are transcribed by machines and are never seen by a human,” Temi says on its website.
Trint relies on the third-party cloud database MongoDB Atlas and uses Transloadit for uploading and processing files. Transloadit says it deletes data after 24 hours.
No transparency reports
None of the services the Freedom of the Press Foundation evaluated produces transparency reports.
“There’s no way to know how frequently they receive or disclose user data responsive to law enforcement requests,” the report warned.
Limited two-factor authentication
Descript, Trint and Temi do not offer two-factor authentication. Otter offers it only on its Business and Enterprise plans, not on its free accounts. Rev offers it to all its users.
Sharing data with foreign governments?
In addition to the Freedom of the Press Foundation’s report, journalist Phelim Kine raised the alarm in a Politico article about a survey prompt he received from Otter after interviewing a critic of the Chinese government.
While Otter assured the former Beijing-based correspondent that his communications had not been shared with foreign law enforcement, its response fell short of saying this could never happen.
“To be clear, unless we are legally compelled to do so by a valid United States legal subpoena, we will not ever share any of your data, including data files, with any foreign government or law enforcement agencies,” Otter’s public relations manager, Mitchell Woodrow, said in an email on 15 February 2022.
The prompt was sent after Kine interviewed Mustafa Aksu, a US-based Uyghur human rights activist and constant target of phishing attacks. It read:
“Hey Phelim, to help us improve your Otter’s experience, what was the purpose of this particular recording with titled ‘Mustafa Aksu’ created at ‘2021-11-08 11:02:41’?”
Three responses were offered: “Personal transcription,” “Meeting or group collaboration,” and “Other.”
An Otter representative originally denied the prompt was legitimate and told Kine to “not respond to that survey and delete it,” but the company later admitted to sending it to him.
The Electronic Frontier Foundation, a not-for-profit that advocates for journalists’ and activists’ protection against surveillance, verified that Otter sent the survey after analysing Kine’s exchanges with the automated transcription app.
Otter said it was a “Speech Level Survey … requested by our engineering team.”
“For that particular conversation, the conversation’s title included personal information about someone [Aksu’s name], which was referenced within the survey,” Allen Lai, an Otter service team member, told Kine via email on 14 January.
“We want to ensure [sic] you that we are not monitoring your account or content and that we referenced the title of the conversation for you to be able to recall that specific recording.”
Lai told Kine the survey had been discontinued.