Automated writing

Back in 2014, the Los Angeles Times published a report about an earthquake three minutes after it happened. This feat was possible because a staffer had developed a bot (a software robot) called Quakebot to write automated articles based on data generated by the US Geological Survey. Today, AIs write hundreds of thousands of the articles that are published by mainstream media outlets every week.

At first, most of the Natural Language Generation (NLG) tools producing these articles were provided by software companies like Narrative Science. Today, many media organisations have developed in-house versions. The BBC has Juicer, the Washington Post has Heliograf, and nearly a third of the content published by Bloomberg is generated by a system called Cyborg. These systems start with data – graphs, tables and spreadsheets. They analyse these to extract particular facts which could form the basis of a narrative. They generate a plan for the article, and finally they craft sentences using natural language generation software.
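The four stages described above (start with data, extract facts, plan the article, craft sentences) can be illustrated with a toy sketch. This is not how Heliograf, Cyborg or Quakebot are actually built. The field names, the templates and the sample record are all made up for illustration, and real systems use far richer language generation than fixed templates.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    topic: str      # what the fact is about, e.g. "magnitude"
    value: object   # the raw value taken from the data
    priority: int   # lower number = reported earlier in the article


# Hypothetical sentence templates, one per fact topic.
TEMPLATES = {
    "magnitude": "A magnitude {value:.1f} earthquake was recorded.",
    "location": "The epicentre was near {value}.",
    "depth_km": "It occurred at a depth of {value} km.",
}

# Hypothetical editorial ranking of which facts matter most.
PRIORITIES = {"magnitude": 0, "location": 1, "depth_km": 2}


def extract_facts(record: dict) -> list[Fact]:
    """Stage 2: pick out the fields that could form part of a story."""
    return [
        Fact(topic, record[topic], rank)
        for topic, rank in PRIORITIES.items()
        if record.get(topic) is not None
    ]


def plan_article(facts: list[Fact]) -> list[Fact]:
    """Stage 3: decide the order in which the facts are reported."""
    return sorted(facts, key=lambda fact: fact.priority)


def realise(plan: list[Fact]) -> str:
    """Stage 4: turn each planned fact into a sentence."""
    return " ".join(TEMPLATES[f.topic].format(value=f.value) for f in plan)


def write_article(record: dict) -> str:
    """Run the whole pipeline on one structured data record (stage 1)."""
    return realise(plan_article(extract_facts(record)))


# Made-up sample record, not real seismic data.
print(write_article({"magnitude": 4.4, "location": "Exampleville", "depth_km": 8}))
```

Keeping extraction, planning and sentence generation as separate steps means that a missing field simply drops out of the article instead of breaking it.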

These systems can only produce articles where highly structured data is available as an input, such as video of a football match, or spreadsheet data from a company’s annual return. They cannot write articles with flair, imagination, or in-depth analysis. As a result, they have not rendered thousands of journalists redundant. Instead, they have sharply increased the number of niche articles being written.

As Kenn Cukier, senior editor at The Economist, puts it: “We can’t be precious about this: it’s about what is best for the public, not what is best for journalists. We didn’t cling to the quill in the age of the typewriter, so we shouldn’t resist this either. It’s a scale play serving niche markets that wouldn’t be cost-effective to reach otherwise.”

When computers have opinions

Cait O’Riordan is a former BBC journalist, and is now chief product and information officer at the Financial Times. She is adamant that article-generating systems will not replace human journalists in the foreseeable future: “human audiences want to read opinion and analysis, not just structured data processed by an algorithm”. She does concede that systems like IBM’s Project Debater generate a pretty good simulation of an opinion. She argues that “there’s already too much opinion on offer”, but accepts that in the future, “IBM-generated opinion could become significant.” Martin Wolf and Yuval Harari are unlikely to be dethroned by these systems, but there is a genuine question about whether new thought leaders will find it harder to get established.

Another interesting possibility is that articles may become tailored for particular niche audiences, and ultimately for each of us individually. For instance, an announcement by a research organisation that inflating your car tyre correctly could reduce your spend on petrol by 7% could be tailored to take into account your particular car, the number of miles you drive each week, and even your style of driving, assuming that you and your car have made that information accessible. “The Daily Me” is an idea that has done the rounds for years, and it exists in rudimentary form in the shape of Google News and similar services, which curate the selection of articles that you see. The idea of individual articles being tailored at scale to individual readers remains in the future.
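The arithmetic behind the tyre example can be sketched in a few lines. This is an illustration only: the 7% figure comes from the hypothetical announcement above, the reader profile fields are invented, and driving style is left out because the example gives no figure for it.

```python
def weekly_fuel_saving(miles_per_week: float,
                       miles_per_gallon: float,
                       price_per_gallon: float,
                       saving_rate: float = 0.07) -> float:
    """Estimate the weekly saving from a given percentage cut in fuel spend."""
    weekly_spend = miles_per_week / miles_per_gallon * price_per_gallon
    return weekly_spend * saving_rate


def tailored_sentence(reader: dict) -> str:
    """Rewrite the generic claim using one reader's own figures."""
    saving = weekly_fuel_saving(
        reader["miles_per_week"],
        reader["miles_per_gallon"],
        reader["price_per_gallon"],
    )
    return (
        f"Keeping the tyres on your {reader['car']} correctly inflated "
        f"could save you about £{saving:.2f} a week on petrol."
    )


# Hypothetical reader profile; every value here is made up.
reader = {
    "car": "hatchback",
    "miles_per_week": 200,
    "miles_per_gallon": 40,
    "price_per_gallon": 7.0,
}
print(tailored_sentence(reader))
```

The same generic claim yields a different sentence for each reader, which is the essence of tailoring at the level of the individual article.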

AIs spotting trends and bias

Another way the FT is using AI is to spot trends within economies and within markets. The FT’s own content lends itself to this kind of analysis, and it is looking at using other data sources too.

The FT is also using bots to help it reduce the bias within its own output. A system called Janetbot – named after Janet Yellen, former chair of the Federal Reserve – analysed the gender ratio of the faces appearing in the paper. It was trained on an industry-wide sample of photos, but its use had to be paused earlier this year, when it became apparent the sample was not racially diverse enough to detect gender reliably.

Subscription model

The FT, of course, is an unusual newspaper. It was one of the first to introduce a paywall successfully, when it became apparent that the old model of display and classified ads was not going to work so well in the new digital world. In recent years, it has moved heavily towards the subscription revenue model. The FT’s Innovation Editor, John Thornhill, relates the adage that there are two kinds of people who will pay for accurate news: investors and spies. The former want to avoid investing in the wrong company, and the latter want to avoid ordering a drone strike on the wrong target. He assumes it is mainly the former who subscribe to the FT, but you never know with spies…

Because of the importance of accuracy, the FT is always on the lookout for new ways to spot and correct errors. Its systems track reader engagement with articles, and monitor readers’ feedback. Thornhill says this is like having an AI mark your homework in real time: valuable, if a little scary. But the best error detection protocol remains crowdsourcing – the hive mind of FT readers. “If there’s a mistake in one of our articles, they pick it up in minutes.”

Click-bait

Another, very different type of news media which has been successful in the digital world is exemplified by Buzzfeed and MailOnline. The latter is the online version of the Daily Mail; it is reckoned to be the most-read English-language newspaper website in the world, but it has been banned as a reference source by Wikipedia. These sites are based on programmatic advertising: they respond to data about readers’ appetites with “listicles” and articles about celebrities, which prompt readers to click on Facebook ads and generate cost-per-click (CPC) revenues. Whole media empires have been built on figuring out how to tickle Facebook’s erogenous zones.

But this is a precarious business. If you generate all your content by responding to data, you risk getting stuck in yesterday’s news, with all your articles being about the Kardashians, Trump, and the virus. You miss the quirky, the creative, and the original – as if Hollywood were to fall back on making endless Avengers movies. Oh…

You are also hostage to Facebook, which frequently changes its CPC algorithms to prevent them being gamed. And post-Cambridge Analytica, Facebook is sharply reducing its reliance on news as a way to win and keep its audiences, and focusing more on user-generated personal news and gossip.

Marketing analytics and RPA

Like any well-run company that intends to remain in business, the FT is using data analytics to improve the way it sells. In the past, a student in a Birmingham bedsit would receive the same subscription offer as the CEO of a tech giant in Silicon Valley. Today the company uses machine learning to experiment with offering more appropriate packages to each. And in any large organisation, there are costs to be saved by automating various tasks and activities. Robotic Process Automation (RPA) can speed up and improve the production of market reports, the fleshing out of customer databases, and the handling of responses to customer enquiries. Giving robotic jobs to robots is the bread and butter of management today.
