Text Analytics Meets Publishing

I’ve been writing about text analytics for a number of years now. Many of my blog posts have covered survey findings and vendor offerings in the space. I’ve also described a number of use cases for text analytics, many of which revolve around voice of the customer, market intelligence, e-discovery, and fraud. While these are all extremely valuable, there are other beneficial use cases for the technology, and I believe it is important to put them out there, too.

Last week, I spoke with Daniel Mayer, a product marketing manager at TEMIS, about the publishing landscape and how text analytics can be used in both the editorial and the new product development parts of the publishing business. It’s an interesting and significant use of the technology.

First, a little background. I don’t believe it comes as a surprise to anyone that publishing as we used to know it has changed dramatically. Mainstream newspapers and magazines have given way to desktop publishing and the Internet as economics have changed the game. Chris Anderson wrote about this back in 2004, in Wired, in an article he called “The Long Tail” (it has since become a book). Some of the results include:

  • Increased competition.  There are more entrants, more content, and more choice on the Internet, and much of it is free.
  • Mass market vs. narrow market.  Whereas the successful newspapers and magazines of the past targeted a general audience, the Internet makes publications with narrower appeal economically viable.
  • Social, real time.  Social network sites, like Twitter, are fast becoming an important source of real-time news.

All of this has caused mainstream publishers to rethink their strategies in order to survive.  In particular, publishers realize that content needs to be richer, interactive, timely, and relevant.

Consider the following example. A plane crashes over a large river, close to an airport. The editor in charge of the story wants to write about the crash itself, and also wants to enrich the story with historical information about the causes of plane crashes (e.g., time of year, time of day, equipment malfunction, pilot error, etc., based on other plane crashes over the past 40 years). Traditionally, publishers have annotated documents with keywords and dates. Typically, this was a manual process, and not all documents were thoroughly tagged. Past annotations might not meet current expectations. Even if the documents were tagged, they might have been tagged only at a high level (e.g., plane crash), so that the editor is overwhelmed with information. This means that it might be very difficult for her to find similar stories, much less analyze what happened in other relevant crashes.

Using text analytics, all historical documents could be mined for relevant entities, concepts, and relationships to create a much richer annotation scheme. Information about each plane crash, such as location, types of planes involved, dates, times, and causes, could be extracted from the text. This information would be stored as enriched metadata about the articles and used when needed. The Luxid Platform offered by TEMIS would also suggest topics close to the given topic. What does this do?

  • It improves the productivity of the editor.  The editor has a complete set of information that she can easily navigate.  Additionally, if text analytics can extract relationships such as cause, these can be analyzed and used to enrich a story.
  • It provides new opportunities for publishers.  For example, Luxid would enable the publisher to provide the consumer with links to similar articles or set up alerts when new, similar content is created, as well as tools to better navigate or analyze data (this might be used by fee-based subscription services).  It also enables publishers to create targeted microsites and topical pages, which might be of interest to consumers.
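
To make the enrichment idea above a little more concrete, here is a minimal sketch in Python. It is not how Luxid or any TEMIS product works; the cause vocabulary, the `Annotation` class, and the `annotate` and `similar_articles` functions are all hypothetical names invented for illustration. It tags each article with the years and crash causes mentioned in its text, then uses those tags to find archive stories that share a cause with a new story.

```python
import re
from dataclasses import dataclass, field

# Hypothetical cause vocabulary. A real text analytics system would rely on
# trained extractors or a curated ontology, not a hand-written phrase list.
CAUSE_PHRASES = {
    "pilot error": ["pilot error"],
    "equipment malfunction": ["engine failure", "equipment malfunction"],
    "weather": ["storm", "icing", "fog"],
}

# Four-digit years in the 1900s or 2000s.
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}\b")


@dataclass
class Annotation:
    """Enriched metadata extracted from one article."""
    years: set = field(default_factory=set)
    causes: set = field(default_factory=set)


def annotate(text):
    """Extract years and crash causes from the article text."""
    lowered = text.lower()
    years = set(YEAR_PATTERN.findall(text))
    causes = {
        cause
        for cause, phrases in CAUSE_PHRASES.items()
        # Whole-word matching, so "icing" does not fire on "pricing".
        if any(re.search(r"\b" + re.escape(p) + r"\b", lowered) for p in phrases)
    }
    return Annotation(years=years, causes=causes)


def similar_articles(target_text, archive):
    """Return titles of archive articles that share a cause with the target,
    ordered by number of shared causes, then alphabetically by title."""
    target_causes = annotate(target_text).causes
    scored = []
    for title, text in archive.items():
        overlap = len(target_causes & annotate(text).causes)
        if overlap:
            scored.append((overlap, title))
    return [title for _, title in sorted(scored, key=lambda s: (-s[0], s[1]))]


# Made-up archive entries, purely for illustration.
archive = {
    "A": "In 1989 a jet crashed after engine failure in heavy fog.",
    "B": "The 2001 inquiry blamed pilot error.",
    "C": "Airline profits rose.",
}
new_story = "Investigators suspect engine failure in the river crash."
print(similar_articles(new_story, archive))
```

Even this crude version shows the point of the approach: once causes and dates live in metadata rather than buried in prose, "find me similar crashes" becomes a simple lookup instead of a manual search through the archive.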

Under many current schemes, advertisers pay online publishers. Enhancing navigation means more visits, more page views, and a more focused audience, which can lead to more advertising revenue for the publisher. In some cases, publishers are trying to go even further by transforming readers into sales leads and receiving a commission on sales. There are other models that publishers are exploring as well. Additionally, text analytics could enable publishers to repackage content on the fly (called content repurposing), which might lead to additional revenue opportunities, such as selling content to brand sponsors that might resell it. The possibilities are numerous.

I am interested in other compelling use cases for the technology.

1 thought on “Text Analytics Meets Publishing”

  1. Thanks for your insight, Fern.

    Text analytics and publishing is a great fit and I think your revenue opportunity point is timely given recent discussions around the future of online content and Rupert Murdoch’s push to sell newspaper content.

    Publishers must be thinking creatively these days.

