So have we all got it now? When our industry (the information services and content provision businesses, sometimes erroneously known as the data industry) started talking about something called Big Data, it was self-consciously re-inventing something that Big Science and Big Government had known about and practised for years. Known about and practised (especially in Big Secret Service; for SIGINT see the foot of this article) but worked upon in a “finding a needle in a haystack” context. The importance of this only revealed itself when I found myself at a UK Government Science and Technology Facilities Council event at the Daresbury Laboratory in the north of England earlier this month. I went because my friends at MarkLogic were one of the sponsors, and spending a day with 70 or so research scientists gives more insight into customer behaviour than going to any great STM conference you may care to name. I went because you cannot see the centre until you get to the edge, and sitting amongst perfectly regular normal folk who spoke of computing in yottaflops (processing speeds of 10 to the power of 24 operations per second) as if they were sitting in a laundromat watching the wash go round is fairly edgy for me.

We (they) spoke of data in terms of Volume, Velocity and Variety, sourced from the full gamut of output from sensor to social. And we (I) learnt a lot about the problems of storage which went well beyond the problems of a Google and a Facebook. The first speaker, from the University of Illinois, at least came from my world: Kalev Leetaru is an expert in text analytics and a member of the Heartbeat of the World Project team. The Great Twitter Heartbeat ingests Twitter traffic, then sorts and codes it so that US citizens going to vote, or Hurricane Sandy respondents, can appear as geographical heatmaps trending in seconds across the geography of the USA. The SGI UV which did this work (it can ingest the printed resources of the Library of Congress in 3 seconds) linked him to the last speaker, the luminous Dr Eng Lim Goh, SVP and CTO at SGI, who gave a magnificent tour d’horizon of current computing science. His YouTube videos are as wonderful as the man himself (a good example is his 70th birthday address to Stephen Hawking, his teacher, but also look at http://www.youtube.com/watch?v=zs6Add_-BKY). And he focussed us all on a topic not publicly addressed by the information industry as a whole: the immense distance we have travelled from “needle in a haystack” searching to our current pre-occupation with analysing the differences between two pieces of hay – and mapping the rest of the haystack in terms of those differences. For Dr Goh this resolves to the difference between arranging stored data as a cluster of nodes and working in shared memory (he spoke of 16 terabyte supernodes). As the man with the very big machine, his problems lie in energy consumption as much as anything else. In a process that seems to create a workflow that goes Ingest > Store and Organize > Analytics > Visualize (in text and graphics – like the heatmaps) the information service players seem to me to be involved at every point, not just the front end.

The largest data sourcing project on the planet was represented in the room (the SKA, or Square Kilometre Array, is a remote sensing telemetry experiment with major sites in Australia and South Africa). Of course, NASA is up there with the big players, and so are the major participants in cancer research and human genomics. But I was surprised by how Big the Big Data held by WETA Data (look at all the revolutionary special effects research at http://www.wetafx.co.nz/research) in New Zealand was, until I realised that this is a major film archive (and NBA Entertainment is up there too on the data A List). This reflects the intensity of data stored from film frame images and their associated metadata, now multiplied many times over in computer graphics-driven production. But maybe it is time now to stop talking about Big Data, the term which has enabled us to open up this discussion, and begin to reflect that everyone is a potential Big Data player. However small our core data holding may be compared to these mighty ingestors, if we put proprietary data alongside publicly sourced Open Data and customer-supplied third party data, then even very small players can experience the problems that induced the Big Data fad. Credit Benchmark, which I mentioned two weeks ago, has little data of its own: everything will be built from third party data. The great news aggregators face similar data concentration issues as their data has to be matched with third party data.

And I was still thinking this through when news came of an agreement signed by MarkLogic (www.marklogic.com) with Dow Jones on behalf of News International this week. The story was covered in interesting depth at http://semanticweb.com/with-marklogic-search-technology-factiva-enables-standardized-search-and-improved-experiences-across-dow-jones-digital-network_b33988 but the element that interested me, and which highlights the theme of this note, concerns the requirement not just to find the right article, but to compare articles and demonstrate relevance in a way which only a few years ago would have left us gasping. Improved taxonomic control, better ontologies and more effective search across structured and unstructured data lie at the root of this, of course, but do not forget that good results at Factiva now depend on effective Twitter and blog retrieval, and effective ways of pulling back more and more video content, starting with YouTube. The variety of forms takes us well beyond the good old days of newsprint, and underlines the fact that we are all Big Data players now.

Note: Alfred Rolington, formerly CEO at Jane's, will publish a long-awaited book with OUP on “Strategic Intelligence in the Twenty First Century” in January, which can be pre-ordered on Amazon at http://www.amazon.co.uk/Strategic-Intelligence-21st-Century-Mosaic/dp/0199654328/ref=sr_1_1?s=books&ie=UTF8&qid=1355519331&sr=1-1. And I should declare, as usual, that I do work from time to time with the MarkLogic team, and thank them for all they have done to try to educate me.

Is that an early Christmas Carol of Consolidation and/or Consolation I hear in the air? As CBS/Simon and Schuster Books prepares to surrender to the breathless embrace of that ardent wooer, Rupert Murdoch (HarperCollins), the UK is entranced by the appearance of David Montgomery as the saviour of the regional press. Despite the remarks made here in “Monty’s Flagging Circus” two weeks ago, it seems only fair to warn the brave man of the possible pitfalls that lie ahead and give him any advice and guidance that may be available. Media casualties help no one, and people like me who have spent a lifetime in media should do more than hop up and down on the sidelines prophesying doom. So here goes:

Dear David (everyone, me included, seems to call you Monty without ever asking, so I will try to be more correct in future),

Congratulations on the launch of the Local World business, and upon your statement re-emphasizing your belief that people will always want local news and information. I have written about your intentions since they were first rumoured, but since those statements might be seen as a bit negative, I wanted to write to you publicly to say that I wish you every success, and would like to contribute something of my own amongst the more tangible contributions of your other stakeholders. You see, in 1996 I played a role as a strategic consultant in helping Trinity, Newsquest and Northcliffe to establish a joint web branding for local content called “This is…”, and, experimentally, and based in my own offices, began work on developing a service for concentrating all of the regionals' classified advertising called ADHunter, directed by Marlen Roberts of Northcliffe. A year or so later this was relaunched in Hammersmith, by a brilliant manager called Jonathan Turpin, as Fish4…Homes, Cars, Jobs etc. It still exists, owned now by Trinity Mirror. From its inception and for the next four years, I was its non-executive chairman, refereeing a board of directors composed of the CEOs of each of the major UK local newspaper groups, who were the shareholders and content contributors. Johnston Press joined twice – but also left twice. Sometimes the CEOs did not show: how well I recollect a substitute turning up for one of them, and volunteering, just after the minutes had been signed, “My mandate for this meeting is to say ‘NO’”!

I rehearse this escapade on the nursery slopes of British attempts to get the media to respond to a networked world simply to say that I have some knowledge and sympathy for the world through which you are now moving. But I started this letter to offer 5 points of advice. Here they are:

1.  Investors. They are your worst enemy. Having investors who want a return and don’t mind how you get it is one thing: having investors who want results, but not results that deteriorate the quality of their other businesses, is really tough. Is London Local as far as your investors are concerned? Will Trinity Mirror compete with what you do? Boards that cannot make decisions make chaos, and if you were then to get Newsquest, or even Johnston or Archant, to invest in you, you would compound the rivalry, suspicion and eventual stalemate.

2.  Editors are a real liability when it comes to change. They are above all committed to the “push” world. They want to select and define. But you cannot let that happen, since, online, you cannot define “local”. Do you mean my village, this town, this suburb, this county or, indeed, this region? People define local for themselves, and “pull” it to their access point. While I agree that we all want local news and information, you have to provide an interface through which they can focus – on a smartphone, or a tablet, but certainly not primarily on paper.

3.  Journalists are too expensive. Many, if not most, of your stories will cover local football, the Women’s Institute meeting or the town council. Look at the way in which excellent artificial intelligence software is now formatting and templating factual input and archived recall to create the news: a prime example is www.narrativescience.com which builds automated stories for newspapers and B2B magazines. Save your journalists for so-called investigative reporting where you can make an impact; once the editors have gone, the journalists diminished and printing severely cut back to a national centre, you may come by a cost base that suits the circumstances in which you now find yourself.

4.  Relaunch as an online service. Call it LocalWorld if you like. Allow users to set their own limits, by content subject as well as geography. Make it a content experience that people will pay for and add their own content to it – and they will – not an advertising experience that delays and distracts them. Make it Local Google with no ads: and, as Google gets into predictable difficulties as a local provider, use your increasingly trusted pure content brands (I know you will use the old newspaper brands in the background to suggest this trust) for lead generation and customer referral. Get it right and you could end up with a local community presence, under the radar of Facebook. Make local a place to go for education, or to recommend (and then) buy eBooks or music if you like, but not for conventional click-through advertising. But your investors must give you time to sort this.

5.  Watch the winners and losers. At the moment Axel Springer and Schibsted are gaining ground with a pure digital classifieds play. Could work for you, but Trinity wouldn’t like it. Keep content and classifieds apart though – they represent different channels in a networked world. The terror to be avoided at all costs is trying to drag the newspaper online and make it work in trad business model terms. Time to turn off the life support systems: people do want local news – but they want it on their own terms.

Oh, yes. And keep having lunch with that nice Ashley Highfield chap over at Johnston Press. When you get a technology focus which does for local news what his iPlayer did for Broadcast television, then you and he will want to proliferate it as widely as you can across the localities of Britain, and shared tech investment makes more sense than competing standards. All this can be done, but not of course if the business plan is to simply cut costs and reheat the margins of existing newspapers ahead of their eventual obliteration. The newspaper at Manassas Junction shuttered last week, despite being saved by Warren Buffett, no less. Let’s make local work, but let’s make it work on the terms that local people want.

Best wishes for your new venture.

David Worlock