Feb 24
Outsourcing My Brain II
Filed Under B2B, Big Data, Blog, data analytics, Industry Analysis, internet, Publishing, Search, semantic web, STM, Uncategorized, Workflow | 1 Comment
Now, are you ready for this? I am not sure that I am, but I feel honour-bound, having started a discussion last month under this heading about the Internet of Things/Internet of Everything (IoT/IoE), to finish it by relating it back to the information marketplace and the media and publishing world. It is easy enough to think of the universal tagging of the world around us as a revolution in logistics, but surely its only effect cannot be to speed the Amazon drone ever more rapidly to our door? Or to create a moving map of a battlefield which relates what we are reading about in a book to all of the places being mentioned as we turn the pages? Or to create digital catalogues in which every tagged book can report its own position and availability?
You are right: there must be more to all of this. So let us start where we are now and move forward with the usual improbable claims that you expect to read here. Let's begin with automated journalism and authorship, which was in its infancy when I wrote here about the early work of Narrative Science and the Hanley Wood deal; since then Automated Insights has arrived with its Wordsmith package (automatedinsights.com). Here, it seemed to me, were the first steps in replacing the reporter who quarries the story from the press release with a flow of standardised analytics which could format the story and reproduce it in the journal in question just as if it had been laboriously crafted by Man. The end result is a rapid change in the newspaper or magazine cost base (and an extension to life on Earth for the traditional media?).
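To make the mechanism concrete, here is a minimal sketch of the template-driven story generation that Wordsmith-style systems automate at scale. The data record, field names and wording are my own illustrative assumptions, not the actual Automated Insights product:

```python
# Hypothetical illustration of automated earnings reporting: one
# structured data record in, one formatted news paragraph out.

quarterly_result = {
    "company": "Example Corp",
    "revenue": 412.0,          # $m, current quarter
    "revenue_prior": 388.5,    # $m, same quarter a year ago
    "eps": 1.42,               # earnings per share, actual
    "eps_expected": 1.35,      # consensus estimate
}

def earnings_story(d: dict) -> str:
    """Turn a structured earnings record into a readable paragraph."""
    growth = (d["revenue"] - d["revenue_prior"]) / d["revenue_prior"] * 100
    beat = "beating" if d["eps"] > d["eps_expected"] else "missing"
    return (
        f"{d['company']} reported revenue of ${d['revenue']:.1f}m, "
        f"{'up' if growth >= 0 else 'down'} {abs(growth):.1f}% year on year, "
        f"with earnings of ${d['eps']:.2f} per share, {beat} the "
        f"${d['eps_expected']:.2f} consensus estimate."
    )

print(earnings_story(quarterly_result))
```

Multiply that template by every listed company and every reporting quarter and the economics of routine reporting change completely.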
I no longer think this will be the case. As with the long history of the postponed glories of Artificial Intelligence itself, by the time fully automated journalism arrives, most readers will be machines as well as most writers, in fields as diverse as business news and sports reporting and legal informatics and diagnostic medicine and science research reporting. Machine 2 Me will be rapidly followed by real M2M – Machine to Machine. The question then sharpens crudely: if the reporting and analysis is data-driven and machine-moderated, will “publishing” be an intermediary role at all? Or will it simply become a data analysis service, directed by the needs of each user organisation and eventually each user? So the idea of holding content and generalizing it for users becomes less relevant, and is replaced by what I am told is called “Actionable Personalization”. In other words, we move rapidly from machine-driven journalism to personalised reporting which drives user workflows and produces solutions.
Let’s stumble a little further along this track. In such a deeply automated world, most things that retain a human touch will assume a high value: because of their rarity, perhaps, or sometimes because of the eccentric ability of the human brain to retain a detail that fails the jigsaw test until it can be fitted into a later picture. We may need only a few analysts of this type, but their input will be of critical value. Indeed, the distinguishing factor between suppliers may not be the speed or capacity or power of their machinery, but the value of their retained humans, who have the erratic capacity to disrupt the smooth flow of analytical conclusion – retrospectively. Because we must remember that the share price or the research finding or the analytic comparison will have been folded into the composite picture, and adjustments made, long before any human has had time to actually read it.
Is all this just futurizing? Is there any evidence that the world is beginning to identify objects consistently with markers which will enable a genuine convergence of the real and the virtual? I think that the geolocation people can point to just that happening in a number of instances, and not just to speed the path of driverless cars. The so-called BD2K (Big Data to Knowledge) initiatives feature all sorts of data-driven development around projects like the Neuroscience Information Framework. Also funded by the U.S. government, the GenBank initiatives and the development of the International Nucleotide Sequence Database Collaboration point to a willingness to identify objects in ways that combine processes on the lab workbench with the knowledge systems that surround them. As so often, the STM world becomes a harbinger of change, creating another dimension to the ontologies that already exist in biomedicine and the wider life sciences. With the speed of change steadily increasing, these things will not be long in leaving the research bench for a wider world.
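The identifier point is easy to make concrete: a GenBank accession number is a stable public marker that any machine (or person) can resolve to the underlying record. A minimal Python sketch, assuming only NCBI’s public E-utilities EFetch endpoint; the accession used is just an example:

```python
# Resolve a GenBank accession to its sequence record via NCBI EFetch.
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def fetch_genbank_fasta(accession: str) -> str:
    """Return the FASTA record for a GenBank nucleotide accession."""
    query = urlencode({
        "db": "nuccore",      # the nucleotide database
        "id": accession,
        "rettype": "fasta",   # plain-text FASTA format
        "retmode": "text",
    })
    with urlopen(f"{EFETCH}?{query}") as resp:
        return resp.read().decode()

# Example: NM_000518, the human beta-globin (HBB) mRNA reference sequence
print(fetch_genbank_fasta("NM_000518")[:200])
```

The same pattern, a persistent identifier resolving to a machine-readable object, is what universal tagging promises for things well beyond the laboratory.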
Some of the AI companies that will make these changes happen are already in movement, as the recent dealings around Sentient (www.sentient.ai) make clear. Others are still pacing the paddock, though new players like Context Relevant (www.contextrelevant.com) and Scaled Inference (https://scaledinference.com) already have investment and valuations which are comparable to Narrative Science. Then look at the small fast-growth players – MetaMind, Vicarious, Nara or Kensho – or even Mastodon C in the UK – to see how quickly generation is now lapping generation. For a decade it has been high fashion for leading market players in information marketplaces to set up incubators to grow new market presence. We who have content will build tools, they said. We will invest in value-add in the market and be ready for the inevitable commoditization of our content when it occurs. They were very right to take this view, of course, and it is very satisfying to see investments like ReadCube in the Holtzbrinck/Digital Science greenhouse, or figshare in the same place, beginning to accelerate. But if, as we must by now suspect, the next wave to crash on the digital beach is bigger than the last, then some of these incubations will get flooded out before they reach maturity. Perhaps there has never been a time at which it was more important to keep one eye fixed six months ahead and the other three years out. The result will be a cross-eyed generation, but that may be the price for knowing when to disinvest in interim technology that may never have time to flower.
Feb 4
Barely a Whimper
Filed Under B2B, Big Data, Blog, data analytics, Financial services, healthcare, Industry Analysis, internet, Search, semantic web, Uncategorized, Workflow | 1 Comment
British public policy on data availability for commercial re-use died a sad, whimpering, undignified death yesterday. No one noticed. Years of political neglect, and masterful inactivity by the civil service, meant that it had long since ceased to be a public topic. The idea, born in the Community in Brussels, enshrined in European Directives, cleverly led by the Brits in terms of passing secondary legislation and apparently wanting to be best in class at data sharing (how often do we see that the way to best inhibit change is to assume its leadership!), probably fell mortally ill some years ago, when the current UK coalition government assumed office, but we only really woke up to the reality yesterday, when the government abolished APPSI – the Advisory Panel on Public Sector Information – to mark the fact (http://www.parliament.uk/business/publications/written-questions-answers-statements/written-statement/Commons/2015-02-03/HCWS245/). The idea thus engendered – that the swiftest way to move our society into the inevitability of the networked world is for government to share data with the private sector, to stimulate national information industries by so doing, and to reap a result in wealth creation, employment and a widening tax base – is accepted from the US to China. (Think only of the People’s Bank working with Alibaba et al to keep non-Chinese credit rating at bay.) It is widely accepted in Europe, and many countries have now moved past the UK in this area. Sadly, poor old Britain, caught between blind politicians of all shades and civil servants who saw information as power – to be retained at all costs – has lost out on all fronts.
The bitterness here is personal. As a campaigner and lobbyist I fought the good fight for a decade to get the European legislation through, and when it passed I thought, as I took a seat as a founder member of APPSI, that the battle was largely done. How foolish was that! The high ground of British public information was then – and still is – in the hands of state-owned monopolies whose hugely restrictive licences and high fees have proved a barrier to letting a hundred information flowers bloom, let alone a thousand. The UK has the energy – go and look at the thousands of digital start-ups in Shoreditch if you doubt it – and it has the financial investment muscle. But the catalytic element – being able to mash cheaply available, easily licensed data with third-party and proprietary content – is wholly missing. And why is that? Because we have Ordnance Survey, HM Land Registry and countless other public monopolies which enjoy the protection of the Treasury and of departments of state still seeking to flesh out a power base and to avoid financing the collection and re-use of public information – that is, information collected at the taxpayer’s expense to perform a public duty enshrined in statute – in a proper networked-world manner. When the history is written it will be found that Ordnance Survey on its own has been one of the greatest barriers to change in this sector. And if you think this overstates the issue, reflect that this country, which is about to license fracking for shale gas amidst fears of subsidence, still does not universally license the data which would show you whether your home was in danger of subsidence from historical coal mining. For that, you must go to an office in the North of England, pay a ludicrous fee, sit in a search room armed only with paper and pencil and have a search done. And this useless monopoly is a fiefdom of BIS, Britain’s department for Business!
But do not, whatever you do, blame BIS. They are highly attuned to the importance of public data. So highly attuned that when they privatized the British Post Office in this government, they sold it with the databases containing the post (Zip) codes within it. Sold this data, unpriced, not as a separate entity but lumped in with the postboxes and the delivery vans. Turned a public monopoly into – a private one. No special conditions attached to licensing. And then said afterwards that they did not realise what they had done! With people like this in charge of public policy, an Advisory Panel was certainly redundant. Quite superfluous. Almost embarrassing. Just think what such a panel would have made of the privatization of HM Land Registry. This was halfway down the slipway last year when the politicians lost their bottle, so it was withdrawn at the last moment. Given the way these public guardians treat public data, it would have made little difference if that had become a private entity as well, but at least in that instance there was a chance of defining access conditions and securing standard licensing terms in the course of its change of status. As it is, the Shareholder Executive (designed to protect the public equity in these agencies – why does government always work in the complete opposite direction to its intention?) and the old villain, HM Treasury, work brilliantly together to fend off the public interest and preserve what once was in the face of what might be.
By now you think you are listening to a lunatic on a cold night shouting at the moon. So let me end with the sensible voice of the leading and authoritative academic commentator in this field, Bob Barr: “In the UK we have ended up with a lazy, counter-productive, business model based on holding public data hostage wherever possible, maximising the short term return from users that can derive the highest value, and pay the highest price, and diligently preventing the maximization of use in order to protect the monopoly rents that the high value users will pay.
However cogent, or otherwise the arguments from APPSI, ODUG and many lobbying and pressure groups before us, and no doubt after us, have been, the hidden hands of the Treasury, and its wicked offspring in ShEx (the Shareholder Executive), have prevailed in private. Their arguments are never publicly articulated. They do not publicly assess the costs and benefits of their approach across the economy as a whole but they have the ear of ministers in governments of all colours, no doubt egged on by the custodians of the data. This advice appears to count for much more than any academic, expert advisory or consultancy arguments articulated in public.”
Now government, which never implemented the data policy and has no policy of its own except no change, has shut down the source of advice. Not just a sad day for this debate, but a sad one for entrepreneurial Britain.