The great Book Messe is the autumn opener for me. As showers and brown leaves gusted across the huge fairground, I seized the hour for bier and knackwurst, and contemplated the future in the light of a kindly young woman having stood up on the Pendelbus to allow my ancient self, lame of leg and rheumy of eye, to sit down. Having concluded that this was evidence that the human race has hope, I had another bier.

Yet despite this blip, the highlight of the fair for me was not a book or a party, though plenty of both were in evidence, but a conversation with a group who really knew what they were talking about at one of the International Friday conference sessions. Here, in the Production stream (how very strange it now seems to call data part of “Production”!) we did a session on Discoverability and Metadata. As speakers we had Jason Markos, Director of Knowledge Management at Wiley, to get us started, followed by Timo Hannay, CEO of Macmillan’s Digital Science; Dr Sven Fund, CEO of De Gruyter; and, to keep us technologically honest, Andreas Blumauer, CEO of the Semantic Web Company in Vienna. So, a mass of talent, from whom came massive elucidation of what I take to be a critical developmental issue for STM today and for the rest of the information marketplace tomorrow: the problem of knowledge, and the question of whether, once we have solved the knowledge problem, we ex-publishing groundlings will still be needed.

Jason got us afloat in admirable fashion. As we move from a world of documents and segments of former documents (book, journal and article giving way to lower levels of granularity: abstract, reference, citation), we eventually recognize that entity extraction and text enrichment become ways of interconnecting thoughts and ideas horizontally, in knowledge structures that yield insights which were not effectively available in the years when we were searching documents for word matches. Once we are underlining meaning with a range of possibilities, and allowing semantic web analysis to use knowledge of meaning in context to illuminate the underlying thinking (along with what is not on the page but is implied by what we have read or written), then we are into a Knowledge game which moves past content, beyond data, and into some very new territory.
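To make that less abstract, here is a minimal sketch, in Python and with an invented sentence of sample text, of the two moves Jason was describing: pull the entities out of the prose, then link those that co-occur into a small graph that can be queried horizontally rather than word-matched. It leans on spaCy for the extraction and is an illustration only, not a description of anything Wiley or anyone else actually runs.

```python
# A minimal, hypothetical sketch of entity extraction plus crude enrichment:
# entities become nodes, co-occurrence within a sentence becomes an edge.
import spacy                      # assumes the en_core_web_sm model is installed
from collections import defaultdict
from itertools import combinations

nlp = spacy.load("en_core_web_sm")

text = (
    "Researchers at the University of Vienna compared aspirin with ibuprofen "
    "in a trial funded by the European Commission."
)
doc = nlp(text)

# Entity extraction: each named entity becomes a node, tagged with its type.
nodes = {ent.text: ent.label_ for ent in doc.ents}

# Crude enrichment: entities appearing in the same sentence are linked,
# giving a horizontal structure that sits alongside the original prose.
edges = defaultdict(int)
for sent in doc.sents:
    ents = sorted({ent.text for ent in sent.ents})
    for a, b in combinations(ents, 2):
        edges[(a, b)] += 1

print(nodes)
print(dict(edges))
```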

Companies like Wiley and Macmillan, Elsevier and Springer will exploit this very effectively using their own content. In disciplines like chemistry, building knowledge stores and allowing researchers to archive both effective and failed discovery work will become commonplace. Extended metadata will take us beyond the descriptive, towards recording analytics and following knowledge pathways. People like Timo will create the knowledge-based infrastructure that allows this to become part of the workflow of science. Sven will keep our feet on the ground by ensuring that we do not try to sell concepts before users are ready, and Andreas will help us to master the disciplines of the semantic web. And then, just as I was padding round the audience with a microphone picking up some really interesting questions, our little theatre was over: we could strut and fret no more, and the audience could escape from Frankfurt’s economy drive of the year – no wifi in the conference spaces!

So I was left on the Pendelbus, and under the biergarten tarpaulin, to ponder the impact of all of this. In the self-publishing future, when scholars publish straight to figshare and F1000 does post-publication peer review, the data to be organized will still have to be collected; indeed, current Open Access has already begun the fragmentation. As knowledge structures grow, some scholars will demand that, except in extreme circumstances, they never see primary text, but work only on advanced, knowledge-infused metadata. Further, that metadata will have to cover everything relevant to the topic. Will the lions and lambs of Elsevier and Wiley, Springer and Macmillan, lie down with each other and co-operate, or will external metadata harvesting become a new business, run over the heads of the primary content players? And will it be professional bodies like the ACS or the RSC who do this – or technology players? Knowing where everything is, being able to visualize its relationships with everything else, and being able to organize universal knowledge searching may be a different business from the historical publishing/information model. And the metadata that matters in this context? Who owns it may be a matter of the context in which it is collected. Certainly owning the underlying content would seem to confer very few rights over metadata derived from enquiries and research that included it, yet here I predict future Sturm und Drang as publishers seek to own and control the extension of metadata into the knowledge domain. And if these are the autumnal topics, what can winter be like when it comes?

It was the second afternoon of the last EASDP annual conference, last Friday in Amsterdam. The Big Business of the day was said to be over, in that at their General Council EASDP, representing Europe’s directory publishers, had voted to merge with EIDQ, Europe’s directory enquiries services. Sic transit the glory of the yellow-page players. I was sad – EASDP in its heyday ran some of the most entertaining meetings in Europe. I was happy, since I had lost a night’s sleep en route to Amsterdam and was approaching going-home time. And then the afternoon’s speaker threw a thunderbolt across the stage and rocked us all back in our seats: “You may never visit a native website again!”

The line had added impact in that it came from Phil Cotter, a former CEO of Experian’s B2B services. He was speaking for BIIA and his own consulting interests, and addressing the issues posed by predictive analytics. He skilfully piled up the arguments around a machine-to-machine future: the role of intelligence in the network, the ability to track and map our activities as predicted by the past activities of ourselves and of others like us. And suddenly, all of the chat about behavioural targeting and the future of advertising on the web crumbled into dust for me. The website now becomes a totally different proposition. This is not the display table, advertising-driven, designed to bring users to your goods and services. This is the storehouse of your advanced metadata, and this is the key to your discoverability. Mostly you will get discovered by machines, so you need to be very aware of how to tell them who you are and what you are about, in language they can understand and use. As it happens I am moderating a session at the Frankfurt Book Fair (http://en.book-fair.com/fbf/programme/calendar_of_events/detail.aspx?PageRequestId=6ea4655a-3dd4-4209-872a-fcd3a6240b02&a1850834-d682-44a4-9b98-1ff33a3bcb5c=72b77c9a-c2af-4cca-a94f-268d1d3987ed) where some of the best brains in STM will address this issue: yet Phil makes me realize that this is not just an issue for the advance guard of science and technology publishing: it is about to crash, with frightening speed, on your shore as well.
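What does “telling the machines who you are” look like in practice? One common answer is structured, schema.org-style metadata published alongside the human-facing page. The sketch below, in Python and with entirely invented names, identifiers and dates, builds such a record and serializes it as JSON-LD, which a crawler or agent can act on without ever scraping the display copy.

```python
# A hypothetical sketch of machine-readable self-description: a schema.org
# record for a publication, serialized as JSON-LD. All names, identifiers
# and URLs below are invented for illustration.
import json

record = {
    "@context": "https://schema.org",
    "@type": "ScholarlyArticle",
    "name": "Predictive analytics and the disappearing website",
    "author": {"@type": "Person", "name": "A. N. Example"},
    "publisher": {"@type": "Organization", "name": "Example Press"},
    "datePublished": "2012-10-19",
    "about": ["predictive analytics", "metadata", "discoverability"],
    "identifier": "https://doi.org/10.0000/example",
}

# Embedded in a page as <script type="application/ld+json">...</script>,
# this is metadata a machine can read and act on directly.
print(json.dumps(record, indent=2))
```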

Later in the session, as Phil was explaining the way in which the LAPD use predictive analysis to create patrol patterns for police cars, a hand shot up: “If patterns of crime exist so that you can say where the next lookalike crime will occur, after a few nights the cars will be entirely in the wrong places.” Phil explained gently that this was why the analysis was run every day, and thereby gave me a second insight into what is happening. We are still thinking at our own speed about real-world cycles of change. It does not matter to the machine that we are so slow to process: predictive analysis can be run repeatedly, to catch nuanced change in activity, if that activity is important enough to justify it. Then again, most of the apps that run predictive analysis are going to be lodged, for consumers and for commercial users, on the advanced smartphones of the future. There the emphasis is likely to be upon rapid decision-making in an increasingly time-constrained society. Predictive analysis only needs to be “right enough” to allow a decision to be made.
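By way of illustration only, and with no connection to the LAPD’s actual system, here is a toy Python sketch of what “run every day” means: the model is refitted each morning on a rolling window of invented incident data, so the ranking of where to send the cars moves as the pattern moves, and it only has to be right enough to point them somewhere sensible.

```python
# A toy re-run-it-daily illustration: nothing here reflects any real
# deployment; the incident data is randomly generated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_cells = 50                      # hypothetical grid cells covering a city

def recent_events(days=28):
    """Fake incident counts per cell per day; stands in for a real feed."""
    return rng.poisson(lam=rng.uniform(0.1, 2.0, n_cells), size=(days, n_cells))

def fit_and_rank(history):
    """Fit on everything but the last day, then rank cells by predicted risk."""
    X = history[:-1].T                   # each cell described by its recent counts
    y = (history[-1] > 0).astype(int)    # did the cell see an incident last night?
    model = LogisticRegression(max_iter=1000).fit(X, y)
    risk = model.predict_proba(X)[:, 1]
    return np.argsort(risk)[::-1][:5]    # five highest-risk cells

# "Run every day": each morning the window moves on and the ranking is redone.
for day in range(3):
    top_cells = fit_and_rank(recent_events())
    print(f"day {day}: patrol cells {top_cells.tolist()}")
```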

And, of course, intelligent predictive analytics software is everywhere you look. SAP and SAS have history here; IBM and Oracle have serious offerings; TIBCO and Orange have activity here too. But have a look at WEKA from Waikato in New Zealand (http://www.cs.waikato.ac.nz/~ml/weka) for some fascinating work on machine learning, and kick the tyres of specialist players like Foresee (www.foresee.com) or Absolutdata.com. This is a fast-changing world, and the time between research lab and application grows ever shorter. Meanwhile, I heard a good interview on the radio last week. An independent television producer was complaining that the advertising muscle of major agencies like WPP was being used to compel the co-financing of the TV they wanted: no shared deal equals no advertising was the implication. And we were expected to disapprove of the power of advertising being used in this way. But what if the agencies simply have a realistic view of the future of advertising and want their business to migrate to different places in the value chain? They will discover in time that content production is not the route to riches, but maybe they have already worked out that advertising is unlikely to go to the networks without being wholly changed – by predictive analysis, by recommendation engines, by community buying and countless other network-driven expedients. Once again, power in the network migrates to the user.

Then Laurie Kaye, now at Shoosmiths as their lead man in media legal pyrotechnics, came on stage and told us about the “right to be forgotten”. Not a good day for advertising and lead generation – at a conference dedicated to advertising-based directories and marketing services. The world is moving too fast to allow for the real-time recalibration of the trade associations.
