Oct
18
…And thence, to Frankfurt
Filed Under Big Data, Blog, data analytics, healthcare, Industry Analysis, internet, Publishing, Reed Elsevier, Search, semantic web, STM, Workflow
The great Book Messe is the autumn opener for me. As showers and brown leaves gusted across the huge fairground, I seized the hour for bier and knackwurst, and contemplated the future in the light of a kindly young woman having stood up on the Pendelbus to allow my ancient self, lame of leg and rheumy of eye, to sit down. Having concluded that this was evidence that the human race has hope, I had another bier.
Yet despite this blip, the highlight of the fair for me was not a book or a party, though plenty of both were in evidence, but an interesting conversation with a group who really knew what they were talking about at one of the International Friday conference sessions. Here, in the Production stream (how very strange it now seems to call data part of “Production”!) we did a session on Discoverability and Metadata. As speakers we had Jason Markos, Director of Knowledge Management at Wiley, to get us started, followed by Timo Hannay, CEO of Macmillan Digital Science; Dr Sven Fund, CEO of De Gruyter; and, to keep us technologically honest, Andreas Blumauer, CEO of Vienna’s Semantic Web Company. So, a mass of talent, from whom came massive elucidation of what I take to be a critical developmental issue for STM today and for the rest of the information marketplace tomorrow. The problem of knowledge. And the problem behind it: when we have solved the knowledge problem, will we ex-publishing groundlings still be needed?
Jason got us afloat in a very admirable way. As we move from a world of documents and segments of former documents (book, journal and article moving to lower levels of granularity – abstract, reference, citation), we eventually recognize that entity extraction and text enrichment become ways of interconnecting thoughts and ideas horizontally, in knowledge structures that surface insights which were not effectively available in the years when we were searching documents for word matches. Once we are underlining meaning with a range of possibilities, and allowing semantic web analysis to use knowledge of meaning in context to illuminate underlying thinking (along with what is not on the page but is implied by what we have read or written), then we are into a Knowledge game which moves past content and beyond data and into some very new territory.
Companies like Wiley and Macmillan and Elsevier and Springer will exploit this very effectively using their own content. In disciplines like chemistry, building knowledge stores and allowing researchers to archive both effective and failed discovery work will become commonplace. Extended metadata will take us beyond the descriptive towards recording analytics and following knowledge pathways. People like Timo will create the knowledge-based infrastructure that allows this to become a part of the workflow of science. Sven will keep our feet on the ground by ensuring that we do not try to sell concepts before users are ready, and Andreas will help us to master the disciplines of the semantic web. And then, just as I was padding round the audience with a microphone picking up some really interesting questions, our little theatre was over: we could strut and fret no more, and the audience could escape from Frankfurt’s economy drive of the year – no wifi in conference spaces!
So I was left on the Pendelbus and under the biergarten tarpaulin to ponder the impact of all of this. In the self-publishing future, when scholars publish straight to figshare and F1000 does post-publication peer review, the data to be organized will first have to be collected. Indeed, current Open Access has already begun the fragmentation. As knowledge structures grow, some scholars will demand that, except in extreme circumstances, they never see primary text, but work only on advanced, knowledge-infused metadata. Further, that metadata will have to cover everything relevant to the topic. Will the lions and lambs of Elsevier and Wiley, Springer and Macmillan, lie down with each other and co-operate, or will external metadata harvesting become a new business, over the heads of primary content players? And will it be professional bodies like ACS or RSC who do this – or technology players? Knowing where everything is, being able to visualize its relationships with everything else, and being able to organize universal knowledge searching may be a different business from the historical publishing/information model. And the metadata that matters in this context? Who owns it may depend on the context in which it is collected. Certainly owning the underlying content would seem to give that owner very few rights to ownership of metadata derived from enquiries and research that included it, yet here I predict future Sturm und Drang as publishers seek to own and control the extension of metadata into the knowledge domain. And if these are autumnal topics, what can winter be like when it comes?