Jul 23
The End Of The Beginning Of The End?
Filed Under Artificial intelligence, Blog, data analytics, healthcare, Industry Analysis, internet, machine learning, Publishing, Reed Elsevier, semantic web, STM, Workflow
Is this Elsevier’s “music industry” moment? As more news emerges of German academics denied continuing access to journals, while Projekt Deal talks in Germany appear becalmed, there will certainly be anti-commercial publishing opinion in academe that hopes so. The whole debate on German access to Elsevier looks more and more like Britain’s Brexit talks, with one party in each case stating its minimum terms and not seeing any reason to settle for less, while the other reiterates “final positions” without getting any closer to a deal. And Elsevier will be as keenly aware as the poor UK trade negotiators that a false move in the push for a deal with someone who does not need to compromise simply hardens resistance to compromise. Those of us who have relied on the “surely good sense will prevail amongst people of good will on both sides” argument begin to despair of both sets of negotiations.
So what happens in a digitally networked world when parties fail to agree? Those with most skin in the game get hurt first. When the music industry faced the problems of download and disc burning it wasn’t strict enforcement of copyright that saved them from users who knew what they wanted and had the technology that could do it. Instead, music owners and distributors were forced to accept co-operation with the technology players as the price for continued participation in a, for them, smaller but still profitable market. And with that came consolidation, a different sort of investment profile and a new relationship with the only really powerful people in a networked world – the end users.
And in Elsevier’s world those users have never been more powerful. As Joe Esposito rightly suggests in Holly Else’s Nature article (19 July), there is Sci-Hub for a start. But there is more than that. Social networking has already been widely used to distribute articles. Many academics are acutely aware of who they most want their readers to be and regularly circulate to them. “Good enough” publishing on pre-print servers proliferates. Institutional and individual reputation management raises its game. It is not that the whole and holy progress of traditional academic publishing comes to a halt – simply that water finds its way round a dam – and then gets used to and deepens the new water courses. Do we really need articles? Can we just report the data? Can that and our discussions about it be cited? High-level science research already carries huge cost and time pressures around research publication. Elsevier must be anxious in Germany about creating a breakpoint that drives publication in-house.
At this point I always find it useful to ask a silly question. And a favourite is “What would Steve Jobs have done with this problem?” Irrational responses do sometimes win markets. And Jobs after all responded to the levelling off of consumer computer markets by inventing the computer as a hub, to run iPhone, iPad, iPod etc. So, Steve, what do you think?
STEVE JOBS: “Well, I would scrap this Projekt Deal for a start. It’s going nowhere. Just walk away. Tell them you are not interested anymore…
Then I would go to the Federal government and say ‘Can you get the research institutes, the universities and everyone concerned with research funding round a table? We have a plan to increase German research funding by 5-7% per year for five years without it costing the German taxpayer a cent.’
Then I would say to my people at Elsevier: we are the technically best equipped company in the sector. For 25 years we have invested in Science Direct, Scirus, Scopus, SciVal and the rest. We know the future is not in journals or even in content but we find it hard to divorce from the past and embrace the future. So we need a learning experience, to teach us how our next market works. But it comes at a price.
Then I would say to the German government: We want the contract for intelligent services and risk management in German research. We will put all our technologies into this deal. Its scope will be providing your research communities with ways of mapping prior and current work, in Germany and elsewhere, evaluating success or failure in current work, providing intelligent tools to give every researcher full-beam headlights in their niche, showing German research where its major collaborative possibilities and competitive pressures lie, giving government and institutions unique insight into where quality of outcomes lies – and where current funding is being wasted. We offer you a five-year deal to populate all your systems with our knowledge, and since we are learning, building alongside you, and developing some new things on which you can earn royalties in future, we also offer you a special price. Now can we start negotiating?
Oh, yeah, I almost forgot. We also have a special workflow deal – help us build the smoothest, most hassle-free workflow system for uploading quality-assured articles and you will never pay more than $1000 per article in APCs. And, as I always said at the end of presentations, One More Thing… access to ALL journals is completely free to all Elsevier-registered German users for the life of this contract.”
There are of course no instant solutions and no predictability. But RELX investors, industry analysts and anyone trying to get an IPO off the ground will be hoping that someone somewhere will be able to find a breakpoint – here as well as with Brexit.
Jul 13
Everything is AI, except the business model…
Filed Under Artificial intelligence, B2B, Big Data, Blog, data analytics, internet, machine learning, Publishing, RPA, Search, semantic web, Workflow
Dear reader, I am aware that I have been a poor correspondent in recent weeks, but in truth I have been doing something I should have done long ago: gaining some experience of AI companies, talking to their potential customers and reading a book. Let’s start at the end and work backwards.
The book that has eaten the last week of my life is Edward Wilson-Lee’s fine new publication, The Catalogue of Shipwrecked Books, which describes the eventful life of Christopher Columbus’ illegitimate son, Hernando, and his attempts to build a universal library of human knowledge. Hernando collected printed works, including pamphlets and short works, in an age when many scholars still regarded all print as meretricious rubbish. He built a catalogue of his collection, and then realised that he could not search it effectively unless he knew what was in the books, so started compiling summaries – epitomes – and then subject indexing, as well as inventing hieroglyphs to describe the physical properties. In other words, in the 1520s in Seville he built an elaborate metadata environment, but was eventually defeated by the avalanche of new books pouring out of the presses of Venice and Nuremberg and Paris. Wilson-Lee very properly draws many parallels with the early days of the Internet and the Web.
As I closed this wonderful book, my mind went back to an MIT Media Lab talk given by Marvin Minsky in 1985. We need reminding how long the central ideas of AI have been with us. At the end of his talk, the Father of AI kindly took questions, and a tame librarian in the front row asked “Professor, if you were looking back from some inconceivably distant date, like, say, 2020, what would you have then that would surprise you we do not have now?”. After a thoughtful moment, the great man replied “Well, I guess that I would praise your wonderful libraries, but still be surprised that none of the books spoke to each other”. At that he left the room, but from then on the idea of books interrogating books, updating each other and creating fresh metadata and then fresh knowledge in the process of interaction has been part of my own Turing test. So I find it easy to say that we do not have much AI in what we call the information industry. We have a meaningless PR AI, a sort of magic dust we sprinkle liberally (AI-enhanced, AI-driven, AI-enabled etc) but few things pass the “books speaking to books and realising things not known before” test.
And yet we can and we will. The key questions are, however: will current knowledge ownership permit this without a struggle, and will there be a dispute over the ownership of the results of these interactions? This battle is already shaping up in academic and commercial research, so it was dispiriting to find when talking to AI companies that there is, as yet, really no business model in place enabling co-operation. Partly this is a problem of perception. Owners and publishers see the AI players as technicians adding another tier of value under contract – and then going away again. The AI software developers see themselves as partners, developing an entirely new generation of knowledge engine. And neither of them will really get anywhere until we all begin to accept the implications of the fact that no one, not even Elsevier, has enough stuff in one place to make it work at scale. And while one can imagine real AI in broad niches – Life Sciences, say – the same still applies. And if we try it in narrow niches, how do we know that we have fully covered the crossovers into other disciplines which have been so illuminating for researchers in this generation? In our agriscience intelligent system how much do we include on food packaging, or consumer market research, or plant diseases, or pricing data?
So what happens next? In the short term it is easy to envisage branded AI – Elsevier AI, Springer Nature AI? I am not sure where this gets us. In the medium term I certainly hope to see some data sharing efforts to invest in AI partnerships and licence data across the face of the industry. It is true that there are some players – Clarivate Analytics for example, and in some ways Digital Science – who are neutral to the knowledge production cycle and have hugely valuable metadata collections. They could be a vital building block in joint ventures with AI players, but their coverage is still narrow, and in the course of the last month I even heard a publisher say “I don’t know why we let Clarivate use our data – we don’t get anything for it!”.
Of course, unless we share our data we are not going to get anywhere. And given the EU Parliament’s rejection of data metering and enhanced copyright protection last week, all these markets are wide open for massive external problem solving – who remembers Google Scholar? The solution is clear – we need a collaborative model for data licensing and joint ownership of AI initiatives. We have to ensure that data software entrepreneurs get a payback and that investment and data licensing show proper returns, just as Hernando rewarded the booksellers who collected his volumes all across Europe. In a networked world collaboration is often said to be the natural way of working. It is probably the only way that AI can be fully implemented by the scholarly communications world. Hernando died knowing his great scheme had failed. AI will succeed if it shows real benefits to research and those who fund it. As it succeeds, it will find other ways of sourcing knowledge if those who commercially control access today cannot find a way of leading the charge rather than being dragged along in its wake.