If you are an STM publisher reading this, then it may already be too late for you to act decisively enough to put yourself in the vanguard of change. For I am not the first to say what I am about to say, and there is now a good literature built around the idea that the network is a world of small beginnings, followed by mass change at unprecedented rates that catches whole industries unawares. We are coming to one of those points, and my growing realization was triggered into certainty by being sent a link to a Harvard Business Review article from November 2010 (thank you, Alexander van Boetzelaar, for making sure I saw this). Since HBR, as an old-world publisher, makes a business of paid-for reprints, I cannot give the link, but it is reprint R1011B.

The article is called “The Next Scientific Revolution”, by Tony Hey, a director of Microsoft Research and one of the Fourth Paradigm people who made such an impact in 2009. Their arguments, pioneered by the late Jim Gray, saw scientific enquiry gathering force as the experimental methods of early Greece and China were subsumed into the modern theoretical science of the Newtonian age, and then carried forward through computation and simulation into the age of high-performance computing in the last century. So now we stand on the verge of a fourth step: the ability to concentrate unprecedented quantities of data and apply to it data mining and analytics that, unlike the rule-based enquiries of the previous period, are able to throw out unsuspected relationships and connections which in turn become the source of further enquiry.

All of this reminds me of Timo Hannay of Nature and his work with the Signalling Gateway consortium of cell science researchers based in San Diego. I am not sure how successful that was for all parties involved, and to an extent it does not matter (especially given the lead in experience that the work gave Nature). To me this was a signal of something else: on the network the user will decide and make the revolutionary progress, and we “publishers” will have to be ready in an instant to follow, developing the service envelope in which users will be able to do what they need to do. At the moment we are all sitting around in STM talking about overpublishing – the impossibility of bench science absorbing the soaring output of research articles, or of libraries keeping up on restricted budgets – when the real underlying problem we are not seeing is that the evidence behind those articles is “unpublished” and unconcentrated, and that as the advanced data mining and analytics tools become increasingly available they have insufficient scale targets in terms of collected data.

Of course, there are big data collections available already, and their usage and profitability are significant. Many are non-profit and some are quasi-monopolistic. But I see huge growth in this area, especially in physics, chemistry and the life sciences, to the point where “evidence aggregation and access management and quality control” is the name of the business, not journal publishing. Mr Hey comments in his article: “Critically, too, most of us believe scientific publishing will change dramatically in the future. We foresee the end product today – papers that discuss an experiment and its findings and just refer to datasets – morphing into a wrapper for the data themselves, which other researchers will be able to access directly over the internet, probe with their own questions, or even mash into their own datasets in creative ways that yield insights that the first researcher may never have dreamed of.”

What does “access directly” mean in this context? Well, it could mean that universities and researchers allow outside access to evidential data, but this poses other problems. Security and vetting loom large. Then again, evidential peer review may be a requirement – was the evidence created accurately, ethically and using reliable methodologies? Plenty of tasks for publishers here. Then again, can I hire tools to play in this sandpit? Is the unstructured content searchable, and is the metadata consistent and reliable? These are all services “publishers” can offer, in a business model that attracts deposit fees for incoming data as well as usage fees. But there will be natural monopolies. It may be true, as Mr Hey claims, that “through data analysis scientists are zeroing in on a way to stop HIV in its tracks”, but how many human immunodeficiency virus data stores can there be? Right, only one.

So the new high ground will have fewer players. A few of those will be survivors from the journal publishing years, and I hope one at least will have the decency to blush when recalling the pressure put on people like me, in my EPS days, to remove the ever-growing revenues of the science database industry (human genomics, geospatial, environmental, for the most part) from the STM definition because it was not “real” science publishing – and so reduced their share-of-market figures! But then again, maybe they should look around them. Isn’t what is being described here exactly what LexisNexis are doing with Seisint and Choicepoint, or Thomson Reuters with Clearforest? And why? Because their users dictate that this shall be so. For the same reason it is endemic in patent enquiry: see my erstwhile colleague David Bousfield anatomizing this fascinatingly only last week (https://clients.outsellinc.com/insights/index.php?p=11416). And why have market-leading technology companies in this space – think of MarkLogic and their work on XML and the problems of unstructured data – made such an impact in recent years in media and government (aka intelligence)? I see a pattern, and if I am right, or even half right, it poses problems for those who do not see it.

I rest my case. Next Friday I shall do the Rupert Murdoch 80th birthday edition, for which I plan to bake a special cake!

Now, strategy is simple; execution is the real difficulty. Having written strategy for my friends in the industry for the past 25 years, I know the truth of that. And if we are going to deal in truth for a change, I was a dab hand at strategy as a digital law publisher, but found turning those elegant bullet points into service values and USPs that people would pay for a far more difficult game.

So here is a chance to salute a master this week, and at the same time acknowledge another truth: to be a maestro you need an orchestra, and it is very difficult to execute anything in a place which is not receptive to change. So it is a good job that Dr Timo Hannay works at Macmillan, which has produced a management that welcomes change, and a trading atmosphere that concentrates on the essentials while coping with customers forever on the move and shifting their priorities. The strength of this mix is shown in last week’s announcement of the long-awaited Digital Science Ltd, which solves two problems in one: “How does Macmillan/Nature punch above its weight in a market of larger players like Elsevier, Wiley and Springer?” and “How do we find a suitable role for our chief technology change agent and strategy inventor, given that his Nature inventions must now be given time to shake down and mature?”

In another age that second question would have been disallowed. At length we are beginning to realize, in the industry formerly known as publishing, that talent is scarce and must be nurtured. And the first question would have been answered by lateral growth: publish more things in more subjects. Fortunately, the networked publishing world widens the options, and a content provider can now relocate himself to another place in the value chain and compete with his more traditionally minded competitors in a wholly different way. Digital Science seems to me to be a prime example of this strategy on the move. There are limits to how much can be cloned under the Nature brand, which is already a broad-based journal publishing brand now erupting into education and into collateral ebook developments. The time of rapid service experimentation is over, and the bits that work have been identified and are in the process of being iterated (see Nature Networks and its recent announcements). There is clearly recognition that growth from this base is ongoing but structurally finite: any ordinary publisher at this point would make an expensive acquisition, fire half of the new staff and spend five years cutting costs while finding out which things worked and scrapping the rest.

Not the Macmillan way, at present. The option taken has been to re-concentrate on the working processes of the researcher. Not “how can we sell him more articles?” but “how can we help him to organize himself more productively, make better decisions over the content he uses from all sources, and, possibly, stay within ethical and academic guidelines for what constitutes good research?” In other words, Digital Science is an elegant workflow play in the making.

This sounds like a delightfully easy strategy piece to write: I may have written it myself several times in the past few years. Move up the value chain to a point in the workflow where you can provide process tools and support. Then develop said tools and become the integrated point of analysis for all content – your own, third party, and user-derived. Here you get growth, greater knowledge of changing customer behaviours and a locked-in market that finds it hard to leave the bar once it has bought the first drink.

But the power lies in the execution, not in the strategy. So Timo and his colleagues have beaten the bushes for tools and environments that users/researchers really respond to, and coupled them up as acquisitions to create not a 1+1=1.5 scenario, but instead a 1+1+1=4 configuration. There is chemistry in everything in science, so SureChem, a specialized text mining application (and also a patents search engine), was a natural building block; Macmillan bought it last year for Digital Science. Then add an equity stake in BioData Ltd, a lab management outfit designed to be a network-based answer that avoids the complexities of an Oracle enterprise solution. Bring to the boil with Symplectic, a toolset to improve researcher productivity by tracking the writing and recording of findings through to publication. As institutional repositories continue to grow, and academics and their administrators need to track versioning, control deposits and manage bibliometrics for research assessment and other exercises, this becomes more and more central.

All of this sounds like a life sciences concentration, and of course that would reflect Macmillan’s other interests as well as one of the fastest growth points in the sector. Symplectic will link to grant applications and proposal development, which completes another wing of the workflow. No doubt (an old hobby horse of mine) they will also look at the electronic lab manual as a point of synthesis for individual researchers, as well as a way of demonstrating due diligence and regulatory compliance.


And of course, it is not that these attributes do not exist elsewhere. Thomson Reuters have a strong holding of productivity tools for writing, linked to Web of Science. Elsevier have strength of a different kind, in science search and in abstracting and indexing services. What Digital Science appears to want to do is integrate its attributes on the research workbench and then go and get the rest of the requirements and integrate them as well. This strategy has taken a year to execute and is now (December 7) announced. It represents a new growth point and a pointer to where, after content, the competitive pressures will be felt. Really, I wish I had done that…
