Jan
18
Workflow from the Bottom Up
Trends and trending analysis are one thing; making an impact on the way people work is often quite another. So while I respectfully write up the huge progress being made to provide large scale tools for analytical discovery in unimaginable quantities of data, a small portion of me remains skeptical about the impact of these developments in the short term on the working lives of professionals. Look at researchers in science and technology: you can readily imagine the impact of Big Data on Big Pharma, but can you so easily imagine what this will mean in materials science? Or can you see how the workbench performance of the individual researcher in neuroscience might be impacted? It’s tough, and because it is tough we go back to saying that the traditional knowledge components will last the course. So if you have a good library, access to a reasonable collection of journals and the ability to network with colleagues then that is enough. Or Good Enough, as we keep saying.
So when I read the words “This is important not only for the supplementary data accompanying one’s experiment, but even negative results” I came alive immediately and read consciously what I had hitherto skipped. You see, in all the years that I have spoken with and interviewed researchers, when we get off the formal ground of OA or conventionally published articles, or the iniquities of publishers and the inadequacy of librarians, we get back to some stubborn issues that cling to the bottom of the bucket. One is what you do with the remaining content derived from the research process which did not get into the article, where it was summarized and where conclusions were drawn from it. I mean the statistical findings, the raw computations, the observations and logs, the audio and video diaries, the discarded hypotheses, etc. Vital stuff, if anyone is going to walk that way again. Even more vital is the detritus of failure: the experiment which never made a paper since it demonstrated what we already know, or where the model proved inadequate to demonstrate what we sought to show. Researchers going back to find why a generation of research went astray from a finding that proved fallible often need this content: in terms of detective fiction it is the cold case evidence. Yet more often than not it is not available.
So here is what I found in the nearly discarded press release. Nature Publishing’s Digital Science company (yes, them again!) have refinanced figshare (http://figshare.com) and yesterday they relaunched it. What does it do? It archives all the stuff I have been talking about, providing a Cloud environment with unlimited public storage. They call it “a community-based open data platform for scientific research”. I call it a wonderful way of embedding research workflow into a researchable storage environment that eventually becomes a search magnet for researchers wanting to check the past for surprising correlations. At the moment it is just a utility, a safe place to put things. But if I just add a copy of the article itself then it becomes a record of a research process. Put hundreds of thousands of those together and then you have a Big Data playground. Use intelligent analytics and new insights can be derived, and science moves forward on the tessellate of previous experimentation – only quicker, with less effort and more productivity for the researcher. And much less is lost, including the evidence from the wrong turnings that turned out to be right turnings. (http://digital-science.com/press-releases/)
So will there be 20 of these? Well, there may be two, but if figshare gets an early lead perhaps there will only be one. After all, the reason researchers would come to value this storage would be having their content in close proximity to others in their field. And while early progress is likely to run quick in Life Sciences, this application has relevance in every field of study. And it also calls into question ideas of what “publishing” actually is. By storing and making available these data, are figshare “publishing” them? They are certainly not editing or curating them. Network access alters many things and here, once again, it catches publishing on the hop. If traditional publishers confine themselves to making margins solely from the first appearance of an article then traditional publishing in this sector is in severe difficulty, whatever happens to the Open Access debate. Elsevier and Nature clearly get it: go upstream in value terms or drown in commoditized content where you are. But does anyone else see it? And why not?
Comments
5 Comments so far
As a librarian, the comment about the “inadequacy of librarians” struck me- can you elaborate on what you mean by this?
Simply this. Talking to researchers I always seem to spend the first 10 minutes hearing their views on how both publishers and librarians have failed them. Publishers in these conversations are centered on making money and are risk averse. Librarians are characterized as being irrelevant to the activities on the research bench, remote from the real world and trying to re-invent their jobs to avert redundancy. Criticisms of both are overstated of course but these attitudes do explain why researchers seek workflow solutions, not help from their traditional intermediaries.
Interesting, thanks for expanding. I would argue (perhaps inevitably, given my vested interest and role as an Institutional Repository person) that librarians continue to do useful work in selection, resource management and access, open access advocacy and systems, information and digital literacy, and doing interesting things with study spaces. Certainly all this happens at my institution. I admit it’s difficult to be “there” at the bench, but we do collectively try- for example, repository people are looking at ways to help researchers with managing “big data”. Perhaps the problem is that we collectively aren’t always good at promoting this host of services- but then in some ways, it’s the librarian’s role to be “in the background”.
Anyway, end of rant, thanks for a thought-provoking post. Figshare looks like an excellent resource, and the Figshare people have been prominent in repository and data management circles recently.
Thanks for an insightful post. I’m curious why you say at the end ‘Elsevier and Nature clearly get it’ – Nature yes, given that NPG are investing in figshare, but I haven’t seen a mention of Elsevier here. Perhaps it is in a previous post of yours – if so, could you direct me to it?
Anna, I have made constant reference to Elsevier in these blogs – please search under “elsevier” to see – but the point I am really making here is that when others gave up the technology arms-race (Springer) or went to database and no further (Wiley), Elsevier committed a constant stream of investment over a decade (ScienceDirect, Scirus for search, Scopus for abstracting, SciVerse for workflow applications, article enhancement, citation support etc). This is the direct opposite of the Nature approach (investing in start-ups and trying to spot niche market gaps), but it brings you out at the same point in the end, which is support for the researcher at the workbench in pursuit of his job functions. The effect of both is to move beyond mere article publishing into a full service environment. As articles are shared more widely in the network and, despite ownership restrictions, become more widely and freely available, this move up the value chain will become more and more important to those who can make it, but it is a continuing process and is never quite completed.