New book by David Worlock. Pre-order now at Marble Hill Publishers or Amazon.

A small Cotswold farm is the setting for a classic struggle of wills. Robert Worlock, eccentric and demanding, resolutely maintains the old ways, determined above all to make his son into a farmer fit to take over the family acres. His son, David, is equally determined not to be bullied into something he neither wants nor likes. His childhood becomes a battleground: can he find a way to make his father love him without denying his right to determine his own life?

 

When I first talked about open access and the decline of the scientific journal, 20 years ago, it was fortunate that I had Derk Haank available to tell the world not to listen to demented consultants with no skin in the game. When I spoke, some 15 years ago, about the inevitable decline of the subscription science journal, it was pleasing to hear Kent Anderson reassuring us all that I was simply a mad dog out on license. Now, as I read the strategy revision for their open access policy published by the Gates Foundation on April 7, I am very happy to indulge the Panglossian philosophers of the scholarly communications marketplace once again. While I wait for them to tell us that nothing has really changed and everything will go on just as before in the best of all journal publishing worlds, I am heading down to the marketplace to link arms with Cassandra. We shall chant “O woe! O woe! The day of the open access journal is nearly over, and its end can be told with confidence!”

Of course, this might take another 15 years. I’ve reached an age myself when time is not a very worrying factor. In the 57 years that have passed since I started work in the educational and academic publishing sector, I have been acutely aware that commercial publishers, while being politely prepared to entertain speculation about the future, have necessarily to attend to this year’s financial results and the expectations of investors. When my speculations were deemed too far-fetched, my clients in the boardroom tended to say “our strategies are clear – follow the money!” Today, my response to them would be quick and immediate: “Watch what the funders are doing with the money, and then follow the data!”

Many will argue that Gates is a small funder in terms of article contributions. Its work creates around 4,000 articles a year, and through its payment of APCs it contributes a mere $6 million per annum to the coffers of scholarly publishers. But it is an influential player, and in its revised open access strategy it may have detected something which is present in the minds of the larger funders, and eventually of governments themselves. What is the duty of the funder in terms of ensuring that articles detailing research results are available to the community at large? In the time of Henry Oldenburg in the 1660s, the answer would have been to get them into the Philosophical Transactions of the Royal Society. Today, it is to get them onto an authorised pre-print server with a CC-BY license as soon as possible after the research is completed and the article is ready, and to accompany them with linked datasets of the evidential material on a similar license on a similarly approved site. Speed is of the essence; access for all is key and critical. Subsequent reuse of the material in a journal, subsequent acts of peer review and downstream reuse are not the key concerns of the funding foundation. By this fresh twist in its open access policy, the Gates Foundation have saved $6 million, which can now go back into the research fund. And by using F1000, who already supply the internal Gates publishing systems, to create F1000 VeriXiv, the pre-print server of choice, they have provided tools which researchers can use (or not) to fulfil the mandate.

If other funders follow this route, then the scholarly communications research community in science faces a choice. For many, more pressurised by getting the next research program under way than by anything else, it will be simple to leave things there, and not necessarily press forward to eventual journal publication. For others, given the needs of institutions for publication, to secure tenure or satisfy other funders’ requirements, publication will remain essential until the way in which science results are assessed begins to change. One of the things that I recall from conversations with Eugene Garfield in the 1980s was his repeated assertion that better ways than citation indexing would be found to assess the worth of science research articles. Like Winston Churchill on democracy, he maintained vigorously that what he had created was the “best worst way” of doing the job. The challenge now, I would suggest, is whether some latter-day Garfield can repeat his 1956 breakthrough, and create a way of indexing and illuminating what is good science for a modern world. That measurement and indexation has to be available as soon as possible after the first appearance of the claim, wherever it appears in digital form. In the meanwhile, getting the knowledge immediately into the marketplace, and getting the data available to aid reproducibility, supports other research in progress and supports integrity. And that is critical for funders and researchers alike.

Such new systems will emerge in their own time. In the meanwhile, the ways we measure achievement, which have been gamed and manipulated endlessly and need in any case to be renewed or replaced, experience increasing pressure. This applies as much to peer review as anything else. If publishers are to stay in the loop, then they need to change their relationships as well. As the relationship between Gates and F1000 shows, whatever takes place in terms of “publication”, and where it takes place in the ecosystem, may become more important to the institution or the funder than to the researcher or the research lab. In terms of attracting sponsorships, investment, and industrial research cooperation, universities may have more interest in publication than most, especially if the research community sort out a better way of ranking science than by citation indexing. (Footnote: what a clever man that Vitek Tracz was! The Tesla of science publishing! Long after his retirement, we shall be using the tools he created for white label sponsored publishing!)

So there it is! Cassandra and I have now done a full lap of the forum, and I can feel that the rotten vegetables are getting ready to fly through the air! Next time, if I survive, I plan to “follow the data” myself, and look at the role of publishers as data aggregators, data curators, and data traders. And we shall remember the old saying: “How do you know if the searcher is a person or a machine? Well, only machines read the full article!”

 

Who benefits is never a bad question to ask. In my mind, after long years in the information industry, it is a question closely related to “follow the money”. And it is very much in my mind at the moment, since I have been reading the UK Information Commissioner’s consultation (https://ico.org.uk/about-the-ico/what-we-do/our-work-on-artificial-intelligence/generative-ai-second-call-for-evidence/) on the use of personal data in AI training sets and research data. The narrative surrounding the consultation evokes, for me, all sorts of ideas about the nature of trust.

Let me try to explain my ideas about trust, since I think the subject is becoming so controversial that each of us needs to state their position before we begin a discussion. For example, I trust in the brand of marmalade to which I am fairly addicted. My father was an advocate of Frank Cooper’s Oxford Marmalade, and this is probably the only respect in which I have followed him. We certainly have over 100 years of male Worlock usage of this brand of marmalade. Furthermore, in modern times, the ingredients are listed upon the jar, together with any chemical additives. Should I suffer a serious medical condition as a result of my marmalade addiction, I can clearly follow the trail and find where it was made and the provenance of its ingredients. And in the 60 or so years that I have been enjoying it, it has not varied significantly in flavour, taste or ingredients.

I also believe, being a suspicious countryman, in something that I call “the law of opposites”. Therefore, when people say that they “do no evil” or claim that they practice “effective altruism”, then I wonder why they need to tell me this. My bias then becomes the reverse of their intentions: I tend to think that they are telling me that they are good because they are trying to disguise the fact that they are not. This becomes important as we move from what I would term an open trust society – exemplified by the marmalade – into a blind trust society – exemplified by the “black box” technology, which, we are told, is what it is, and cannot be tracked, audited or regulated in any of the normal ways.

The UK Information Commissioner has similar problems to mine, but naturally at a greater level of intellectual intensity. In their latest consultation document, his people ask whether personal data can be used in a context without purpose. Under data privacy rules, the use of personal data, where permitted, has to be accompanied by a defined purpose. Whether the data is used to detect shifts in consumer attitudes or to demonstrate the efficacy of a drug therapy, the data use is defined by its purpose. General models of generative AI, with no stated or specific purposes, violate current data protection regulation if they use personal data in any form, and this should set us wondering about the outcomes, and the way in which they should earn our trust.

The psychologist Daniel Kahneman, who died this week, earned his Nobel prize in economics for his work on decision-making behaviours. His demonstration that decisions are seldom made on a purely rational basis, but are usually derived from preferences based on bias and experience (whether relevant or not), should be ever present in our minds when we think about the outputs of generative AI. Our route to trusting those outcomes should begin with questions like: what is the provenance of the data used in the training sets? Do I trust that data and its sources? Can I, if necessary, audit the bias inherent in that data? How can I understand or apply the output from the process if I do not understand the extent and representativeness of the inputs?

I sense that there will be great resistance to answering questions like this. In time there will be regulation. I think it is a good idea now for data suppliers and providers to annotate their data with metadata which demonstrates provenance, and provides a clear record of how it has been edited and utilised, as well as what detectable bias was inherent in its collection. One day, I anticipate, we shall have AI environments that are capable of detecting bias in generative AI environments, but until then we have to build trust in any way that we can. And where we cannot build trust we cannot have trust, and the lack of it will be the key factor in slowing the adoption of technologies that may one day even surpass the claims of the current flood of press releases about them. Meanwhile, cui bono? Mostly, it seems to me, Google, Microsoft, OpenAI, Meta. Are they ethically motivated or are they in it for the money? For myself, I need them to clearly demonstrate in self-regulation that they are as trustworthy as Frank Cooper’s Oxford Marmalade.

