Dec
31
Simple Rules for New Year’s Blogging
Filed Under Artificial intelligence, Big Data, Blog, data analytics, Education, Industry Analysis, internet, machine learning, mobile content, news media, Uncategorized, Workflow
Apologies to those kind readers who expected an earlier interjection in December. Truth to tell, I was speechless. Caught somewhere between astonishment at my fellow countrymen’s mania for national self-harming, my own complete self-identification historically, culturally and psychologically as a “European”, and impatience with all the wise and honest Americans who I know and who cannot collectively somehow re-enact the Emperor’s clothes nursery tale, there suddenly seemed nothing left to say worth saying, least of all around the topic of electronic information and digital society.
But then I returned to Nova Scotia again for the holidays, and in its clear, cold, sunny air it seems a dereliction of a blogger’s duty not to have a message at New Year. And by dint of looking over everyone’s shoulders, I see that Rule One of the New Year message is to make a recommendation, preferably to nominate something as the something of the Year. And as it happens I do have a Book of the Year for this information industry. Please read The Catalogue of Shipwrecked Books, by Edward Wilson-Lee. The inevitable pesky publisher’s subtitle in the US purports to sell it as a book about Christopher Columbus and his son, but the UK edition hits the point – it is about the attempt by Columbus’s son to build a universal library in Seville, getting royal patronage and setting up buying agents in the great early cities of print to create an early Internet Archive, making available a stream of knowledge as rich as the gold and silver of Peru and Mexico just then flowing into the royal coffers.
The attempt fails of course, but it does set off arguments about the nature of Knowledge which we need to keep having as we dimly perceive the arrival of the leading edge of the development of knowledge products and solutions. And here comes Rule Two: Issue a Warning. And here is mine – Refrain in 2019 from labelling everything you see as AI-sourced, AI-related or AI-derived. We are still in the Colon Columbus stage in building the universal knowledge base. Let’s save AI as a term for when AI arrives. Many people are doing really clever things, but they are at best embryonic knowledge products. We are really quite far away from new knowledge created in a machine-driven context without human intervention. Indeed we are still a long way from getting enough information as metadata in a machine-understandable form, and when we do we usually do not understand what we have done.
So here comes Rule Three: declare a News Story of the Year. And here is mine. The gracious acknowledgement by Amazon that their automated recruitment system, which analyses thousands of CVs to produce the best candidates, had a male bias built into it. And of course it did! Feed the past into an expert system and it replicates the flaws of the past. And it’s not that the systems doing the analytics are not clever, it’s just that the dumb data and the dumb documents are not as dumb as we think, and in fact they are larded with all of the mistakes we have ever made. And we need to know that before we evaluate the outcomes as Intelligent, or even believable.
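For readers who like to see the mechanism, here is a minimal sketch of how a system fed on the past replicates the past. Everything in it is invented for illustration: the `history` records, the `fit_hire_rates` and `score` helpers, and the deliberately naive "model" that simply memorises historical hire rates per group. It is not a description of how any real recruitment system worked.

```python
from collections import defaultdict

# Invented historical decisions: (gender, years_experience, hired).
# At equal experience the past favoured men; that is the flaw baked into the data.
history = [
    ("m", 5, True), ("m", 5, True), ("m", 5, True), ("m", 5, False),
    ("f", 5, True), ("f", 5, False), ("f", 5, False), ("f", 5, False),
]

def fit_hire_rates(records):
    """Learn the historical hire rate for each gender (a deliberately naive 'model')."""
    hired = defaultdict(int)
    seen = defaultdict(int)
    for gender, _years, was_hired in records:
        seen[gender] += 1
        hired[gender] += int(was_hired)
    return {g: hired[g] / seen[g] for g in seen}

def score(candidate, rates):
    """Score a new candidate using the rate learned for their group."""
    gender, _years = candidate
    return rates[gender]

rates = fit_hire_rates(history)

# Two new candidates with identical experience receive different scores.
print(score(("m", 5), rates), score(("f", 5), rates))
```

Nothing in the code is "wrong" in a programming sense; the skew comes entirely from the records it was given, which is the point.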
And if we need to be careful about the nature of the information we are using, we need to deal in known quantities. Rule Four: try to make an insight. Mine concerns differentiating between data and documents. The other night, as one does on a cold and isolated coastline, we fell to discussing derivations. My wife produced her weightlifter’s copy of Merriam-Webster, and we got into derivations old-style. Datum, neutral, is always related to single objects of an incontrovertible nature. Documentum carries the idea of learning throughout its history. When we talk about content-as-data, what do we really mean? And when we talk about AI, do we speak of Intelligence created by machines deriving knowledge from pure data, or of machines learning from knowledge available, fallacies and all, in order to postulate new knowledge? We do need to be clear about our, as derived from our inputs, or we will surely be disappointed by what happens next. We need to start listening very carefully to conversations about concept analysis, concept-based searching and conceptual analysis.
Which logically brings me to Rule Five. End with a prediction. Mine would concern a question I asked in several sessions at Frankfurt this year and have had little but confusion as a result. My question was “What proportion of your readership is machines, and what economic benefits does that readership bring to you?” I think machine readership will become much more important in 2019, as we seek to monetise it and as we seek to evaluate what content in context means in the context of analytical systems. So just as none of us knew how many machines were reading us this year, next year I think most of us will be aware. And whether those were just browsers, or bots, or knowledge harvesters, or what?
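As a rough illustration of how one might begin to answer the "what proportion is machines" question, here is a small sketch that labels requests by their User-Agent string and reports the machine share. The `BOT_MARKERS` list, the `classify` and `machine_share` helpers, and the `sample` data are all assumptions made up for this example. Real bot detection is considerably harder, since many automated readers present themselves as ordinary browsers.

```python
from collections import Counter

# Hypothetical substrings that often appear in automated clients' User-Agent headers.
BOT_MARKERS = ("bot", "crawler", "spider", "python-requests", "curl")

def classify(user_agent: str) -> str:
    """Label a User-Agent string 'machine' or 'human' using a crude substring check."""
    ua = user_agent.lower()
    return "machine" if any(marker in ua for marker in BOT_MARKERS) else "human"

def machine_share(user_agents):
    """Return the fraction of requests whose User-Agent looks automated."""
    counts = Counter(classify(ua) for ua in user_agents)
    total = sum(counts.values())
    return counts["machine"] / total if total else 0.0

# Invented sample of User-Agent values, standing in for a real access log.
sample = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/64.0",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "python-requests/2.21.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14) Safari/605.1.15",
]

print(f"machine share: {machine_share(sample):.0%}")
```

A count like this says nothing yet about the economic benefit of that readership; it only makes the first half of the question measurable.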
And then I notice there is a Rule Six. You end by wishing every kind reader who reaches this point a happy, healthy and prosperous New Year, which I do for all in 2019. After all, using my rule-based system this column could be written by a machine next year – and read by one too!
Apr
20
Speed Up to Slow Down
Filed Under Artificial intelligence, B2B, Big Data, Blog, Education, eLearning, Financial services, healthcare, machine learning, news media, Publishing, RPA, Search, semantic web, STM, Thomson, Uncategorized, Workflow
Since I last wrote a piece here I am older by three conferences and an exhibition. And no wiser for having spoken twice on cyber-security, a subject that baffles me every time I stand up to talk about it. The simple truth is that the world is changing in the networks at a pace that bewilders, yet the visions we have of where we are going hang before us, tantalising but currently unattainable. Thus, if you ask me about the future of education, I can spin you a glowing tale of individuals learning individually, at their own pace, yet guided by the learning journey laid out by their teachers, who have now become their mentors. The journey is self-diagnostic and self-assessing, examinations have become redundant and we know what everyone knows and where their primary skills lie. Or in academic or industrial research, projects are driven by results, research teams recruited on that basis, and their reputation is scored in terms of the value their peers set on their accomplishments. The results of research are logged and cited in ways that make them accessible to fellow researchers in aligned fields – by loading and pointing to evidential data, or noting results and referencing them on specialized or community sites, or by conventional research reporting. Peer review is continual, as research remains valid until it is invalidated and may rise and fall in popularity more than once. And so on through business domains, medicine and healthcare, agriculture and the whole range of human activity…
But at this point, when I talk about the growing commonality of vision, the role of workflow analysis, RPA, what happens next with machine learning, the eventual promise of AI, a hand shoots up and I find myself answering questions from the ex-CFO/now CEO about next year’s budget, and when will the existing IT investment pay back, and can this all be outsourced and surely we don’t need to do any more than buy the future when it arrives? And of course these questions are all very pertinent. We all need to assure revenues and margins next year if we are to see any part of this future. And next year’s revenues will come from products and services which will look more like last year’s than they do like the things we shall be doing in 2025, even if we had an idea of what those might be. It is one thing knowing something about the horizons, quite another to design a map to get there. So at every point we seek every way we can to buttress future-proofing, and at the moment I am seeing a spate of that in acquisitions. Just as last year putting the word “Analytics” at the end of your name (Clarivate, Trevino) added a billion to the exit valuation, so this year the .ai suffix has proved to be a real M&A draw.
But those big Analytics sales were made, and will be onsold to people who want to expand their data and services holdings. The .ai sales are transplants from the seedbed, and far earlier stages of transplantation are involved. Having worked for some years as an advisor to Quayle Munro (now, as an element of Houlihan Lokey, part of one of the largest global M&A outfits) I realise that smaller and smaller sales may not be considered a good thing, but I cannot resist the idea that seeding some future tech developments into your incubator environment is going to have some really beneficial long-term effects. It already has at Digital Science. As Clarivate learns from what Kopernio knows, it will help. As the magic of Wizdom.ai rubs off on T&F, it will help there.
But, again, we are begging a hundred questions. Can you really future-proof by buying innovation? Well, only to a limited effect, but by having innovators inside you can learn a lot, at least from their different perspective on your existing customers. Don’t you need to keep them from being crushed by the managerial bureaucracy of the rest of the business? Yes, but why not try to free up the arthritic bits rather than treating the flexible bits? What if you have bought the wrong future tech? Even the act of misbuying will give you useful pointers the next time round, but if you have bought the right people they will be able to change direction. What if software people and text publishing people do not get on? They will need to be managed – this is your test – since if we fail the future will be conditioned entirely by software giants licensing data from residual fixed income publishers.
Are there any conditioning steps I should be taking to ease into this future? Yes, forget ease and go faster. Look first at your own workflow. To what extent is workflow automated? Do you have optimum ways of processing text? Are people or machines taking the big burdens on proofreading, or desk editing or manuscript preparation? Is your marketing as digital as it could be? Are you talking the language of services, and designing solutions for your users, or are you giving your users reference sources and expecting them to find the answers? Indeed, do you talk the language of solutions, or the ritual language of format – book, journal, page, article? Are we part of the world our users are entering, or are we stuck in the world they are exiting? The exhibition I attended this month was the London Book Fair. I love it in all its inward-looking entrancement with itself, and its love affair with the title Publisher, the profession for which no qualification other than skill at explaining away unsuccess has ever been required. I can only take one day since I rapidly become depressed. But still there were very sparky moments – an impromptu discussion with the Chennai computer typesetter TNQ (www.tnq.co.in) about their ProofControl 3.0 service told me that these guys are on the ball. But moments like this were rare. More often I felt I was watching the future – of the industry in 1945!