Feb 17
Index to Analytics
Filed Under B2B, Big Data, Blog, Financial services, Industry Analysis, internet, Reed Elsevier, Search, semantic web, STM, Thomson, Uncategorized, Workflow
A long time ago the Financial Times formed a joint company with the London Stock Exchange to exploit the FTSE share price index. I seem to recall that this was not a success, but a colleague of mine at the time joined the board, and I remember asking him what an index was for. He replied that it was a sort of branding statement, and that it also signalled you held the underlying data from which the index was created, should anyone want to look at it. And was that a good business? Well, not really, since few people were able to make sense of the underlying data. So it was mostly a brand thing, then? Well, yes. And a brand thing where, since most people say "footsie", the brand reference is lost in speech altogether.
I do not believe that they have the same view at FTSE now, and in a world currently rampant with indices it is interesting to check on the progress of players like Argus Media (www.argusmedia.com) who have used indexation powerfully to elevate a small player in energy and commodity data markets into a very powerful one. I wrote about this in November 2009 (https://www.davidworlock.com/2009/11/battle-of-the-indices/) and envisaged the war between Argus and McGraw Hill’s Platts in oil markets as a classic David-Goliath story – but one which would need to be followed up by the victor to consolidate the gain with a wider service base. To quote: “index publishing is becoming an interesting value phenomenon. It creates lock-in around which workflow activities and value-add analytics can be built. It gives brand focus and recognition. It provides contract opportunities to supply and maintain service points on client intranets. In truth, it is sexier than it sounds.”
In light of this I was delighted to find that Argus Media had made an important purchase in analytics software this month. Fundalytics "compiles, cleans and publishes fundamental data on European natural gas markets", and it is the first service acquisition of this type that the company has made. Starting with natural gas, however, it should be possible to create a wider range of analytics activities across energy markets, which are currently so very active, and across other commodity areas, like fertilizers, where the company is building a stronghold. Competition is obviously fierce, with direct pressure from Platts, about double Argus's size, and from RBI's smaller ICIS. And then there are the market information players who have always used the data and its primary analysis to build commentary services for both players and investors – the Wood Mackenzies and IHS operations, and, at a further remove, Michael Liebreich's New (now Bloomberg) Energy Finance and Thomson Reuters' Point Carbon. It is understandable that there would be heavy competitive pressure in such an important field, and rewards will align with the industrial, financial and political clout the whole field invokes. But of the companies mentioned here, some are primary data producers, some secondary, and some create market commentary without owning a data farm at all. Can they all survive, and, if not, what sort of equipment do you need to succeed?
This is why the Argus Media purchase is more important than its size or value. If we have learnt anything from the consolidation of service markets in the network in the past decade, it is, surely, that relatively few players are needed to provide the whole range of internet services, and that users do not lust for more – indeed, they seem to want one sure place to go, and an alternative in case their preferred supplier tries to abuse its pricing power. You could point to the history of Lexis and Westlaw in law markets for part of this story. Then they want from those two lead suppliers the ability to secure access to all the core data they need: to use that data and its analysis on the supplier's service, and to pull data into their own intranets for use alongside their own content; to access APIs which allow them to create custom service environments and keep them current as fresh value-add features are developed by the supplier; to use the supplier as the architect and engineer of workflow service environments, where news and updates are cycled to the right place at the right time, and where compliance with knowledge requirements can be monitored and audited; and, finally, they want the supplier to run Big Data coverage for them, using its analytical framework to search wide tracts of publicly available data on the internet, securing connections and providing analysis which could never have existed before.
This is a formidable agenda, and I am not suggesting that anyone is close to realising it. Those who want to enter the race are probably now securing their content on an XML-based platform and beginning to buy into analytics software systems. And it was that latter point which so interested me about Argus. If the human race could descend from a tree shrew, then there is no reason at all why a smart data company close to London's fashionable Shoreditch tech zone should not become a lead player in the future structure of service solutions for the energy and commodities markets!
Jan 29
KISS – but don’t Tell
Filed Under B2B, Big Data, Blog, Financial services, healthcare, Industry Analysis, internet, mobile content, Publishing, Search, semantic web, STM, Thomson, Uncategorized, Workflow
"Keep it Simple, Stupid" was an acronym I brought home from the first management course I ever attended, yet it has taken me years to find out what it really means. There are, clearly, few things more complex than simplicity, and one man's "Simple" is another man's Higgs boson. So I was very energised to have a call last week from an information industry original who has been offering taxonomy and classification services to the information marketplace since 1983. When I first met Ross Leher in the late 1980s we were both wondering how far we would have to go into the 1990s before information providers recognized that they needed high-quality metadata to make their content discoverable in a networked world. Ross had sold his camera shop to take the long bet on this, and he worked at his new cause with near-religious conviction, as I realised when I went to see him in the 1990s at his base in Denver, Colorado. Denver at that time was home to IHS, whose key product involved researching regulatory material from a morass of US government grey literature. Denver people did metadata. It was a revolution waiting to happen.
So when I heard his voice on the phone last week my first emotion was relief – that he had not simply given up and retired to Florida – and then agreement. Yes, we were 15 years too early. And many of the people we thought were primary customers – the Yellow Pages companies, the phone books, the industrial directories – are now either dead or dying, or in the trauma of complete technological makeover. Ross's company, WAND Inc (www.wandinc.com), is now very widely acknowledged as a market-leading player in horizontal and multi-lingual taxonomy and classification development. They are the player you go to if you have content to classify, if you are in a cross-over area between disciplines (he has a great case study around taxonomies for medical image libraries), or if you have real language problems ("make this search work just as effectively in Japanese and Spanish"). What they do is really simple.
Your taxonomy requirement is going to start with broad terms that define your content and its area of activity. These can then be narrowed and specified to give additional granularity in any particular field. These classifications can be incorporated into the WAND Preferred Term Code set, given a number, and used in a programmatic, automated way to classify and mark up your content (www.datafacet.com). Preferred terms can be matched to synonyms, and the codes can be used to extend the process across very many different languages. So a company listed in Spanish, for example, can be found in the same result set as a Japanese outfit, returned by a search made by a Chinese user working in Chinese.
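The mechanics can be pictured in a few lines of code. This is an illustrative sketch only: WAND's actual Preferred Term Codes, data model and tooling are proprietary, and the code number, terms and labels below are invented for demonstration.

```python
# One numeric code per concept; the preferred term, its synonyms and its
# language-specific labels all resolve to that code, so a search matches
# the same concept regardless of the language it is made in.
TAXONOMY = {
    10432: {
        "preferred": "Camera Shop",
        "synonyms": {"camera store", "photography shop"},
        "labels": {"es": "tienda de cámaras", "ja": "カメラ店"},
    },
}

def lookup_code(query):
    """Resolve a term in any supported language to its concept code."""
    q = query.strip().lower()
    for code, entry in TAXONOMY.items():
        terms = {entry["preferred"].lower()}
        terms |= {s.lower() for s in entry["synonyms"]}
        terms |= {label.lower() for label in entry["labels"].values()}
        if q in terms:
            return code
    return None  # unknown term: no code assigned

# A Spanish query and an English query resolve to the same concept code:
assert lookup_code("tienda de cámaras") == lookup_code("camera shop") == 10432
```

Once every record carries a code rather than a free-text label, cross-language search reduces to an integer comparison, which is the whole point of the scheme.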
And from synonyms we can extend the process to related terms, and then map the WAND system to third-party schemes – think of UNSPSC, Harmonized System codes or NAICS, as well as those superficial and now dwindling Yellow Pages classifications. WAND can isolate and list attributes for a term, and can then add brand information. All of these activities add value to commoditized data, and one would think that the newspaper industry at least would have been deep into this for 15 years. Yet few examples exist which demonstrate it – Factiva is an honourable exception.
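A crosswalk to third-party schemes is, structurally, just another lookup keyed on the same internal code. The sketch below is hypothetical: the internal code 10432 is invented, and the NAICS and UNSPSC values are plausible examples only, which would need checking against the current published tables in any real mapping.

```python
# Map an internal preferred-term code to codes in external schemes.
CROSSWALK = {
    10432: {"NAICS": "443142", "UNSPSC": "45121500"},
}

# Attributes and brand information can hang off the same internal code.
ATTRIBUTES = {
    10432: {"sells_used_equipment": True, "offers_repairs": True},
}

def external_code(term_code, scheme):
    """Return the third-party classification code for an internal term code."""
    return CROSSWALK.get(term_code, {}).get(scheme)

print(external_code(10432, "NAICS"))   # prints 443142
print(external_code(10432, "UNSPSC"))  # prints 45121500
```

Because the internal code is the pivot, adding a new external scheme means maintaining one more column in the crosswalk, not re-tagging the content.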
Not the least interesting part of Ross's account of the past few years was the interest now shown by major enterprise software and systems players in this field. Reports from a variety of sources (IDC, Gartner) have highlighted the time wasted in internal corporate search. Both Oracle and Microsoft have metadata initiatives relevant to this, and it still seems to me more likely that Big Software will see the point before the content industry itself does. With major players like Thomson Reuters (OpenCalais) deeply concerned with mark-up, there are signs that an awareness of the role of taxonomy is almost in place; but as the major enterprise systems players bump and grunt competitively with the major, but much smaller, information services and solutions players, I think this is going to be one of the key competitive areas.
And there is a danger here. As we talk more and more about Big Data and analytics, we tend to forget that we cannot discard all sense of the component added value of our own information. We know that our content is becoming commoditized, but that is not improved by ignoring now-conventional ways of adding value to it. We also know that the lower and more generalized species of metadata are becoming commoditized; look, for instance, at the recent Thomson Reuters agreement with the European Commission to widen the ability of its competitors to use its RICs, the Reuters Instrument Codes that identify listed equities. This sort of thing means that, as with content, we shall be forced to increase the value we add through metadata in order to maintain our hold on the metadata – and content – which we own.
And, one day, the only thing worth owning – because it is the only thing people search, and it produces most of the answers that people want – will be the metadata itself. When that sort of sophisticated metadata is plugged into commercial workflow, and most discovery is machine-to-machine rather than person-to-machine, we shall have entered a new information age. Just let us not forget what people like Ross Leher did to get us there.