“Well,” she said, in a determined and slightly defensive tone of voice, “the last thing we intend to do is turn ourselves into a software house. We are publishers and cannot be expected to understand software, which is such a terrible distraction.” I took my correction silently and philosophically. After all, this has been a litany I have heard for a decade. And the paradox is that the more that software controls and modulates the way publishers create content, and the more it dominates the way in which users view it, the more permissible it has become for very senior executives in all sorts of places that do “publishing” (rapidly becoming a meaningless term) to proclaim with pride their ignorance of some of the basics. Some tell me “that’s what we have a CTO for”, while others tell me that it is not a creative area (yes, really!!) of their business. The largest decisions a modern information industry CEO will make concern software. The delivery-critical decisions a publisher of romantic novels will make mainly concern software. We cannot avoid it – so surely every senior executive should know enough to intelligently quiz the CTO, outside suppliers and potential alliance partners?

You have now been reading for about 60 seconds. During that time, the software that holds us all in place in the network has been mightily engaged. 168 million emails were sent during that time. 694,445 Google searches took place. 320 new Twitter accounts were opened and 58,000 new tweets were posted. 600 new videos were posted – I could go on (courtesy of Go-Globe.com) but I hope you get the message. If people who run businesses they call “publishing” do not understand the platform upon which they stand, as they once understood the possibilities of print, then what hope is there for the traditional end of the market? I meet very many CEOs in the course of a working year, and in every 10 there are three who are brilliant on the bedrock technologies that drive their businesses. There are three more who struggle but know it is important. After that come four who do not really see it at all, and this group is strongest in the areas of greatest current risk – consumer book publishing, magazines and events. I almost feel as if there should be a test: Differentiate HTML5 from XML and suggest how you would use each. Distinguish relational (RDBMS) databases from NoSQL databases and explain the advantages of the latter over the former. What does EPUB3 allow you to do that you could not do before? How would you use the Cloud to support a reduction in Capex in your development programme? What is Open Source and is it cheaper or more expensive than proprietary software? What is entity extraction and how do I use the semantic web to add value? What is SaaS and how does it create scope for your expansion?

Readers here would doubtless have no difficulty with any of this. But still, we all – me especially – need a jolt of recognition of the speed of change at the moment. Venture Capital has a current investment of some $16 billion in SaaS software alone, with about half in business functionality (FactSet). The global software industry will top $1.1 trillion in 2016 (Gartner) – for comparison the current sizing of the publishing and information marketplace is $400 billion (Outsell). Gartner see media and communications as a key area for the software businesses, with 2012’s spend by the sector of $61 billion rising to $78 billion in 2016. Some $21 billion of this spend ($25 billion in 2016) is for media-specialized applications. This is the top-ranked vertical sector for software in 2016. My argument at the moment would be that this spend is resting on the shoulders of a very small, technically capable group of senior buyers. It is time now for a better-informed cadre of senior management to help to bear this burden, and for boards to have a generally better-informed decision-making discussion, one which is capable of putting the view of the CTO and the professional advisers and evaluators into the medium-term context of the business.
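For anyone who wants to check the arithmetic behind those Gartner figures, here is a minimal sketch in Python. It assumes that the 2012 and 2016 numbers quoted above are comparable annual totals and that growth compounds once a year; the function name is my own, invented purely for illustration.

```python
def implied_annual_growth(start_value, end_value, years):
    """Compound annual growth rate implied by two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures in $ billions as quoted above (Gartner); 2012 to 2016 is four years.
sector_growth = implied_annual_growth(61, 78, 4)
media_apps_growth = implied_annual_growth(21, 25, 4)

print(f"Media and communications software spend: {sector_growth:.1%} a year")
print(f"Media-specialized applications: {media_apps_growth:.1%} a year")
```

On those assumptions the sector's software spend grows at a little over 6% a year, and the media-specialized slice at around 4.5% a year.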

Finally, let’s just look at one section of the information waterfront. This week I noted with interest the acquisition of a company called Edaboard.com by Design World (WTWH Media LLC). This brings together a leading brand which provides information services on design engineering with a community-driven forum focussed on electrical engineering topics. We have seen deals like this, and will see a lot more. But the decisions that come next – one platform or two, in-house or outsource, service integration, Cloud-based services etc. – will be critical to the success of this investment. I do not know Design World and I have no reason to believe that they are not fully capable of doing the job, but in many corporations of my acquaintance these decisions would be taken by a small coterie of tech-savvy operators, with some of the most senior people acting in faith and trust that someone else had made the right tech bets.

In a great allusion, Mike Olson of Cloudera remarked that “We are living through a Cambrian moment in database history”. As the Age of Data morphs into Data Science, we all struggle to keep up. As major data concentrations meet the Cloud, and we have to work with PaaS (platform as a service) and DaaS (database as a service), it is not going to get slower or easier. But it is clear that the boundaries around “what we need to know to do the job” have radically changed as well. None of us should be frightened about going back to school!

Phil Cotter’s comment on last week’s post here really got me going. Now that I know that suicide bombers max their credit cards before setting off to do the deed I somehow feel a gathering sympathy for the security services. So the starting point is 5 million up-to-the-limit cards? We need to funnel cash into predictive analytics urgently if anything we do is to show better results than airport security (to begin from a very low measure indeed). So I began to look for guidelines in the use and development of predictive analytics, thinking that while we wait for terrorist solutions we might at least get a better handle on marketing. I am surprised and impressed by how much good thinking there is available, so in the spirit of a series of blogs last year (Big Data: Six of the Best) here are some starting points on innovative analytics players who all have resonance for those of us who work in publishing, information and media markets. And a warning: the specialized media in these fields all seem to have lists of favoured start-ups entitled “50 Best players in Data Analytics”, so I am guilty of scratching lightly at the start-up surface here.

In the same spirit of self-denial that drives me to abstain from a love of eating croissants for breakfast, I have also decided to stop using the expression “B** D***”. I am so depressed by publishers asking what it means, and then finding that, because of “definition creep” or “meaning drift”, I have defined it differently from everyone else, including my own last attempted definition, that I am going to cease the usage until the term dies a natural death, or gets limited to one sphere of activity. So Data Analytics is my new string bag, and Predictive Analytics is the first field of relevant activity to be placed inside it. Or do I mean Predictive behaviour analytics?

I was very impressed by analysts studying our use of electricity (http://www.datasciencecentral.com/profiles/blogs/want-to-predict-human-behavior-use-these-6-lessons-based-on-data-), since the work throws up some lessons which we should bear in mind as we push predictive analytics into advertising and marketing. The thought that it is easier to influence human populations through peer pressure and an appeal to altruism than through offers of “two for one”, cash bonuses and discounts is clearly true, yet our behaviour in marketing and advertising demonstrates that we behave as if the opposite were the case. The emphasis on knowing the industry context – all analytics are contextualised – and the thought that, even today, we tend to try to make the analysis work on insufficient data, are both notions that ring true for me. We need as well to develop some scientific rigour around this type of work, using good scientific method to develop and disprove working hypotheses. Discerning the signal from the noise and “never stop improving” are both vital, as well as being hard to do. I ended this investigation thinking that even as the science was young, the attitudes of users as customers were even more immature. If we are to get good results we have to school ourselves to ask the right questions – and know which of our expectations are least likely to be met.

Which brings me to the people we should be asking. Amongst the sites and companies that I looked at, many were devoted from differing angles to marketing and advertising. But many took such differing approaches that you could imagine using several in different but aligned contexts. Take a look for example at DataSift (www.datasift.com). It now claims some 70% accuracy (this is a high number) in sentiment tracking, creating an effective toolset for interpreting social data. Here is the answer to those many publishers in the last year who have asked me “what is social media data for, once you have harvested it?” Yet this is completely different from something like SumAll (https://sumall.com), which is a marketeer’s toolset for data visualization, enabling users to detect and display the patterns that analysis creates in the data. Then again, marketing people will find MapR (www.mapr.com) fascinating, as a set of tools to support pricing decisions and develop customer experience analytics. Over at Rocket Fuel Inc (www.rocketfuel.com) you can see artificial intelligence being applied to digital advertising. As a great believer in sponsorship, I found their Sponsorship Booster modelling impressive. This player in predictive modelling has venture capital support from a range of players, from Summit to Nokia.

When the data is flowing in real time, different analytical tools are called for, and MemSQL (www.memsql.com) has customers as diverse as Zynga, Credit Suisse and Morgan Stanley to prove it. Zoomdata (www.zoomdata.com) is a wonderful contextualization environment allowing users to connect data, stream it, visualize it and give end-user access to it – on the fly. This is technology which really could have a transformative effect on the way that you interface your content to end users, and you can demo it on the Data Palette on the site. And finally, do you have enough of the right data? Or does some government office somewhere have data that could immensely improve your results? Check it on Enigma (press.enigma.io), the self-styled “Google of Public Data”, a discovery tool which could radically change product offerings throughout the industry. Perhaps it is significant that the New York Times is an investor here.

So, for the publisher who has built the platform and integrated search, and perhaps begun to develop some custom tools, there is a very heartening message in all of this. A prolific toolset industry is growing up around you at enormous pace, and if these seven culled from the data industry long lists are anything to judge by, the move from commoditized data, increasingly free on the network, to higher levels of value add which improve customer retention and enhance brand is well within our grasp.
