Blog

Intranet Now Diamond Award 2016 – open for nominations

Having just written a history of intranets, I am aware of how many people have made significant contributions to the development and adoption of intranets and yet have never been recognized. If you have never heard of Jennifer Stone Gonzalez and Steve Tellen (for example), they will feature in my presentation at Intranet Now. It is a pity that they will not be in the audience. Towards the end of the conference on 30 September the winner of the Intranet Now Diamond Award will be announced, and Sam Marshall will have the pleasure of handing over a very heavy (but ever so elegant!) chunk of glass.

The Intranet Now Diamond Award is unique in that it is awarded to an individual for their remarkable contribution to the community at large. Wedge and Brian are now seeking recommendations for the 2016 Award. There are of course many intranet managers who have made a significant impact on their organisation, often as a team of one. However, they are looking for someone who is committed to raising awareness of good intranet practice amongst the wider intranet community in the UK. It’s not as if Wedge and Brian do not know potential candidates, but they are firm believers in the wisdom of crowds and would like to know who you respect as an intranet guru.

Both Steve and Jennifer would have been worthy winners in 1993 and 1998, but that’s a story for my presentation. So could you look through the list of people you follow on Twitter and the blogs you monitor? Please bear in mind that Wedge and Brian are looking for someone who participates in our community, not someone who just observes it. To me there is one obvious candidate… I wonder if you agree? Information on how to nominate someone is on the Intranet Now site.

Martin White


Recommind finds an open door at OpenText

The announcement that Recommind had sold out to OpenText does not seem to have raised much in the way of comment. Both companies go back some way into the timeline of information discovery. OpenText started life at the University of Waterloo as the outcome of a project to digitize the Oxford English Dictionary. I worked on this project in 1984 when at Reed Publishing, as readers of the 1st edition of Enterprise Search will be aware. The company now has revenues of around $2 billion, and the presentations to the May 12 Investor Day make interesting reading, if only because the word ‘search’ does not appear at all! Along the way OpenText has acquired 53 companies, many of which at the time were positioned as the next big thing to happen to the market. RedDot and Vignette come to mind. In the distant past OpenText also acquired Basis, a seriously good search application developed by a team at Battelle in the early 1970s, through its purchase of Information Dimensions.

Recommind is also the outcome of a research project, in this case the development of Probabilistic Latent Semantic Analysis by Thomas Hofmann, though the original paper in 1999 referred to it as Probabilistic Latent Semantic Indexing. It is one of a number of probabilistic topic approaches to information discovery. Although the technology is very different to that used by Autonomy, there is a common interest in finding patterns in text which go beyond what are often described as ‘simple keyword approaches’, even if ‘simple’ is a substantial misnomer. In the new world of open source search Recommind is just about as proprietary as you can get, and that makes for some problems in trying to optimise performance. Where Recommind has made a particular mark on e-Discovery is in the area of predictive coding for the analysis of texts submitted in legal cases. This has been widely adopted in the USA and is now recognised in the UK, which could have been a catalyst for the acquisition given the importance of the UK market to OpenText. The e-Discovery market is highly competitive, with kCura, FTI Technology, Nuix (of Panama Papers fame), ZyLAB and HP alongside Recommind in the Leaders quadrant of the 2015 Gartner e-Discovery Magic Quadrant.
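
For those who have not come across the approach, the core of Hofmann’s published model (my summary, not a description of Recommind’s implementation) is a mixture over latent topics z, so that the probability of seeing word w in document d is decomposed as

    P(d, w) = P(d) \sum_{z} P(w \mid z) \, P(z \mid d)

with the distributions typically estimated by an EM algorithm. It is the latent topics that allow matching to go beyond exact keyword overlap.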

OpenText has acquired Recommind for $163 million, which at (as a guess) 20 times earnings puts the company at $8 million earnings on $80 million revenues. For a $2 billion company this is not a big buy. For comparison, OpenText acquired Vignette for $310 million in 2009. What happens next is anyone’s guess, and that probably goes for the sales teams at both OpenText and Recommind given the OpenText track record. Because it is a small unit ($80 million/$2 billion) I can’t see it being retained as a stand-alone business after the closure of the transaction in 2017. Just how Recommind is going to fit into OpenText is not yet easy to work out, as Recommind has a range of information governance applications as well as the Decisiv search application. The key executives will stay around because they will have earn-out agreements, but other staff may well be brushing up their CVs. This could make life difficult for ongoing support for Recommind clients, as there is very little external expertise available from search system integrators. It will also be interesting to see what happens to the recent partnership between Recommind and BA Insight.

In an ideal world OpenText would be wise to capitalise on the innovations that pervade the Recommind technology and make wider use of it in other ECM applications. Somehow, based on the history of the 53 other acquisitions, I’m not going to hold my breath. Based on the Investor Day presentations ‘search’ is not a core element of OpenText strategy.

Martin White


Relevant Search – Doug Turnbull and John Berryman

The user requirement for a successful search is very easy to state: users want the items that are most relevant to their query to appear on the first page (or at worst the first two pages!) of results. Delivering this requirement is a far greater challenge than users and search managers imagine. The very fact that Relevant Search, written by Doug Turnbull and John Berryman, runs to over 330 pages gives an immediate illustration of the scope and scale of relevance management. I often use the metaphor of looking at an automobile engine. In principle we all know how the engine works, but when it doesn’t work to perfection all we can do is look at the collection of modules and wires and wonder just what we have to do to restore the performance. That’s when an engineer with plug-in diagnostic equipment is essential. They can not only spot the problem but also know the systems well enough to sort it out.

The reason for presenting this metaphor is that the authors have written this book for relevance engineers. This to me is a new job profile but one that I can immediately relate to. The book presents all that a relevance engineer requires to understand how to go about improving relevance, and this requires a good knowledge of information retrieval principles and also of how these principles are best translated into software code. I should state up front that the examples in the book show code for Elasticsearch and Solr open source software, but that should not be seen as limiting the book to open source implementations. Indeed, seeing the code will help the reader understand what is going on in any enterprise search application. After all, SharePoint 2010/2013 uses the same BM25 ranking model that is now in Lucene v6.
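
For readers who want to see what sits underneath that ranking model, the standard BM25 formulation (my summary, not an extract from the book) scores a document D against query terms q_1 … q_n as

    \mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot \frac{f(q_i, D)\,(k_1 + 1)}{f(q_i, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)}

where f(q_i, D) is the term frequency in the document, |D| is the document length, avgdl is the average document length in the index, and k_1 and b are tuning parameters (commonly around 1.2 and 0.75). Much of relevance engineering is about deciding which fields this score is computed over and how the per-field scores are combined.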

The eleven chapters in the book cover debugging a relevance problem, understanding the role of tokens, basic multi-field search, term-centric search, shaping the relevance function, providing relevance feedback, designing a relevance-focused search application, the relevance-centered enterprise and advanced search techniques. There is no other book that I know of that manages to integrate both information retrieval and search management so successfully, with just enough IR fundamentals to show the origin of a relevance problem and the basis for a solution which can be expressed in code. I especially value the way in which the examples are based on a ‘real’ collection of information, The Movie Database. Since we all have some familiarity with movies, this for me makes the book come alive.
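
To give a flavour of what working at the code level looks like, here is a minimal sketch of my own (the index name and field weights are invented for illustration; it is not an example taken from the book) of a multi-field query against a hypothetical Elasticsearch index of movie documents:

    import json
    import requests  # any HTTP client will do; the official Elasticsearch client works equally well

    # A multi_match query across two fields, boosting matches in the title
    # ten times more heavily than matches in the longer overview text.
    query = {
        "query": {
            "multi_match": {
                "query": "star trek",
                "fields": ["title^10", "overview"],
            }
        }
    }

    # Assumes a local Elasticsearch node with a 'movies' index.
    resp = requests.post("http://localhost:9200/movies/_search",
                         headers={"Content-Type": "application/json"},
                         data=json.dumps(query))

    for hit in resp.json()["hits"]["hits"]:
        print(hit["_score"], hit["_source"]["title"])

Adjusting the ^10 boost, or changing how the fields are combined, is exactly the kind of small, testable change that the book treats as the day-to-day work of a relevance engineer.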

The quality of the content is not matched by the quality of the publishing format. This review is based on the e-book version. Although there is a list of sub-headings in the PDF version, the lack of an index makes it almost impossible to dip into the book to find an explanation of a feature or a solution to a problem. The writing style is very conversational, but this results in a lot of words with apostrophes, often where they are not needed. Overall the copy editing is patchy.

I cannot recommend this book strongly enough. It is certainly not just for ‘developers’. Search managers, and of course relevance engineers, need to appreciate the fundamentals of search technology and good practice in relevance management even if they are working with commercial applications. Students on computer and information science courses will also find it of great value and hopefully be inspired to follow a career in relevance engineering.  All I missed was a consideration of relevance management in federated search implementations, but I’m sure that the authors are saving this for the next edition.

Martin White


Intranet Content Migration – a guide to good practice

Intranets and content management system (CMS) applications both have service lifetimes of probably 4-5 years, although this can sometimes be extended with strong initial and ongoing implementation. Intranet teams will have the experience and expertise needed to develop an upgraded intranet on an existing CMS but will rarely have the experience to migrate to a new CMS, especially where there is a requirement to introduce a new information architecture, to reduce the amount and improve the quality of the content, and perhaps to implement a new search application. As a result, planning and executing an intranet content migration project becomes a very considerable challenge.

Intranet Content Migration is co-authored with David Hobbs, a leading authority on website content migration. As far as we are aware this is the first briefing paper to be published specifically on intranet migration. We have set out to present what in our experience is good practice for intranet content migration, based on some major projects we undertook individually and together in 2014 and 2015. Although the principles are similar to website content migration, there are a number of specific technical and governance challenges that need to be addressed. Particular attention is paid to the benefits of undertaking a comprehensive planning process ahead of the commencement of migration, focusing on a content inventory process that enables informed decisions to be made on the amount of content that needs to be migrated, and the extent to which this can be accomplished using content rules rather than a time-consuming inspection and migration of each content item.
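
As an illustration of what a content rule means in practice (the field names and thresholds below are invented for this post, not taken from the Research Note), a rule is typically just a test on inventory metadata, which allows thousands of items to be classified without opening any of them:

    from datetime import datetime, timedelta

    # A hypothetical inventory record, one per content item, exported from the current CMS.
    item = {
        "path": "/hr/policies/travel-policy-2011.docx",
        "last_modified": datetime(2012, 3, 1),
        "page_views_last_year": 4,
        "owner_active": False,
    }

    def migration_decision(item, today=None):
        """Return 'archive', 'review' or 'migrate' for a single inventory item.

        The thresholds are examples of the kind of rules a migration team
        might agree on; the real rules have to come from the content
        inventory and the organisation's retention policy.
        """
        today = today or datetime.now()
        age = today - item["last_modified"]

        if age > timedelta(days=3 * 365) and item["page_views_last_year"] < 10:
            return "archive"    # old and barely used: do not migrate
        if not item["owner_active"]:
            return "review"     # no current owner: needs a human decision
        return "migrate"        # everything else moves to the new CMS

    print(migration_decision(item))   # -> 'archive' for this example record

Running a rule set of this kind over the full inventory gives an early, defensible estimate of how much content actually has to be moved and reviewed by hand.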

Other topics covered in this Research Note are the importance of effective risk management, the need to work through the implications for the intranet search application, the requirement to have a well-designed and supported communications programme, and the importance of deciding how the progress of the migration will be reported. Appendices list a set of ten critical success factors and some additional resources on content migration.

Martin White


The Organisation in the Digital Age – 2016 survey now open

Much of my career has been in the B2B market research business, notably with International Data Corporation and then Logica. The IT sector has always been awash with research reports from vendors seeking to justify their market position and pricing, as well as from many boutique companies offering high-quality research in a small sector. The value of the IDC and Logica services was that each year they used the same core methodology to highlight trends in market growth over a five-year period and yet included questions in the survey which took account of recent developments. It was hard work.

All the more remarkable then that this year Jane McConnell is working solo on the 10th of her annual surveys, which started out with intranets and now assess the extent to which organisations are making a commitment to working digitally. This year the survey for The Organisation in the Digital Age report is in two parts. The Core part (59 questions), streamlined from previous years, takes approximately 30 minutes. The optional Extended part (37 questions) is for organizations that want to do a deeper dive into their digital transformation. All participants receive a copy of the final report, The Organization in the Digital Age 2016 (Core or Extended), as well as the Scorecard for their organization, which is optional and free.

The innovations this year are a customised snapshot report and sponsorship opportunities for research supporters. The snapshot report is available to organisations that are able to arrange for six or more people to complete the survey. They receive a 3-page summary of the consolidated results providing a snapshot from different viewpoints: functions, business lines or countries, depending on the role of the respondents. This year vendors, digital agencies, technology and service providers, and others can participate as a Research Supporter through a sponsorship package. This brings visibility in the report, and a chance to communicate their messages to a high-potential audience.

Although the benefits to organisations of having a global perspective on digital workplace adoption are significant, I know that many organisations welcome the opportunity to use the survey as a means of bringing together their digital leaders to exchange views on how adoption is taking place in specific departments and divisions. Even if the team only spends a morning together to complete the survey, the near-term and long-term benefits will be substantial. I have seen too many organisations in which digital innovations are their best-kept secrets! The publication later this year of both this survey and the Findwise Findability Survey will once again provide us with dependable insights into the level of commitment to digital working that can be used in planning for 2017 and beyond.

Martin White


Defining and managing information quality

For the last three years I have been supporting major projects that involve content migration and enterprise search. A primary objective of both migration and search is ‘to improve information quality’, but in the projects I have been involved with little attention has been paid to defining the parameters of information quality and putting in place policies and processes to improve quality. The reason for not doing so is that the staff resources required are significant, and because there is no corporate commitment by the organisation to information quality it is all but impossible to gain the support required to at least start the journey towards information quality improvement. It is indeed a journey; there are no quick fixes.

In general organisations seem unaware of the significant amount of work that has been undertaken on defining information quality standards and guidelines, dating back to pioneering work at MIT in the early 1990s that recognized information had to be fit for purpose and not just ‘accurate’. A very good resource on the development of information quality management is a book entitled The Philosophy of Information Quality, published by Springer in 2014. This book is a collection of contributions on all aspects of data and information quality, edited by Luciano Floridi and Phyllis Illari. The quality of the contributions is very high, but for some unaccountable reason there is no index to the book. Springer clearly does not have a commitment to information quality! A similar book on Data and Information Quality is about to be published by Springer, and it will be interesting to see if an index is provided. There is an earlier book on Managing Information Quality from Springer, published in 2006.

MIT remains at the heart of information quality management. It organises an annual conference, which in 2016 takes place in Spain on 22-23 June. The papers from previous conferences can be downloaded from the conference archive. The International Association for Information and Data Quality (IAIDQ) also organises an annual conference. It should be noted that in the context of work on information quality there is no differentiation between data and information, though there are initiatives, notably around ISO 8000-2011, where the emphasis is on master data management. The Association for Computing Machinery (ACM) publishes the Journal of Data and Information Quality, but access is limited to ACM members. A good overview of the challenges of managing information as an enterprise asset (pdf download) is provided by Nina Evans and James Price, based in Australia.

The purpose of this post is to summarise some of the resources that are available in the area of information quality management. As I have mentioned above there are no quick fixes but information professionals should certainly ensure that they are aware of the substantial amount of work that has been published and is currently being undertaken.

Martin White


Enterprise search management as a ‘wicked problem’

In 1973 Horst Rittel and Melvin Webber authored a paper entitled ‘Dilemmas in a General Theory of Planning’ (Policy Sciences 4 (1973), 155-169). In this paper they set out the basis for what they regarded as ‘wicked problems’, which were beyond the capacity of traditional methods to resolve. In particular, wicked problems cannot be addressed by a linear project management methodology because of the multi-dimensional nature of the problems that need to be resolved. Over the last few years a design thinking approach has been used with some success. Design thinking in management is a creative process, in which, after gathering information (often through ethnographic techniques), the manager approaches problems by imagining possible solutions, rather than analysing the existing issue reductively. A key element in resolving wicked problems is that the leader’s role is to ask questions in order to help define the complexity of the problem facing the organisation and create the conditions for ‘collective responsibility’ in addressing it, rather than the traditional expectation that they will offer a solution.

All too often I find that organisations are treating enterprise search as a project. At the end of the project the team is dispersed, and whatever quality was there at launch gradually fades away. The complexity of the workflow between the content being indexed and then found is rarely appreciated. If it doesn’t meet requirements then it must be the technology! In my experience that is very rarely the case.

I have created a table that looks at enterprise search as a ‘wicked problem’. Looking at the 16 elements of a wicked problem shows that traditional waterfall or even agile project approaches are totally unsuited to enterprise search applications. The requirement is to work as a team across multiple elements of an enterprise search implementation, with a leader who has the experience to challenge and then work with the team to resolve each element. Even then there is a high probability that not all the elements can be resolved, which is why enterprise search applications need to be well supported by a search team after a nominal implementation. Earlier this week I was talking with Darron Chapman at CB Resourcing, one of the most experienced recruitment consultants in the information and knowledge management sectors here in the UK. We agreed that the demand for experienced search managers was well in excess of supply and that salary requirements were very much on the high side. Organisations are now recognising that enterprise search is indeed a wicked problem and there are just not enough people around to solve all the problems. That raises another problem: where can people get a thorough training in enterprise search that is vendor-neutral and covers both commercial and open source applications?

Martin White


Organisation culture – what do the ‘buzz words’ actually mean?

Organisations like to embroider their internal and external communications with statements about their corporate culture and direction. “Unparalleled expertise across our wide range of solutions” comes from Gartner, just as an example to hand. So just what does ‘unparalleled expertise’ mean, and how might it translate into other languages? In French ‘une expertise inégalée’ is close but not a strict translation. Last year I was working with a company with its headquarters in London but major offices around Europe and Asia. A substantial acquisition had taken place a couple of years prior to my engagement, and now that the dust had settled the communications team had decided that it was time for a new corporate message to be promoted. The team decided that the core term was the ‘bold’ steps that the company was taking. I had occasion to speak to several senior directors in Germany who were very upset by this decision, as the English concept of bold does not have a single direct German equivalent. The German words fett, mutig, kühn, fettgedruckt, dreist and verwegen are all close but convey slightly different concepts.

Things get more complicated in companies headquartered in countries which do not have English as the local language. Multinational companies often use English as a lingua franca (ELF) but when it comes to abstract concepts like ‘bold’, ‘leading edge’ and ‘visionary’ should the words emerge from a discussion in the HQ national language or through a discussion in ELF? I’ve just been reading a very interesting case study of how a Norwegian company set about defining its corporate values, taking into account that it had subsidiaries in 10 countries. One of these countries was China and the case study has some very interesting quotes from both Norwegian and Chinese managers about the issue of communicating corporate values.

Intranet managers in multi-national companies would do well to read this case study, as it has implications for the extent to which ELF corporate values guidelines need to be carefully translated into other languages. In the case of the Norwegian company translations were made into German and Chinese for local purposes, but not into Norwegian because the company wanted to make a statement about its adoption of ELF even in Norway. For example Norwegian managers were not allowed to exchange emails in Norwegian with Norwegian colleagues working in overseas subsidiaries.

A conclusion from the case study is that multi-national companies should not develop culture statements in English and then rely on a translation into other languages. There should be a discussion with people speaking all the national languages present in the company (many of which HQ may well not be aware of!) so that the words selected can be rendered in these languages in a way that supports, rather than possibly negates, the corporate direction. Even if there is a close translation the very fact that the decision on the values was made by people speaking English as their mother language may send the wrong signals to a linguistically diverse workforce.

Martin White


Life inside a search lab – London 6 April

You can tell that when politicians walk around a laboratory they probably have no idea of what life is like inside one. I still have happy memories of discovering the solvent power of methylene dichloride and what happens when ether escapes from a leaky joint in a distillation retort. I’ve had a number of inquiries about what my Search Lab workshop is going to be like in April. Think of it as Applied Schadenfreude, a wonderful German word that means gaining pleasure from other people’s misfortune. Of course inside the enterprise this misfortune is difficult to assess. Have you noticed how rarely you ever see an intranet search application actually demonstrated at a conference, or even website search?

The moment we play with website search we usually become very aware of the misfortune of site visitors. But it’s one thing to see a poor search implementation and another to understand how to fix it. The purpose of the Intranet Now Search Lab in London on 6 April is to provide you with a framework for assessing the quality of search performance and to use it in action on a range of websites. To make the workshop work well we will have a number of PCs available so that you can work in small groups and then report back – much better than doing it all on a single big screen. Doing search hands-on enables you to try out various query and filtering options and build up a set of ‘search good practice’ notes for both your website and your intranet/enterprise search applications. I have a wonderful Black Museum of search implementations which defeat all my attempts to understand what the designers, developers and managers thought they were trying to achieve. It is rather like watching the adverts on television and trying to guess what the ‘creative’ conference must have been like to have come up with such a strange approach to customer communications. I also have some very good examples to demonstrate. Of course the ideal approach is to offer up your own website for a trial run.

The Search Lab is run under the Chatham House Rule, so nothing said at the meeting can ever be attributed to a participant and your secrets will still be secret forever after the event. What else we will cover in the workshop is up to you. I started in search in 1975 and in 1980 worked with Unilever on the development of the first UK enterprise search application. So as well as the fun of playing with search you can also have fun in trying to stump the consultant. If you do, that’s fine with me – I want to learn from the workshop and find out where my search weaknesses are. I’ll also bring along a collection of books on search, and there will be a discount offer on my own book for participants.

Even if your own search application works perfectly please consider bringing the event to the attention of less fortunate colleagues in other organisations. They will be very grateful to you.

Martin White


Searching and Stopping: An Analysis of Stopping Rules and Strategies

During the search process, searchers need to decide when they should abandon the current query (and perhaps issue a new query after examining the current results list), and when to curtail their search by stopping the search session altogether. Knowing when to stop is considered a fundamental aspect of human behaviour. Stop too early, and important information may be missed. Stop too late, and time and effort are wasted. Worse still, the examination of fruitless result lists will mean not having time to examine other lists which may potentially contain greater yields for the searcher. David Maxwell and Leif Azzopardi (School of Computing Science, University of Glasgow, UK) and Kalervo Järvelin and Heikki Keskustalo (School of Information Sciences, University of Tampere, Finland) have published a fascinating research paper on this topic. The four authors are in the very top echelon of IR research, so what they have to say should be taken very seriously.

Two of the earliest stopping rules were proposed by W.S. Cooper in 1973 (search goes back a long way!):

  • the frustration point rule, where a searcher stops after examining a certain number of non-relevant documents; and
  • the satisfaction stopping rule, where searchers would stop only when a certain number of relevant documents had been found.

The frustration point rule is especially interesting. The authors define it as counting the number of non-relevant documents seen in the ranked list at position k; if the total number of non-relevant documents exceeds a given threshold, the searcher stops. So we might have a personal rule that if we have got to the third page of results (say k = 30) and found few if any relevant results, we give up in frustration. Our time is too precious and we may, or may not, start again. I will do the authors a great disservice by jumping to the end of their paper, but the main outcomes are that the two most common stopping strategies are the following (both are sketched in code below):

  • Fixed Depth. Under this stopping strategy, the simulated searcher will stop once they have observed a self-defined number of result snippets, regardless of their relevance to the given topic.
  • Contiguous Non-Relevant. The searcher will stop once they have observed (say) 5 non-relevant snippets in a row (contiguously).
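
Here is a minimal sketch of my own (not code from the paper) showing how the frustration point rule and the two strategies above could be expressed, given a ranked list in which each snippet has already been judged relevant or not:

    def frustration_point(judgements, threshold):
        """Cooper's frustration point rule: stop once the total number of
        non-relevant snippets seen exceeds the threshold."""
        non_relevant = 0
        for position, relevant in enumerate(judgements, start=1):
            if not relevant:
                non_relevant += 1
            if non_relevant > threshold:
                return position          # stop at this rank
        return len(judgements)           # examined the whole list

    def fixed_depth(judgements, depth):
        """Stop after a self-defined number of snippets, regardless of relevance."""
        return min(depth, len(judgements))

    def contiguous_non_relevant(judgements, run_length):
        """Stop once a run of consecutive non-relevant snippets reaches run_length."""
        run = 0
        for position, relevant in enumerate(judgements, start=1):
            run = 0 if relevant else run + 1
            if run == run_length:
                return position
        return len(judgements)

    # True = relevant snippet, False = non-relevant snippet, in rank order.
    judgements = [True, False, False, True, False, False, False, True, False, False]
    print(frustration_point(judgements, threshold=4))           # stops at rank 7
    print(fixed_depth(judgements, depth=5))                     # stops at rank 5
    print(contiguous_non_relevant(judgements, run_length=3))    # stops at rank 7

The interesting question, of course, is what values of the threshold, depth and run length real searchers actually internalise.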

The authors caution that a great deal more work needs to be done to understand these behaviours. However, the research indicates that we may need to rethink our approaches to evaluating search success and search failure, at least taking into account search users’ strategies, which may be internalised and pragmatic rather than just a function of relevance. It would seem to put a priority on precision over recall, but that might be taking the research too far. It would also have an impact on session time. Indeed it might be interesting to look at the variance of session times for search users and see if there are any patterns. I should note that this paper was given at the ACM CIKM’15 conference held in Melbourne, Australia, on 19-23 October 2015, and so is only available to ACM members unless you are willing to pay an access fee.

Martin White