Relevant Search – Doug Turnbull and John Berryman

The user requirement for a successful search is easy to state: they want the items most relevant to their query to appear on the first page (or at worst the first two pages!) of results. Meeting this requirement is a far greater challenge than users and search managers imagine. The very fact that Relevant Search, written by Doug Turnbull and John Berryman, runs to over 330 pages gives an immediate illustration of the scope and scale of relevance management. I often use the metaphor of looking at an automobile engine. In principle we all know how the engine works, but when it doesn’t perform to perfection all we can do is look at the collection of modules and wires and wonder just what we have to do to restore the performance. That’s when an engineer with plug-in diagnostic equipment is essential. They can not only spot the problem but also know the system well enough to fix it.

The reason for presenting this metaphor is that the authors have written this book for relevance engineers. This is a new job profile to me, but one that I can immediately relate to. The book presents everything a relevance engineer requires to understand how to go about improving relevance, and this requires a good knowledge of information retrieval principles and of how these principles are best translated into software code. I should state up front that the examples in the book show code for Elasticsearch or Solr open source software, but that should not be seen as limiting the book to open source implementations. Indeed, seeing the code will help the reader understand what is going on in any enterprise search application. After all, SharePoint 2010/2013 uses the same BM25 ranking model that is now in Lucene v6.
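To give a flavour of what sits behind that ranking model, here is a minimal Python sketch of the BM25 scoring formula. This is my own illustration, not code from the book; the parameter values k1 = 1.2 and b = 0.75 are the commonly quoted Lucene defaults.

```python
import math


def bm25_score(tf, df, N, dl, avgdl, k1=1.2, b=0.75):
    """Score one query term in one document with BM25.

    tf    - term frequency in the document
    df    - number of documents containing the term
    N     - total number of documents in the index
    dl    - length of this document (in terms)
    avgdl - average document length in the collection
    """
    # Rarer terms carry more weight (inverse document frequency).
    idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
    # Term frequency saturates, and long documents are penalised.
    norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))
    return idf * norm


# Length normalisation in action: the same term frequency scores
# higher in a short document than in a long one.
short_doc = bm25_score(tf=3, df=10, N=1000, dl=50, avgdl=100)
long_doc = bm25_score(tf=3, df=10, N=1000, dl=300, avgdl=100)
assert short_doc > long_doc
```

Seeing the formula written out like this makes it much easier to follow the book's discussion of why tuning relevance means tuning behaviours such as term saturation and length normalisation, not just the query.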

The eleven chapters in the book cover debugging a relevance problem, understanding the role of tokens, basic multi-field search, term-centric search, shaping the relevance function, providing relevance feedback, designing a relevance-focused search application, the relevance-centered enterprise and advanced search techniques. There is no other book that I know of that manages to integrate information retrieval and search management so successfully, with just enough IR fundamentals to show the origin of a relevance problem and the basis for a solution that can be expressed in code. I especially value the way in which the examples are based on a ‘real’ collection of information, the Movie Database. Since we are all familiar with movies, this makes the book come alive for me.

The quality of the content is not matched by the quality of the publishing format. This review is based on the e-book version. Although there is a list of sub-headings in the PDF version, the lack of an index makes it almost impossible to dip into the book to find an explanation of a feature or a solution to a problem. The writing style is very conversational, but this results in a lot of words with apostrophes, often where they are not needed. Overall the copy editing is patchy.

I cannot recommend this book strongly enough. It is certainly not just for ‘developers’. Search managers, and of course relevance engineers, need to appreciate the fundamentals of search technology and good practice in relevance management even if they are working with commercial applications. Students on computer and information science courses will also find it of great value and hopefully be inspired to follow a career in relevance engineering.  All I missed was a consideration of relevance management in federated search implementations, but I’m sure that the authors are saving this for the next edition.

Martin White

Intranet Content Migration – a guide to good practice

Intranets and content management software (CMS) applications both have service lifetimes of probably 4-5 years, although this can sometimes be extended with strong initial and ongoing implementation. Intranet teams will have the experience and expertise needed to develop an upgraded intranet on an existing CMS but will rarely have the experience to migrate to a new CMS, especially where there is a requirement to introduce a new information architecture, to reduce the amount and improve the quality of the content, and perhaps to implement a new search application. As a result, planning and executing an intranet content migration project becomes a very considerable challenge.

Intranet Content Migration is co-authored with David Hobbs, a leading authority on website content migration. As far as we are aware this is the first briefing paper to be published specifically on intranet migration. We have set out to present what in our experience is good practice for intranet content migration, based on some major projects we undertook individually and together in 2014 and 2015. Although the principles are similar to website content migration, there are a number of specific technical and governance challenges that need to be addressed. Particular attention is paid to the benefits of undertaking a comprehensive planning process ahead of the commencement of migration, focusing on a content inventory process that enables informed decisions to be made on the amount of content that needs to be migrated and the extent to which this can be accomplished using content rules rather than a time-consuming inspection and migration of each content item.

Other topics covered in this Research Note are the importance of effective risk management, the need to work through the implications for the search application for the intranet, the requirement to have a well-designed and supported communications programme and the importance of deciding how the progress of migration will be reported. Appendices list a set of ten critical success factors and some additional resources on content migration.

Martin White

The Organisation in the Digital Age – 2016 survey now open

Much of my career has been in the B2B market research business, notably with International Data Corporation and then Logica. The IT sector has always been awash with research reports from vendors seeking to justify their market position and pricing, as well as many boutique companies offering high quality research in a small sector. The value of the IDC and Logica services was that each year they used the same core methodology to highlight trends in market growth over a five year period and yet included questions in the survey which took account of recent developments. It was hard work.

All the more remarkable then that this year Jane McConnell is working solo on the 10th of her annual surveys, which started out with intranets and now assess the extent to which organisations are making a commitment to working digitally. This year the survey for The Organisation in the Digital Age report is in two parts. The Core part (59 questions), streamlined from previous years, takes approximately 30 minutes. The optional Extended part (37 questions) is for organisations that want to do a deeper dive into their digital transformation. All participants receive a copy of the final report The Organisation in the Digital Age 2016 (Core or Extended), as well as the Scorecard for their organisation, which is optional and free.

The innovations this year are a customised snapshot report and sponsorship opportunities for research supporters. The snapshot report is available to organisations that are able to arrange for six or more people to complete the survey. They receive a 3-page summary of the consolidated results providing a snapshot from different viewpoints: functions, business lines, or countries, depending on the roles of the respondents. This year vendors, digital agencies, technology and service providers, and others can participate as a Research Supporter through a sponsorship package. This brings visibility in the report, and a chance to communicate their messages to a high-potential audience.

Although the benefits to organisations of having a global perspective on digital workplace adoption are significant, I know that many organisations welcome the opportunity to use the survey as a means of bringing together their digital leaders to exchange views on how adoption is taking place in specific departments and divisions. Even if the team only spends a morning together to complete the survey, the near-term and long-term benefits will be substantial. I have seen too many organisations whose digital innovations are their best-kept secrets! The publication later this year of both this survey and the Findwise Findability Survey will once again provide us with dependable insights into the level of commitment to digital working that can be used in planning for 2017 and beyond.

Martin White


Defining and managing information quality

For the last three years I have been supporting major projects that involve content migration and enterprise search. A primary objective of both migration and search is ‘to improve information quality’, but in the projects I have been involved with, little attention has been paid to defining the parameters of information quality and putting in place policies and processes to improve quality. The reason is that the staff resources required are significant, and because there is no corporate commitment to information quality it is all but impossible to gain the support required to at least start the journey towards information quality improvement. It is indeed a journey; there are no quick fixes.

In general organisations seem unaware of the significant amount of work that has been undertaken on defining information quality standards and guidelines, dating back to pioneering work at MIT in the early 1990s that recognised information had to be fit for purpose and not just ‘accurate’. A very good resource on the development of information quality management is a book entitled The Philosophy of Information Quality, published by Springer in 2014. This book is a collection of contributions on all aspects of data and information quality, edited by Luciano Floridi and Phyllis Illari. The quality of the contributions is very high, but for some unaccountable reason there is no index to the book. Springer clearly does not have a commitment to information quality! A similar book on Data and Information Quality is about to be published by Springer, and it will be interesting to see if an index is provided. There is an earlier book on Managing Information Quality from Springer, published in 2006.

MIT remains at the heart of information quality management. It organises an annual conference, which in 2016 takes place in Spain on 22-23 June. The papers from previous conferences can be downloaded from the conference archive. The International Association for Information and Data Quality (IAIDQ) also organises an annual conference. It should be noted that in the context of work on information quality there is no differentiation between data and information, though there are initiatives, notably around ISO 8000-2011, where the emphasis is on master data management. The Association for Computing Machinery (ACM) publishes the Journal of Data and Information Quality, but access is limited to ACM members. A good overview of the challenges of managing information as an enterprise asset (pdf download) is provided by Nina Evans and James Price, based in Australia.

The purpose of this post is to summarise some of the resources that are available in the area of information quality management. As I have mentioned above there are no quick fixes but information professionals should certainly ensure that they are aware of the substantial amount of work that has been published and is currently being undertaken.

Martin White

Enterprise search management as a ‘wicked problem’

In 1973 Horst Rittel and Melvin Webber authored a paper entitled ‘Dilemmas in a General Theory of Planning’ (Policy Sciences 4 (1973), 155-169). In this paper they set out the basis for what they regarded as ‘wicked problems’, which were beyond the capacity of traditional methods to resolve. In particular, wicked problems cannot be addressed by a linear project management methodology because of the multi-dimensional nature of the problems that need to be resolved. Over the last few years a design thinking approach has been used with some success. Design thinking in management is a creative process in which, after gathering information (often through ethnographic techniques), the manager approaches problems by imagining possible solutions rather than analysing the existing issue reductively. A key element in resolving wicked problems is that the leader’s role is to ask questions in order to help define the complexity of the problem facing the organisation and to create conditions for ‘collective responsibility’ in addressing it, rather than the traditional expectation that they will offer a solution.

All too often I find that organisations are treating enterprise search as a project. At the end of the project the team is dispersed, and whatever quality was there at launch gradually fades away. The complexity of the workflow between content being indexed and content being found is rarely appreciated. If search doesn’t meet requirements then it must be the technology! In my experience that is very rarely the case.

I have created a table that looks at enterprise search as a ‘wicked problem’. Looking at the 16 elements of a wicked problem shows that traditional waterfall or even agile project approaches are totally unsuited to enterprise search applications. The requirement is to work as a team across multiple elements of an enterprise search implementation, with a leader who has the experience to challenge and then work with the team to resolve an element. Even then there is a high probability that not all the elements can be resolved, which is why enterprise search applications need to be well supported by a search team after a nominal implementation. Earlier this week I was talking with Darron Chapman at CBResourcing, one of the most experienced recruitment consultants in the information and knowledge management sectors here in the UK. We agreed that the demand for experienced search managers was well in excess of supply and that salary requirements were very much on the high side. Organisations are now recognising that enterprise search is indeed a wicked problem and there are just not enough people around to solve all the problems. That raises another problem – where can people get a thorough training in enterprise search that is vendor-neutral and covers both commercial and open source applications?

Martin White


Organisation culture – what do the ‘buzz words’ actually mean?

Organisations like to embroider their internal and external communications with statements about their corporate culture and direction. “Unparalleled expertise across our wide range of solutions” comes from Gartner, just as an example to hand. So just what does ‘unparalleled expertise’ mean, and how might it translate into other languages? In French ‘une expertise inégalée’ is close but not a strict translation. Last year I was working with a company with its headquarters in London but major offices around Europe and Asia. A substantial acquisition had taken place a couple of years prior to my engagement, and now that the dust had settled the communications team had decided that it was time for a new corporate message to be promoted. The team decided that the core term was the ‘bold’ steps that the company was taking. I had occasion to speak to several senior directors in Germany who were very upset by this decision, as the English concept of bold does not have a single direct German equivalent. The German words fett, mutig, kühn, fettgedruckt, dreist, and verwegen are all close but convey slightly different concepts.

Things get more complicated in companies headquartered in countries which do not have English as the local language. Multinational companies often use English as a lingua franca (ELF) but when it comes to abstract concepts like ‘bold’, ‘leading edge’ and ‘visionary’ should the words emerge from a discussion in the HQ national language or through a discussion in ELF? I’ve just been reading a very interesting case study of how a Norwegian company set about defining its corporate values, taking into account that it had subsidiaries in 10 countries. One of these countries was China and the case study has some very interesting quotes from both Norwegian and Chinese managers about the issue of communicating corporate values.

Intranet managers in multi-national companies would do well to read this case study, as it has implications for the extent to which ELF corporate values guidelines need to be carefully translated into other languages. In the case of the Norwegian company translations were made into German and Chinese for local purposes, but not into Norwegian because the company wanted to make a statement about its adoption of ELF even in Norway. For example Norwegian managers were not allowed to exchange emails in Norwegian with Norwegian colleagues working in overseas subsidiaries.

A conclusion from the case study is that multi-national companies should not develop culture statements in English and then rely on a translation into other languages. There should be a discussion with people speaking all the national languages present in the company (many of which HQ may well not be aware of!) so that the words selected can be rendered in these languages in a way that supports, rather than possibly negates, the corporate direction. Even if there is a close translation the very fact that the decision on the values was made by people speaking English as their mother language may send the wrong signals to a linguistically diverse workforce.

Martin White



Life inside a search lab – London 6 April

You can tell that when politicians walk around a laboratory they probably have no idea of what life is like inside one. I still have happy memories of discovering the solvent power of methylene dichloride and what happens when ether escapes from a leaky joint in a distillation retort. I’ve had a number of inquiries about what my Search Lab workshop is going to be like in April. Think of it as Applied Schadenfreude, a wonderful German word that means gaining pleasure from other people’s misfortune. Of course inside the enterprise this misfortune is difficult to assess. Have you noticed how rarely you ever see an intranet search application actually demonstrated at a conference, or even website search?

The moment we play with website search we usually become very aware of the misfortune of site visitors. But it’s one thing to see a poor search implementation and another to understand how to fix it. The purpose of the Intranet Now Search Lab in London on 6 April is to provide you with a framework for assessing the quality of search performance and to use it in action on a range of websites. To make the workshop work well we will have a number of PCs available so that you can work in small groups and then report back – much better than doing it all on a single big screen. Doing search hands-on enables you to try out various query and filtering options and build up a set of ‘search good practice’ notes for both your website and your intranet/enterprise search applications. I have a wonderful Black Museum of search implementations which defeat all my attempts to understand what the designers, developers and managers thought they were trying to achieve. It is rather like watching the adverts on television and trying to guess what the ‘creative’ conference must have been like to come up with such a strange approach to customer communications. I also have some very good examples to demonstrate. Of course the ideal approach is to offer up your own website for a trial run.

The Search Lab is run under the Chatham House Rule, so nothing said at the meeting can ever be attributed to a participant, and your secrets will still be secret after the event. What else we cover in the workshop is up to you. I started in search in 1975 and in 1980 worked with Unilever on the development of the first UK enterprise search application. So as well as the fun of playing with search you can also have fun in trying to stump the consultant. If you do, that’s fine with me – I want to learn from the workshop and find out where my search weaknesses are. I’ll also bring along a collection of books on search, and there will be a discount offer on my own book for participants.

Even if your own search application works perfectly please consider bringing the event to the attention of less fortunate colleagues in other organisations. They will be very grateful to you.

Martin White

Searching and Stopping: An Analysis of Stopping Rules and Strategies

During the search process, searchers need to decide when they should abandon the current query (and perhaps issue a new query after examining the current results list), and when to curtail their search by stopping the search session altogether. Knowing when to stop is considered a fundamental aspect of human behaviour. Stop too early, and important information may be missed. Stop too late, and time and effort are wasted. Worse still, the examination of fruitless result lists will mean not having time to examine other lists which may potentially contain greater yields for the searcher. David Maxwell and Leif Azzopardi (School of Computing Science, University of Glasgow, UK) and Kalervo Järvelin and Heikki Keskustalo (School of Information Sciences, University of Tampere, Finland) have published a fascinating research paper on this topic. The four authors are in the very top echelon of IR research, so what they have to say should be taken very seriously.

Two of the earliest stopping rules were devised by W. S. Cooper in 1973 (search goes back a long way!), who proposed:

  • the frustration point rule, where a searcher stops after examining a certain number of non-relevant documents; and
  • the satisfaction stopping rule, where searchers would stop only when a certain number of relevant documents were found.

The frustration point rule is especially interesting. The authors define it as counting the number of non-relevant documents seen in the ranked list at position k. If the total number of non-relevant documents exceeds a given threshold, the searcher stops. So we might have a personal rule that if we get to the third page of results (say k = 30) and have found few if any relevant results, we give up in frustration. Our time is too precious and we may, or may not, start again. I will do the authors a great disservice by jumping to the end of their paper, but the main outcomes are that the two most common stopping strategies are

  • Fixed Depth. Under this stopping strategy, the simulated searcher will stop once they have observed a self-defined number of result snippets, regardless of their relevance to the given topic.
  • Contiguous Non-Relevant. The searcher will stop once they have observed (say) 5 non-relevant snippets in a row (contiguously).
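These stopping rules are simple enough to sketch as predicates over a ranked list of relevance judgements. The following Python sketch is my own illustration of the rules as described, not code from the paper; each function returns the rank position at which the simulated searcher stops.

```python
def frustration_point_stop(relevance, threshold):
    """Cooper's frustration point rule: stop once the total number of
    non-relevant documents seen exceeds the threshold."""
    nonrelevant = 0
    for position, relevant in enumerate(relevance, start=1):
        if not relevant:
            nonrelevant += 1
        if nonrelevant > threshold:
            return position
    return len(relevance)


def fixed_depth_stop(relevance, depth):
    """Fixed Depth: examine a fixed number of result snippets,
    regardless of their relevance."""
    return min(depth, len(relevance))


def contiguous_nonrelevant_stop(relevance, run_length):
    """Contiguous Non-Relevant: stop after seeing `run_length`
    non-relevant snippets in a row."""
    run = 0
    for position, relevant in enumerate(relevance, start=1):
        run = 0 if relevant else run + 1
        if run >= run_length:
            return position
    return len(relevance)


# True = relevant snippet, False = non-relevant
ranked = [True, False, False, True, False, False, False, True]
print(frustration_point_stop(ranked, threshold=4))      # 7
print(fixed_depth_stop(ranked, depth=5))                # 5
print(contiguous_nonrelevant_stop(ranked, run_length=3))  # 7
```

Note how the frustration point rule counts non-relevant documents in total, while the Contiguous Non-Relevant strategy resets its count every time a relevant snippet appears, so a single good result can keep the searcher going.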

The authors caution that a great deal more work needs to be done to understand these behaviours. However, the research indicates that we may need to rethink our approaches to evaluating search success and search failure, at least by taking into account search users’ strategies, which may be internalised and pragmatic rather than just a function of relevance. It would seem to put a priority on precision over recall, but that might be taking the research too far. It would also have an impact on session time. Indeed it might be interesting to look at the variance of session times for search users and see if there are any patterns. I should note that this paper was presented at the ACM CIKM’15 Conference held in Melbourne, Australia, on 19-23 October 2015, and so is only available to ACM members unless you are willing to pay an access fee.

Martin White

Managing Expectations – Building Client-Consultant Partnerships

I’ve had a long and very enjoyable career as a consultant, dating back to 1979 and encompassing around 500 projects. Certainly I’ve undertaken over 100 as Intranet Focus. The most enjoyable projects are those where the challenges push me to the limit of my experience and expertise and where right from the start I have been able to build a strong partnership with the client. The International Monetary Fund comes immediately to mind, where the major challenge I faced was the tragic impact of 9/11 just two days after the start of the project. The partnership that was forged led to the on-time delivery of the project reports, an invitation a few years later to undertake a high-level project at the United Nations HQ in New York and subsequently a project for the World Bank.

As well as being a consultant I have also been the ‘client’ in a number of major consulting projects. Building an effective partnership requires a significant contribution from a client, who may not have worked with consultants before and who may also be concerned that the consultant will be critical of their work. There are many books about how to be a consultant but virtually none that offer advice to people with no consulting experience on how to get the best out of a consultant. Intranet managers often need to bring in additional specialised advice because they are often a team of one. The 2016 Nielsen Norman Group Intranet Design Annual lists over 20 areas of external expertise that between them the award winners used in the development of their intranets. Kara Pernice kindly allowed me to include this list in the book. Managing this array of consultants and contractors is a substantial challenge to even the most experienced of intranet managers.

Late last year I suggested to Kristian Norling that this might be a good subject for his growing range of books. He had the same opinion, and in the space of three months the book was written and then launched at the IntraTeam event last week. It covers how to write a request for consulting support, select a consultant, start up a project, and ensure that the outcomes are in line with expectations. Of course these expectations can change in the course of the project, often for very good reasons, and that was the reason for the title of the book. At the end of the book are ten critical success factors, and in addition ten factors that can quickly derail any consulting project. You can order the 125-page book from Intranatverk or from Amazon. Comments on the book would be greatly appreciated.

Working with Kristian was a great pleasure, and we quickly built the sort of author-publisher partnership that is a model for a client-consultant partnership. The end result is a book that I am very proud of delivering to schedule despite some changes to the contents and structure along the way as Sam Marshall, Jane McConnell and Sandra Ward challenged some of my initial ideas. My thanks to all of them.

Martin White

IntraTeam 2016 – conference overview

The first thing I noticed when I registered for IntraTeam 2016 last week in Copenhagen was how many people there were around the conference area. This year over 200 delegates turned up, the best attendance since 2012. Tuesday was workshop day. I was giving a workshop on content migration in the afternoon so took the opportunity to participate in a social business adoption workshop led by Luis Suarez (@elusa) which was exceptionally interesting. For the first time I understood what ‘working out loud’ was all about and apart from much else gained some new perspectives on expertise identification and exchange. There was also a warning against the use of gamification. This was especially welcome as I often seem to be the only intranet consultant that sees little value in this approach.

A number of people have already blogged about the conference, notably Sam Marshall and Wedge. You can track comments and downloadable versions of papers at #IEC16. As well as presenting a paper on information risk while dressed in a heavy anorak and gloves (it’s a long story!) I took part in a very good final review session with Luis and Nina Nikolaisen (COWI) in which we tried to summarise the themes and outcomes of the conference. There certainly seemed to be confidence in the air, with a number of large-scale intranet and search projects being presented. The Skanska and Robin Partington and Partners intranets were miles apart in scale but not in how well they had been aligned with business requirements. The majority of the intranets were search-driven, and it was interesting to see that the search sessions had to be held in the main conference room as there was not enough space in the other rooms. Other important themes included the management of social networks (including a very polished presentation from Shell) and knowledge management/social business. As always, time-keeping and the overall organisation were spot on. There was a neat display in one corner of the room, set up by Mads Møller, of over 40 examples of home pages, each with a pie chart of the balance between News, Tools, Library and Collaboration, which stimulated a lot of discussion.

Although the delegate base was large the number of delegates from outside Denmark and Sweden was quite small. For the UK it could be that Intranet Now is providing a domestic forum, with the 2016 Intranet Now event taking place on 30 September. More details on that very soon. Because most of the delegates are already IntraTeam community members there was a very good buzz around the conference, and about a third of the audience (on a show of hands) had been to three or more previous events. It also meant that in the presentations and in discussions over coffee and pastries delegates were as ready to share the problems they had experienced as well as the successes. All that I missed was any consideration of content quality. At the end of the event I was interviewed for a video that no doubt will be used to promote IntraTeam 2017 (28 Feb-2 March). I was asked if I had met any interesting people. Without thinking about it I responded that everyone I had met was interesting, and on reflection some days later that is a good summary of my experience at the event. My thanks go to Kurt and his team for the invitation to participate.

Martin White