Smart search using Twitter Typeahead and Bloodhound

One of the most basic things users want to do on a website is search its content. A crisp, fast, smart search that displays suggestions as you type is something most websites would benefit from.

The Twitter typeahead library does just that, and when typeahead is used in conjunction with Bloodhound, the search experience gets even better.

But what exactly is Bloodhound? Bloodhound is a suggestion engine that offers advanced functionality such as prefetching, smart caching, fast lookups, and backfilling with remote data.

Without further ado, let’s dive into the code to give you a clearer picture.

1. Setting up Typeahead and Bloodhound

i. Include the necessary scripts and basic html

This is simply done by including typeahead.js and bloodhound.js in your HTML file. Typeahead depends on jQuery, so we include that too.
We’ll use the basic markup for a text box and assign it the id my_search.

<!-- index.html -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/corejs-typeahead/1.2.1/bloodhound.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/corejs-typeahead/1.2.1/typeahead.jquery.min.js"></script>

<!-- Basic markup for a text input -->
<label>Search Colors:</label>
<input type="text" id="my_search" name="search" autocomplete="off" />

**Note: The original typeahead library has not been maintained for about 3 years, but there is an actively maintained fork (corejs-typeahead, the one loaded by the script tags above), which I suggest using instead. If you want a typeahead that has no dependency on jQuery, we recommend typeahead-standalone, which has a similar API.

ii. Initialize Bloodhound

We create a Bloodhound instance which we will later pass as the source to the typeahead instance.
There are several configuration options that we can pass to the Bloodhound instance, of which the first 3 are required:

    • datumTokenizer: Function, Required. datumTokenizer is a function which transforms a datum (the string/array/object being searched through) into an array of strings.
    • queryTokenizer: Function, Required. queryTokenizer is a function which converts the query string into an array of strings.
      There are a few predefined tokenizer functions (whitespace, nonword, ...) which can be used as both datumTokenizer and queryTokenizer. We will be using the predefined function Bloodhound.tokenizers.whitespace for both. An example of what tokenizer functions do is shown below:
Bloodhound.tokenizers.whitespace('  one  twenty-five');
// returns ['one', 'twenty-five']

Bloodhound.tokenizers.nonword('  one  twenty-five');
// returns ['one', 'twenty', 'five']
    • local/prefetch/remote: Required (at least one of the three). This is the source of the suggestion data that is displayed to the user. local takes an array of data, while prefetch and remote each take a URL or an options object. The 3 sources can be used individually or together:
      Local: Data comes from a local variable/array.
      Prefetch: Data is loaded from a given URL when Bloodhound initializes.
      Remote: Data is loaded from a remote source via a URL (AJAX requests) while the user types.

When remote is used in conjunction with local or prefetch, Bloodhound only makes network requests to load more suggestion data when the suggestions from local or prefetch fall short. Hence, local and prefetch could be used as a first-level cache.
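To make the role of the two tokenizers more concrete, here is a rough plain-JavaScript model of how they work together. It is only a sketch: whitespaceTokens, nonwordTokens and matches are hypothetical helpers written for this illustration, and Bloodhound’s real index is considerably smarter than a linear scan. The assumption is that a datum matches when every query token is a prefix of at least one datum token, ignoring case.

```javascript
// Simplified stand-ins for Bloodhound.tokenizers.whitespace / nonword.
function whitespaceTokens(str) {
  return String(str || '').split(/\s+/).filter(Boolean);
}

function nonwordTokens(str) {
  return String(str || '').split(/\W+/).filter(Boolean);
}

// Assumed matching rule: every query token must be a prefix of some
// datum token (case-insensitive). An empty query matches nothing.
function matches(datum, query, datumTokenizer, queryTokenizer) {
  var lower = function (t) { return t.toLowerCase(); };
  var datumTokens = datumTokenizer(datum).map(lower);
  var queryTokens = queryTokenizer(query).map(lower);
  if (queryTokens.length === 0) { return false; }
  return queryTokens.every(function (q) {
    return datumTokens.some(function (d) { return d.indexOf(q) === 0; });
  });
}

var colors = ['Red', 'Blood Red', 'White', 'Blue'];
var hits = colors.filter(function (color) {
  return matches(color, 're', whitespaceTokens, whitespaceTokens);
});
console.log(hits); // ['Red', 'Blood Red']
```

This is why “re” finds “Blood Red” even though the string does not start with “re”: the datum tokenizer splits it into two tokens, and the second one matches.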

The following bloodhound configuration options are optional:

    • identify: Function, Optional, Default: JSON.stringify. Overriding it is recommended. Given a datum (a single suggestion item), this function is expected to return a unique id for it.
    • sufficient: Integer, Optional, Default: 5. If the number of suggestions available locally is less than sufficient, the remote source is used to retrieve more results.
    • initialize: Boolean, Optional, Default: true. If set to false, the Bloodhound instance will not be implicitly initialized by the constructor function.
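The interplay between sufficient, identify and the remote source can be sketched in plain JavaScript. This is a simplified model under stated assumptions, not Bloodhound’s actual code: searchWithBackfill, searchLocal and fetchRemote are hypothetical names, and the “remote” here is a stub that answers immediately instead of making an AJAX request.

```javascript
// Hypothetical model of backfilling: local results are delivered first,
// and the remote is only consulted when they fall short of `sufficient`.
function searchWithBackfill(query, engine, sync, async) {
  var local = engine.searchLocal(query);
  sync(local);                                   // synchronous suggestions
  if (local.length >= engine.sufficient) {
    return;                                      // enough results, no request
  }
  engine.fetchRemote(query, function (remote) {
    var seen = {};
    local.forEach(function (d) { seen[engine.identify(d)] = true; });
    // identify() is used to drop remote datums we already have locally
    async(remote.filter(function (d) { return !seen[engine.identify(d)]; }));
  });
}

var requests = 0;
var engine = {
  sufficient: 5,
  identify: function (d) { return d.name; },
  searchLocal: function () {                     // pretend index lookup
    return [{ name: 'Senegal' }, { name: 'Serbia' }];
  },
  fetchRemote: function (query, callback) {      // pretend AJAX response
    requests += 1;
    callback([{ name: 'Serbia' }, { name: 'Sweden' }]);
  }
};

var shown = [];
function collect(items) { shown = shown.concat(items); }
searchWithBackfill('se', engine, collect, collect);
console.log(shown.map(function (d) { return d.name; }));
// ['Senegal', 'Serbia', 'Sweden']
```

Without a sensible identify function the duplicate “Serbia” could not be detected reliably, which is one reason overriding the default is recommended.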

For this first example, we will use “local” as the source. We pass an array consisting of color names to the local source.

// constructs the suggestion engine
var colors_suggestions = new Bloodhound({
  datumTokenizer: Bloodhound.tokenizers.whitespace, // see its meaning above
  queryTokenizer: Bloodhound.tokenizers.whitespace, // see its meaning above
  local: ['Red','Blood Red','White','Blue','Yellow','Green','Black','Pink','Orange']
});

iii. Initialize Typeahead

We initialise typeahead on the input element by selecting it via jQuery: $('#my_search').typeahead().
Typeahead accepts 2 objects as parameters:

    • Configuration options

The configuration options are completely optional, so you can even pass null or an empty object {} as the first parameter.

        • hint: Boolean, Optional, Default: true. If true, sets the first matched suggestion as the input’s placeholder.
        • highlight: Boolean, Optional, Default: false. If true, highlights the matched letters in the suggestions.
        • minLength: Integer, Optional, Default: 1. Defines the minimum length of the input string for which suggestions should be displayed.

    • Dataset options

The dataset options are mandatory as they point to the source of the suggestions data.

        • name: String, Optional. The name of the dataset. This name will be used as the class name of the containing DOM element.
        • source: Function or Bloodhound instance, Required. The source for the suggestions. If it is a function, it should have the signature (query, syncResults, asyncResults). syncResults should be called with suggestions computed synchronously and asyncResults should be called with suggestions computed asynchronously (e.g. suggestions that come from an AJAX request).
        • display: Function, Optional, Default: JSON.stringify(suggestion). A function which should return the string representation of the suggestion. This is what appears in the input box after selecting one of the suggestions.
        • limit: Integer, Optional, Default: 5. The number of suggestions to be displayed.
        • templates: Object, Optional. An object whose entries describe the UI of the dataset. Each value can be a valid HTML string or a precompiled template (a function that generates an HTML string from a JS object). If no suggestion template is given, suggestions are rendered from the value of display.
 templates: {
    notFound: '<div>Not Found</div>',   /* Rendered if 0 suggestions are available */ 
    pending: '<div>Loading...</div>',   /* Rendered if 0 synchronous suggestions available 
                                           but asynchronous suggestions are expected */
    header: '<div>Found Records:</div>',/* Rendered at the top of the dataset when 
                                           suggestions are present */
    suggestion:  function(data) {       /* Used to render a single suggestion */
                    return '<div>'+ data.name +'</div>'
                 },
    footer: '<div>Footer Content</div>',/* Rendered at the bottom of the dataset
                                           when suggestions are present. */
}
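If the idea of a “precompiled template” is unfamiliar, the following sketch may help. It is a tiny, hypothetical stand-in for a real templating library: compile and escapeHtml are names invented for this example, and the only placeholder syntax it understands is {{field}}. It also escapes the inserted values, which is worth doing whenever suggestion data comes from a source you do not fully control.

```javascript
// Escape characters that have a special meaning in HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical mini "compiler": turns a template string into a
// function(data) that returns an HTML string.
function compile(template) {
  return function (data) {
    return template.replace(/\{\{(\w+)\}\}/g, function (match, key) {
      return data[key] == null ? '' : escapeHtml(data[key]);
    });
  };
}

var suggestionTemplate = compile('<div class="country">{{name}}</div>');
console.log(suggestionTemplate({ name: 'Trinidad & Tobago' }));
// <div class="country">Trinidad &amp; Tobago</div>
```

A function like suggestionTemplate could then be used as the suggestion entry of the templates object.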

Once you have initialised bloodhound, you can pass it to typeahead as the source.

$('#my_search').typeahead({
  hint: true,
  highlight: true,
  minLength: 1
},
{
  name: 'colors',
  source: colors_suggestions   // Bloodhound instance is passed as the source
});

That’s it! You should now have a working search box that suggests colors as you type. Note that we have intentionally not used any CSS, as styling falls outside the scope of this tutorial. Feel free to style your fully functional smart search as you’d like!

2. Typeahead with Prefetch and Local

Prefetched data is fetched and processed on initialization. If the browser supports local storage, the processed data will be cached there to prevent additional network requests on subsequent page loads. Only the url configuration option is mandatory for prefetch to function.
Configuration options available are:

  • url: String, Required. The URL from where the prefetch data should be loaded.
  • cache: Boolean, Optional, Default: true. If false, will not use local storage and will always load data on initialization.
  • prepare: Function, Optional, Default: the identity function. A function with the signature prepare(settings) that gives you a hook to adjust the settings object passed to transport() when a request is about to be made. settings is the default object prepared by Bloodhound. This function should return the settings object.
  • transform: Function, Optional, Default: the identity function. A function with the signature transform(response) that allows you to transform the prefetch response.
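As a rough illustration of these two hooks, the sketch below defines them as standalone functions so they can be tried out without Bloodhound. The header, the example URL and the shape of settings (url, type, dataType, as used by jQuery.ajax) are assumptions made for this example.

```javascript
// Hypothetical prefetch hooks, written as plain functions.

// prepare(settings): adjust the request settings and return them.
// Assumes the default jQuery.ajax transport, which understands `headers`.
function prepare(settings) {
  settings.headers = { 'Accept': 'application/json' };
  return settings;                      // the hook must return settings
}

// transform(response): reshape the response before it is indexed.
// Here an array of strings becomes an array of { name: ... } objects.
function transform(response) {
  return response.map(function (name) {
    return { name: name };
  });
}

var settings = prepare({ url: '/data/countries.json', type: 'GET', dataType: 'json' });
console.log(settings.headers.Accept);               // application/json
console.log(transform(['Australia', 'China']));
```

Both functions could then be passed as the prepare and transform entries of the prefetch object.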

As an example, we will be using the URL https://raw.githubusercontent.com/twitter/typeahead.js/gh-pages/data/countries.json. The data returned is an array of strings (['Australia','China','Germany',...]), so we can directly use Bloodhound.tokenizers.whitespace as the datumTokenizer. We will use the same function as the queryTokenizer too, because the user’s input is also a string.

// Bloodhound with Local + Prefetch
var countries_suggestions = new Bloodhound({
    datumTokenizer: Bloodhound.tokenizers.whitespace,
    queryTokenizer: Bloodhound.tokenizers.whitespace,
    local: ['France','India'],
    prefetch: {
        url:'https://raw.githubusercontent.com/twitter/typeahead.js/gh-pages/data/countries.json',
        cache: true // defaults to true (so you can omit writing this)
    }
});

**Warning: Prefetched data isn’t meant to contain all your data. Rather, it should act only as a first-level cache. Ignoring this runs the risk of hitting local storage limits.

Put together with the typeahead initialisation from the first example, this gives a search box that suggests countries from both the local array and the prefetched list.

3. Typeahead with Remote and Prefetch

Bloodhound only goes to the network when the internal search engine cannot provide a sufficient number of results. In order to prevent an obscene number of requests being made to the remote endpoint, requests are rate-limited.
Remote configuration options available are:

  • url: String, Required. The URL from where the remote data should be loaded.
  • prepare: Function, Optional, Default: the identity function. A function with the signature prepare(query, settings) where query is the search term entered by the user and settings is the object created by Bloodhound. This function should return the settings object.
  • wildcard: String, Optional. A convenience option for prepare. If set, prepare() will be a function that replaces the value of wildcard in the url with the URI-encoded query.
  • transform: Function, Optional, Default: the identity function. A function with the signature transform(response) that allows you to modify the remote response.
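To see what the wildcard convenience amounts to, here is a small sketch of an equivalent prepare function. prepareFromWildcard is a hypothetical helper and example.com is a placeholder URL; the point is only to show the substitution and the URI encoding.

```javascript
// Roughly what setting `wildcard` gives you: a prepare() that puts the
// URI-encoded query into the url. Illustrative only.
function prepareFromWildcard(wildcard) {
  return function prepare(query, settings) {
    settings.url = settings.url.replace(wildcard, encodeURIComponent(query));
    return settings;
  };
}

var prepare = prepareFromWildcard('%QUERY');
var settings = prepare('new zealand', {
  url: 'https://example.com/search?q=%QUERY'   // placeholder endpoint
});
console.log(settings.url); // https://example.com/search?q=new%20zealand
```

If you need more than a simple substitution (extra parameters, headers, a POST body), writing your own prepare function is probably the better fit.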

As an example, we will be using an API for countries to configure the remote URL. Data is returned as a JSON array of countries.
Example Response using https://restcountries.eu/rest/v2/alpha?codes=se

[
  {
    "name": "Sweden",
    "topLevelDomain": [".se"],
    "capital": "Stockholm",
    "latlng": [46,2],
    // more fields...
  }
]

As seen, we only care about the name field, which gives us the name of the country. Since the response is a JSON array of objects, we will be using Bloodhound.tokenizers.obj.whitespace('name') as the datumTokenizer. The user’s input will always be a string, so we will use Bloodhound.tokenizers.whitespace as the queryTokenizer.

The remote URL is https://restcountries.eu/rest/v2/alpha?codes=%QUERY where “%QUERY” is a placeholder for the user’s input. So, we’ll go ahead and set the wildcard to be %QUERY.

// Bloodhound with Remote + Prefetch
var countries_suggestions = new Bloodhound({
    datumTokenizer: Bloodhound.tokenizers.obj.whitespace('name'),
    queryTokenizer: Bloodhound.tokenizers.whitespace,
    prefetch: {
        url:'https://raw.githubusercontent.com/twitter/typeahead.js/gh-pages/data/countries.json',
        transform: function (data) {          // we modify the prefetch response
            var newData = [];                 // here to match the response format 
            data.forEach(function (item) {    // of the remote endpoint
                newData.push({'name': item});
            });
            return newData;
        }
    },
    remote: {
        url: 'https://restcountries.eu/rest/v2/alpha?codes=%QUERY',
        wildcard: '%QUERY'                    // %QUERY will be replaced by the user's input
    },                                        // in the url option.
});

// init Typeahead
$('#my_search').typeahead(
{
    minLength: 2,
    highlight: true
},
{
    name: 'countries',
    source: countries_suggestions,   // suggestion engine is passed as the source
    display: function(item) {        // display: 'name' will also work
        return item.name;
    },
    limit: 5,
    templates: {
        suggestion: function(item) {
            return '<div>'+ item.name +'</div>';
        }
    }
});

If you’ve followed everything correctly, you should now have a working smart search box. If you try searching for “se” (the country code for Sweden), you will see 3 results, which are returned instantly from the prefetched data. Since the number of suggestions is less than 5 (the default value of Bloodhound’s sufficient option), a request is made to the remote endpoint to get more results. Once the remote responds, “Sweden” is automatically added to the list of suggestions. Not only that, it is also added to the cache, so if you search for “se” again, you will see 4 results pop up instantly, giving your users an excellent search experience.

In the above example, we used the templates option to set the html of each suggestion. You can use better markup and styling to design the list of suggestions. As an example, check out Gospel Music to see how they have designed their search results using templates.

4. Showing Default Suggestions

So far we’ve seen that typeahead and Bloodhound only kick in with suggestions once you type something. But there may be situations where you’d like to have some default suggestions shown. This can be achieved quite easily.

There are only 2 differences in the code. First, when we initialize typeahead, the “source” is now a function with the signature function(q, sync, async) instead of the Bloodhound instance. Here, “q” is the query typed by the user, “sync” is a callback that is called with suggestions that are available synchronously (for example, suggestions already in the internal search index), and “async” is a callback that is called with suggestions that arrive later (for example, from the remote endpoint).
Second, we set “minLength” to 0 so that the suggestions are displayed as soon as the user focuses on the search input.

// init Typeahead
$('#my_search').typeahead(
{
    minLength: 0,                      // set this to 0
    highlight: true
},
{
    name: 'countries',
    source: suggestionsWithDefaults,   // **custom function is passed as the source
    display: function(item) {        
        return item.name;
    },
    limit: 5,
    templates: {
        suggestion: function(item) {
            return '<div>'+ item.name +'</div>';
        }
    }
});
// our custom function
function suggestionsWithDefaults(q, sync, async) {
    if (q === '') {                   // if query is empty, show default suggestions
        sync([
        	{name:'France'}, 
        	{name:'Ireland'}
        ]);
    } else {
        /* countries_suggestions is the bloodhound instance
           as we used in the previous example */
        countries_suggestions.search(q, sync, async);
    }
}

If the search query is empty, we hand our default suggestions straight to typeahead by calling sync([item1, item2, ...]). Note that these datums (item1, item2, …) should also be present in the search index (via local, prefetch, or by adding them later), so that they keep matching once the user starts typing.
If the search query is not empty, we let Bloodhound handle it by calling bloodhoundInstance.search(q, sync, async).
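Because the source is just a function, it can be tried out without typeahead or Bloodhound at all. The sketch below wraps the same idea in a small factory and runs it against a stub. makeSourceWithDefaults and fakeEngine are hypothetical names; the only assumption about the engine is that it exposes search(q, sync, async), as a Bloodhound instance does.

```javascript
// Hypothetical factory: serves `defaults` for an empty query and
// delegates everything else to the engine's search() method.
function makeSourceWithDefaults(engine, defaults) {
  return function (q, sync, async) {
    if (q === '') {
      sync(defaults);
    } else {
      engine.search(q, sync, async);
    }
  };
}

// Stub standing in for a Bloodhound instance so the sketch runs alone.
var fakeEngine = {
  search: function (q, sync, async) {
    sync([{ name: 'Sweden' }]);
  }
};

var source = makeSourceWithDefaults(fakeEngine, [
  { name: 'France' },
  { name: 'Ireland' }
]);

source('', function (items) { console.log(items.length); });     // 2
source('sw', function (items) { console.log(items[0].name); });  // Sweden
```

The returned function can be passed as the source of a dataset in exactly the same way as suggestionsWithDefaults above.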


5. Add, Retrieve, Delete Suggestions from Bloodhound

Add Suggestions

Sometimes, you may need to add suggestions at a later time (unlike prefetch which adds suggestions on page load). To do this, all you need to do is call bloodhoundInstance.add() of the Bloodhound API.

// init Bloodhound
var countries_suggestions = new Bloodhound({ ... });
// add new items/datums to the internal search index
countries_suggestions.add([
    {name:'Gotham'},
    {name:'Arkham'}
]);

That’s it! From this point on, the newly added items will be displayed as suggestions as well.

Retrieve Suggestions

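To retrieve suggestions, you could use bloodhoundInstance.get() of the Bloodhound API. It looks datums up by id, i.e. by the value the identify option returns for each datum, so the example below assumes identify was set to return the name field.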

// init Bloodhound
var countries_suggestions = new Bloodhound({ ... });
// get items/datums from the internal search index
countries_suggestions.get(['France', 'Germany']);
// returns [{'name':'France'},{'name':'Germany'}]

Delete Suggestions

Although it is not possible to remove individual suggestions from the search index, you can reset/clear the whole internal search index by calling bloodhoundInstance.clear().
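To build some intuition for how add, get and clear relate to the identify function, here is a toy model of an id-keyed index. TinyIndex is entirely hypothetical and far simpler than Bloodhound’s index (it has no search at all, and it silently skips unknown ids, which the real library may handle differently); it only shows why get() needs ids that match what identify returns.

```javascript
// Toy model of an index keyed by the id that identify() returns.
function TinyIndex(identify) {
  this.identify = identify;
  this.datums = Object.create(null);
}

TinyIndex.prototype.add = function (items) {
  var self = this;
  items.forEach(function (item) {
    self.datums[self.identify(item)] = item;
  });
};

TinyIndex.prototype.get = function (ids) {
  var self = this;
  return ids
    .map(function (id) { return self.datums[id]; })
    .filter(function (item) { return item !== undefined; });
};

TinyIndex.prototype.clear = function () {
  this.datums = Object.create(null);
};

var index = new TinyIndex(function (item) { return item.name; });
index.add([{ name: 'Gotham' }, { name: 'Arkham' }]);
console.log(index.get(['Gotham', 'Metropolis'])); // [ { name: 'Gotham' } ]
index.clear();
console.log(index.get(['Gotham']));               // []
```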

6. Displaying Loader while loading Suggestions

There are 2 ways of doing this

1. Via Typeahead Templates

The Typeahead API provides a “pending” template to be set which is rendered only when asynchronous suggestions(i.e. suggestions loaded via remote endpoint) are expected.

// init Typeahead
$('#my_search').typeahead(
{
    minLength: 2,
    highlight: true
},
{
    name: 'countries',
    source: bloodhoundInstance,
    display: 'name',
    limit: 5,
    templates: {
        suggestion: function(item) {
            return '<div>'+ item.name +'</div>';
        },
        pending: function (query) {
            return '<div>Loading...</div>';
        }
    }
});

2. Using hooks inside Bloodhound

You could include your loader animation via CSS. For example, let’s say you display a loader image in a div element with the id search_loader, which is hidden initially.
Bloodhound calls the prepare hook before it makes a request to the remote endpoint. This hook is usually used to modify the settings object, but it can also be used to display the loader.

Once Bloodhound has loaded data from the remote endpoint, it passes the response through a second hook, transform (older releases call it filter), where you can modify or filter the results. We will use this hook to simply hide the loader.

// html
<div id="search_loader"></div>
// css
#search_loader {
    background-image: url("path-to-image.gif");
    display: none;
}
// js
// init Bloodhound
var countries_suggestions = new Bloodhound({
    datumTokenizer: Bloodhound.tokenizers.obj.whitespace('name'),
    queryTokenizer: Bloodhound.tokenizers.whitespace,
    remote: {
        url: 'https://restcountries.eu/rest/v2/alpha?codes=',
        prepare: function (query, settings) {
            $("#search_loader").fadeIn();     // display loader
            // append the URI-encoded query to the base url
            settings.url = this.url + encodeURIComponent(query);
            return settings;
        },
        transform: function (data) {          // called "filter" in older releases
            $("#search_loader").fadeOut();    // hide loader
            return data;
        }
    },
    identify: function (response) {
        return response.name;
    }
});
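If you would rather not tie the hooks to jQuery and a specific element, the same idea can be sketched with the show/hide behaviour passed in as callbacks. makeLoaderHooks is a hypothetical helper and example.com is a placeholder URL; the returned object is meant to be used as the remote option.

```javascript
// Hypothetical helper: builds remote hooks around any show/hide pair.
function makeLoaderHooks(baseUrl, showLoader, hideLoader) {
  return {
    url: baseUrl,
    prepare: function (query, settings) {
      showLoader();                               // request is about to start
      settings.url = baseUrl + encodeURIComponent(query);
      return settings;
    },
    transform: function (response) {
      hideLoader();                               // response has arrived
      return response;
    }
  };
}

// Stubbed loader state so the sketch runs without a DOM.
var visible = false;
var hooks = makeLoaderHooks(
  'https://example.com/countries?codes=',
  function () { visible = true; },
  function () { visible = false; }
);

var settings = hooks.prepare('se', { url: hooks.url });
console.log(visible, settings.url); // true https://example.com/countries?codes=se
hooks.transform([{ name: 'Sweden' }]);
console.log(visible);               // false
```

In a page, the two callbacks would simply call $("#search_loader").fadeIn() and fadeOut(). Keep in mind that the response hook presumably only runs when a request succeeds, so a failed request could leave the loader visible; handling that case is left out of this sketch.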

The official docs may seem a bit convoluted at first. But if you’ve gone through this article, you should now be able to find your way around them with much more clarity. Hope this helps 🙂
