

Building an Autocompletion Engine: A Practical Guide to Tries and Search Algorithms (CS/COE 1501 Inspired)

Learn how to implement an autocompletion engine using a De La Briandais trie and a user history symbol table. This tutorial covers search algorithms, prefix matching, and performance measurement, with practical examples inspired by mobile typing and modern AI tools.


Introduction: The Power of Autocomplete in Modern Apps

Autocomplete is everywhere—from your phone's keyboard to Google Search, from AI chatbots like ChatGPT to code editors like VS Code. It saves time and reduces typos by predicting what you're about to type. In this tutorial, we'll build a simplified autocompletion engine using a De La Briandais (DLB) trie and a custom symbol table for user history. This project is inspired by typical CS/COE 1501 assignments and will deepen your understanding of search algorithms and symbol tables.

By the end, you'll have a working program that reads a dictionary, accepts character-by-character input, and suggests up to 5 predictions based on both the dictionary and past user entries. You'll also measure prediction time using System.nanoTime() and persist user history across runs.

Why Tries? The DLB Advantage

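A trie (prefix tree) is ideal for prefix-based searches. The De La Briandais (DLB) trie is a space-efficient variant: instead of giving every node an array with one slot per possible character, each node stores a single character, a pointer to its first child, and a pointer to its next sibling, so the children of a node form a linked list. It's well suited to autocomplete because: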

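  • Fast prefix lookup: one level per character for a prefix of length k, plus a scan of that level's sibling list (bounded by the alphabet size), so the cost is O(k) for a fixed alphabet and does not depend on how many words are stored.
  • Memory efficient for English dictionaries (around 100k words), because nodes exist only for characters that actually occur.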
  • Easy to retrieve all words with a given prefix.

This data structure is the backbone of many autocomplete systems in real-world applications like mobile keyboards and search engines.

Step 1: Loading the Dictionary into a DLB Trie

First, download the English dictionary from the provided link (or any word list). Create a DLB trie class with methods add(String word) and getWordsWithPrefix(String prefix). The latter should return a list of words (up to 5) that start with the given prefix.

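import java.util.ArrayList;
import java.util.List;

public class DLB {
    private Node root; // head of the top-level sibling list

    private static class Node {
        char data;
        Node sibling, child; // next alternative at this level; first node of the next level
        boolean isWord;      // true if the path ending here spells a complete word
    }

    public void add(String word) {
        // Insert character by character, creating sibling/child nodes as needed
    }

    public List<String> getWordsWithPrefix(String prefix) {
        List<String> results = new ArrayList<>();
        Node level = root, last = null;
        for (char c : prefix.toCharArray()) {       // walk down one level per character
            last = level;
            while (last != null && last.data != c) last = last.sibling;
            if (last == null) return results;       // prefix is not in the trie
            level = last.child;
        }
        if (last != null && last.isWord) results.add(prefix); // the prefix itself may be a word
        collect(level, new StringBuilder(prefix), results, 5);
        return results;
    }

    // Depth-first walk over a sibling list; stops once max words have been collected
    private void collect(Node node, StringBuilder sb, List<String> results, int max) {
        for (; node != null && results.size() < max; node = node.sibling) {
            sb.append(node.data);
            if (node.isWord) results.add(sb.toString());
            collect(node.child, sb, results, max);
            sb.setLength(sb.length() - 1);          // backtrack before trying the next sibling
        }
    }
}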
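
The add method is left as a comment in the class above. One possible way to fill it in is sketched here as a separate, hypothetical class (DlbSketch and its contains helper are illustrative names, not part of the assignment). The idea is to scan the sibling list at each level for the current character and append a new node only when it is missing.

```java
// Hypothetical standalone sketch of DLB insertion; names are illustrative.
class DlbSketch {
    private static final class Node {
        final char data;
        Node sibling;   // next alternative character at the same level
        Node child;     // first character of the next level
        boolean isWord; // true if the path ending here spells a complete word
        Node(char data) { this.data = data; }
    }

    private Node root; // head of the top-level sibling list

    // Inserts word one character at a time, reusing nodes that already exist.
    void add(String word) {
        if (word.isEmpty()) return;
        Node parent = null;
        for (char c : word.toCharArray()) {
            Node cur = (parent == null) ? root : parent.child;
            Node tail = null;
            while (cur != null && cur.data != c) { // scan the sibling list for c
                tail = cur;
                cur = cur.sibling;
            }
            if (cur == null) {                     // c is missing at this level: append it
                cur = new Node(c);
                if (tail != null) tail.sibling = cur;
                else if (parent == null) root = cur;
                else parent.child = cur;
            }
            parent = cur;
        }
        parent.isWord = true; // mark the end of a complete word
    }

    // Returns true only if word was added as a complete word, not merely as a prefix.
    boolean contains(String word) {
        Node level = root, last = null;
        for (char c : word.toCharArray()) {
            last = level;
            while (last != null && last.data != c) last = last.sibling;
            if (last == null) return false;
            level = last.child;
        }
        return last != null && last.isWord;
    }

    public static void main(String[] args) {
        DlbSketch trie = new DlbSketch();
        trie.add("tab");
        trie.add("tabbed");
        System.out.println(trie.contains("tab")); // true
        System.out.println(trie.contains("ta"));  // false: only a prefix
    }
}
```

Appending new siblings at the tail keeps each level in insertion order, so a sorted dictionary file yields alphabetically ordered predictions without any extra sorting.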

When your program starts, read dictionary.txt line by line and add each word to the trie. This may take a few seconds, but it's a one-time cost.
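
A minimal sketch of the loading step follows. DictionaryLoader is a hypothetical helper, and it assumes one word per line in a UTF-8 file. It passes each word to a Consumer so that it can feed any trie implementation (for example trie::add) without depending on a particular class.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;

// Hypothetical helper; assumes one word per line.
class DictionaryLoader {
    // Feeds every non-blank line to sink and returns how many words were read.
    static int load(Reader source, Consumer<String> sink) {
        int count = 0;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                String word = line.trim();    // case is preserved on purpose
                if (word.isEmpty()) continue; // skip blank lines
                sink.accept(word);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Convenience overload for a file on disk (assumed to be UTF-8).
    static int load(Path file, Consumer<String> sink) {
        try {
            return load(Files.newBufferedReader(file, StandardCharsets.UTF_8), sink);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a DLB instance named trie, a call such as DictionaryLoader.load(Paths.get("dictionary.txt"), trie::add) would populate it at startup.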

Step 2: User History Symbol Table

To personalize predictions, you need to track what the user has typed before. The assignment leaves the design up to you. A good approach is a trie-of-tries or a hash map of prefix -> list of (word, frequency). We'll use a prefix hash map for simplicity:

Map<String, Map<String, Integer>> history = new HashMap<>();
// key: prefix (e.g., "the"), value: map of words starting with that prefix and their counts

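On each user input, if the prefix exists in history, sort its words by frequency (most frequent first) and return up to 5. If there are fewer than 5, fill the remaining slots from the dictionary trie, skipping any word already suggested from history so that no prediction appears twice.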
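
The lookup-and-fill logic can be sketched as follows. HistorySketch, record, and predict are hypothetical names. The sketch assumes each completed word is recorded under every one of its prefixes, and that ties in frequency are broken alphabetically; the assignment does not prescribe either choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the prefix -> (word -> count) history table.
class HistorySketch {
    private final Map<String, Map<String, Integer>> history = new HashMap<>();

    // Counts one more use of word under each of its non-empty prefixes.
    void record(String word) {
        for (int end = 1; end <= word.length(); end++) {
            history.computeIfAbsent(word.substring(0, end), k -> new HashMap<>())
                   .merge(word, 1, Integer::sum);
        }
    }

    // History words for prefix, most frequent first, then dictionary words
    // not already listed, up to limit entries in total.
    List<String> predict(String prefix, List<String> dictionaryWords, int limit) {
        Map<String, Integer> counts =
                history.getOrDefault(prefix, Collections.<String, Integer>emptyMap());
        List<Map.Entry<String, Integer>> seen = new ArrayList<>(counts.entrySet());
        seen.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // higher count first
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : seen) {
            if (out.size() == limit) return out;
            out.add(e.getKey());
        }
        for (String w : dictionaryWords) {
            if (out.size() == limit) break;
            if (!out.contains(w)) out.add(w); // avoid repeating a history word
        }
        return out;
    }
}
```

Recording under every prefix trades memory for a constant-time lookup per keystroke. A second trie that stores counts at word nodes would use less space at the cost of a traversal on each lookup.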

To persist history, serialize this map to user_history.txt (e.g., using Java serialization or a simple CSV format) and load it on startup.
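
As one illustration of the simple text format, the sketch below stores one "word, tab, count" line per entry and rebuilds the counts on load. HistoryFile is a hypothetical helper, and the tab-separated layout is an assumption (a tab cannot occur inside a dictionary word, whereas an apostrophe can). The prefix map can then be rebuilt from these word counts at startup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for a tab-separated history file.
class HistoryFile {
    // Writes one "word<TAB>count" line per entry.
    static void save(Map<String, Integer> counts, Writer out) {
        try {
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                out.write(e.getKey() + "\t" + e.getValue() + "\n");
            }
            out.flush();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Reads the format written by save; malformed lines are skipped.
    static Map<String, Integer> load(Reader source) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                int tab = line.lastIndexOf('\t');
                if (tab <= 0) continue; // no separator or empty word
                try {
                    int count = Integer.parseInt(line.substring(tab + 1));
                    counts.merge(line.substring(0, tab), count, Integer::sum);
                } catch (NumberFormatException ignored) {
                    // skip lines whose count is not an integer
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return counts;
    }
}
```

For the real file, Files.newBufferedWriter and Files.newBufferedReader on the path of user_history.txt supply a suitable Writer and Reader. If the file does not exist on the first run, start with an empty map.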

Step 3: Interactive Loop and Prediction Logic

The main program loop reads characters one by one. For each character, it builds the current prefix, then:

  1. Checks history for previously entered completions for that prefix.
  2. If not enough, queries the dictionary trie for additional suggestions.
  3. Prints up to 5 predictions with a number (1-5).
  4. If user enters a number, selects that prediction and updates history.
  5. If user enters '$', completes the word as typed (even if not in dictionary) and adds to history.
  6. If user enters '!', exits and prints average prediction time.
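
The dispatch in the steps above can be sketched as a small state holder. InputSketch and its fields are hypothetical, and the sketch assumes the caller refreshes predictions after every character. It also assumes that a digit with no matching prediction is treated as an ordinary character, which the assignment may or may not intend.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of per-character input handling.
class InputSketch {
    final StringBuilder prefix = new StringBuilder();
    final List<String> completed = new ArrayList<>(); // words to add to history
    List<String> predictions = new ArrayList<>();     // suggestions currently shown

    // Handles one typed character; returns false when the user asks to exit.
    boolean handle(char c) {
        if (c == '!') return false;
        if (c == '$') {
            if (prefix.length() > 0) finish(prefix.toString()); // accept the word as typed
        } else if (c >= '1' && c <= '5' && c - '1' < predictions.size()) {
            finish(predictions.get(c - '1'));                   // accept prediction 1-5
        } else {
            prefix.append(c);                                   // keep building the prefix
        }
        return true;
    }

    // Records the finished word and resets the state for the next word.
    private void finish(String word) {
        completed.add(word);
        prefix.setLength(0);
        predictions = new ArrayList<>();
    }
}
```

Keeping this logic separate from console I/O makes it straightforward to test without typing input by hand.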

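Call System.nanoTime() immediately before and after each prediction lookup. The difference is the elapsed time in nanoseconds, so divide it by 1e9 (1,000,000,000) to report seconds. Keep a running total and a count of lookups so you can print the average on exit.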
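
A small timing helper along these lines is sketched here. PredictionTimer is a hypothetical name; it wraps any lookup in a Supplier, so it does not depend on a particular trie class.

```java
import java.util.function.Supplier;

// Hypothetical helper that times lookups and tracks their average.
class PredictionTimer {
    private long totalNanos;
    private long lastNanos;
    private int samples;

    // Runs lookup, records how long it took, and returns its result.
    <T> T time(Supplier<T> lookup) {
        long start = System.nanoTime();
        T result = lookup.get();
        lastNanos = System.nanoTime() - start;
        totalNanos += lastNanos;
        samples++;
        return result;
    }

    // Duration of the most recent lookup, in seconds.
    double lastSeconds() {
        return lastNanos / 1e9;
    }

    // Mean duration over all lookups so far, or 0 if none were timed.
    double averageSeconds() {
        return samples == 0 ? 0.0 : totalNanos / 1e9 / samples;
    }
}
```

A typical use would be timer.time(() -> trie.getWordsWithPrefix(prefix)), followed by System.out.printf("(%.6f s) ", timer.lastSeconds()) to match the format in the example run.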

Example Run (Inspired by the Assignment)

Enter your first character: t
(0.000251 s) Predictions: (1) t (2) ta (3) tab (4) tab's (5) tabbed
Enter the next character: h
(0.000159 s) Predictions: (1) thalami (2) thalamus (3) thalamus's (4) thalidomide (5) thalidomide's
...

Once the user selects a word, that word should appear first in predictions for the same prefix from then on, including in later runs, because the history is persisted.

Performance Considerations and Trend Connections

Autocomplete has to feel instantaneous: mobile keyboards and coding assistants such as GitHub Copilot are expected to respond within milliseconds. Measuring with System.nanoTime() shows where your own implementation stands. In the example run, each lookup took roughly 0.00016 to 0.00025 seconds (about a fifth of a millisecond), well within that budget. Your timings will vary with hardware and dictionary size.

Think about scaling: if you had millions of users, you'd need a distributed system. But for a school project, a local trie and history map are sufficient.

Common Pitfalls and Tips

  • Case sensitivity: The assignment specifies case-sensitive matching. Don't convert to lowercase.
  • History persistence: Save history on exit (when '!' is pressed). Load it at startup.
  • Empty predictions: If no suggestions, display a message and let the user continue typing until '$'.
  • Number input: When user types '1'-'5', complete that word and add it to history.

Conclusion

You now have a complete design for a functional autocompletion engine built on a DLB trie and a custom history symbol table. The project reinforces key concepts in search algorithms and data structures, along with the basics of an interactive console interface. These skills carry over directly to modern software development, from mobile keyboards to AI-driven code completion.

Now go ahead and implement it in Java. Remember to name your main file ac_test.java and include an approach.txt explaining your design choices. Happy coding!