What can you do with it?

Find the most similar strings from a list by comparing them to a search term. This is useful for fuzzy matching, finding near-duplicates, or ranking items by syntactic similarity. Like if someone typed in “I want to see Doctor Shallsy” and the Dr’s name is actually “Shellzy”, you could use this skill to find the matching doctor from a list of options.

How does it work?

The similarity search uses Levenshtein distance to measure how similar two strings are. This algorithm calculates the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into another.

Distance Scoring:

  • Distance 0: Identical strings
  • Distance 1: One character different (“cat” vs “bat”)
  • Distance 2: Two edits needed (“cat” vs “dog”)
  • Lower scores = more similar

Smart Extraction: When enabled, smart extraction focuses on meaningful parts of text by:

  1. Splitting text using the specified character (default: /)
  2. Excluding common patterns like version numbers, dates, file extensions
  3. Comparing only the relevant extracted parts

Example: Instead of comparing entire file paths, it extracts just the meaningful filename parts:

  • "documents/reports/quarterly_sales_report_2024_v2.pdf""quarterly sales report"
  • "documents/archive/quarterly_sales_report_2023_final.pdf""quarterly sales report"
  • Result: Distance 0 (identical after extraction)

How to use it?

Basic Command Structure

/similarity-search
search term: the string you want to find matches for
items to search: array of strings to compare against

Parameters

Required:

  • search term - The string you want to find matches for

  • items to search - Array of strings to search through

Optional:

  • use smart extraction - Extract and compare only the most relevant parts of paths (default: false)

  • exclude patterns - Patterns to exclude during smart extraction (default: none)

  • split value - Character used to split paths when using smart extraction (default: /)

  • max results - Maximum number of results to return (default: 5)

Response Format

The command returns:

A ranked list of matches with their similarity distances (lower distance = more similar)

Examples

Basic Usage

/similarity-search
search term: lpsum string to compare
items to search: ["first lpsum string", "another possible match", "completely different text", "lpsum sample"]
max results: 3

Find similar product names from a list

Advanced Usage

/similarity-search
search term: quarterly sales report
items to search: ["documents/reports/quarterly_sales_report_2024_v2.pdf", "documents/reports/monthly_sales_summary_2024_final.pdf", "documents/reports/quarterly_financial_report_2024_draft.pdf", "documents/archive/quarterly_sales_report_2023_final.pdf"]
use smart extraction: true
exclude patterns: ["2024", "2023", "v2", "final", "draft"]
split value: /

Find similar document names by extracting meaningful parts from file paths, ignoring version numbers and dates

Specific Use Case

/similarity-search
search term: John Smith
items to search: ["Jon Smith", "John Smyth", "Jane Smith", "John Schmidt", "Johnny Smith"]
max results: 5

Find similar customer names in a database for deduplication