Adding shortcuts to your search engine


I really like DuckDuckGo’s bangs, which redirect your query elsewhere if you prefix it with !something. You could search for the London Zoo on Google Maps by querying for !gmaps London zoo, or open the wiki article on Mass Effect with !wiki mass effect. There’s a whole bunch of them, and they keep adding more.
I like them because they require zero browser configuration and work identically in every browser.

I wanted to add some of my own, but my custom bangs would be of no relevance to anyone else. They’d point to my custom finance tracker, my company’s CRM, my private network’s logger. I don’t want to share those with DDG, and they wouldn’t want any of them. Instead, I created my own search engine. Sort of.

Design

I want to write a “search engine” where, if the query matches something like !server best_function, it redirects the request to a search for best_function on my “server” repo. If the query doesn’t match any of my bangs, it redirects the request to my usual search engine.
It’s simple (regex and redirects) and stateless. My main requirement is for it to be reasonably fast, so I went with PHP.
To make it fancy, I decided to support search suggestions, which is a bit more complicated than a straight-up search engine. As a handicap, I wanted to keep it all in the same file.

Main logic

A better person would use a proper class for storing the bangs’ configs, but I just used a good ol’ array of dicts (associative arrays, in PHP terms):


$engines = [
  [
    'bangs' => ['server', 'ser'],
    'search_url' => 'https://git.private-repo.com/search?q={{{s}}}',
  ],
];

$default_bang = [
  'search_url' => "https://duckduckgo.com/?q={{{s}}}",
  'suggestion_url' => "https://duckduckgo.com/ac/?q={{{s}}}&type=list",
];

Finding the right dict for a specific bang is rather primitive:


function find_bang($bang) {
  global $engines;

  // Linear scan over every engine; strict comparison avoids loose-equality surprises.
  foreach ($engines as $engine) {
    if (in_array($bang, $engine['bangs'], true)) {
      return $engine;
    }
  }
  return null;
}
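
With only a handful of engines the nested scan is fine. If the list ever grew, one alternative would be to flatten it once into a map keyed by bang. The sketch below assumes a hypothetical helper called index_bangs, which is not part of the original script:

```php
<?php
// Hypothetical helper (not in the original script): flatten the engine
// list into [bang => engine] so each lookup is a single array access.
function index_bangs(array $engines): array {
  $index = [];
  foreach ($engines as $engine) {
    foreach ($engine['bangs'] as $bang) {
      $index[$bang] = $engine;
    }
  }
  return $index;
}

$index = index_bangs([
  [
    'bangs' => ['server', 'ser'],
    'search_url' => 'https://git.private-repo.com/search?q={{{s}}}',
  ],
]);

$engine  = $index['ser'] ?? null;   // the "server" engine, found via its alias
$missing = $index['wiki'] ?? null;  // null: not one of our bangs
```

Both aliases point at the same config, and an unknown bang comes back as null, which is the same contract find_bang has.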

Extracting the bang from the querystring is done with basic regex:


function extract_bang($q) {
  if (preg_match('/^!(?<bang>\S+)(\s+(?<rest>.*))?/', $q, $matches)) {
    $bang = $matches['bang'];
    // 'rest' is missing from $matches when the query is just "!bang".
    $rest = $matches['rest'] ?? '';
    return ['bang'=>$bang, 'rest'=>$rest];
  }

  return ['bang'=>null, 'rest'=>$q];
}
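
To see what the regex does with a few typical inputs, here is a small self-contained check. It repeats the function, with a default for a missing 'rest', so that it runs on its own; the sample queries are made up for illustration:

```php
<?php
// Copy of extract_bang so this snippet runs standalone.
function extract_bang($q) {
  if (preg_match('/^!(?<bang>\S+)(\s+(?<rest>.*))?/', $q, $matches)) {
    // An unmatched trailing group is absent from $matches, hence the default.
    return ['bang' => $matches['bang'], 'rest' => $matches['rest'] ?? ''];
  }
  return ['bang' => null, 'rest' => $q];
}

// Made-up sample queries.
$samples = ['!server best_function', '!ser', 'London zoo', 'what is !wiki'];
foreach ($samples as $sample) {
  echo json_encode(extract_bang($sample)), "\n";
}
// {"bang":"server","rest":"best_function"}
// {"bang":"ser","rest":""}
// {"bang":null,"rest":"London zoo"}
// {"bang":null,"rest":"what is !wiki"}
```

Note that a bang only counts at the very start of the query, because the pattern is anchored with ^.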

Redirection is simple:


function redirect($base_url, $q) {
  $url = str_replace('{{{s}}}', urlencode($q), $base_url);
  header('Location: '.$url);
  exit;
}
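
Because redirect() sends a header and exits, it is awkward to try out in isolation. The sketch below pulls the substitution into a hypothetical build_url helper (a name invented here, not part of the original script) to show what the placeholder expands to:

```php
<?php
// Hypothetical helper: the same substitution redirect() performs,
// minus the Location header and the exit.
function build_url(string $base_url, string $q): string {
  return str_replace('{{{s}}}', urlencode($q), $base_url);
}

echo build_url('https://duckduckgo.com/?q={{{s}}}', 'London zoo & aquarium'), "\n";
// https://duckduckgo.com/?q=London+zoo+%26+aquarium
```

urlencode turns spaces into + and escapes the ampersand, so the query cannot leak extra parameters into the target URL.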

And putting it all together:


function full_monty($q, $field_name) {
  global $default_bang;

  $query_data = extract_bang($q);
  $bang = $query_data['bang'];
  $rest = $query_data['rest'];

  // No bang, or a bang that isn't mine: pass the whole query to the default engine.
  $bang_data = find_bang($bang);
  if ($bang_data === null) {
    $bang_data = $default_bang;
    $rest = $q;
  }

  // Not every engine defines every URL (mine has no 'suggestion_url').
  $url_template = $bang_data[$field_name] ?? null;
  if ($url_template === null) {
    exit();
  }
  redirect($url_template, $rest);
}
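
The fallback behaviour is easier to see without the redirect. Below is a side-effect-free sketch of the same decision; resolve_target is a hypothetical name, and it takes the engine list and default as arguments rather than globals so it can run on its own:

```php
<?php
// Hypothetical pure variant of full_monty(): returns [url_template, query]
// instead of redirecting. Not part of the original script.
function resolve_target(string $q, string $field_name, array $engines, array $default): array {
  if (preg_match('/^!(?<bang>\S+)(\s+(?<rest>.*))?/', $q, $m)) {
    foreach ($engines as $engine) {
      if (in_array($m['bang'], $engine['bangs'], true)) {
        return [$engine[$field_name] ?? null, $m['rest'] ?? ''];
      }
    }
  }
  // No bang, or a bang we do not own: forward the untouched query.
  return [$default[$field_name] ?? null, $q];
}

$engines = [[
  'bangs' => ['server', 'ser'],
  'search_url' => 'https://git.private-repo.com/search?q={{{s}}}',
]];
$default = ['search_url' => 'https://duckduckgo.com/?q={{{s}}}'];

print_r(resolve_target('!ser best_function', 'search_url', $engines, $default));
print_r(resolve_target('!wiki mass effect', 'search_url', $engines, $default));
```

In the second call the query is forwarded with its !wiki prefix intact, so DuckDuckGo still gets to handle its own bangs.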

OpenSearch stuff

To offer the search engine to a browser, you need to serve an HTML page with a <link rel="search"> element in its <head>.

Determining the address of the actual page is a bit annoying:


function full_link(){
  $partial_uri = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
  return (empty($_SERVER['HTTPS']) ? 'http' : 'https') . "://$_SERVER[HTTP_HOST]$partial_uri";
}
$actual_link = full_link();

But I need it to build the link element. I also threw in a tiny form for testing the search.


<html>
  <head>
    <link
      rel="search"
      type="application/opensearchdescription+xml"
      title="Better Search"
      href="<?php echo("$actual_link?xml=xml");?>" />
  </head>
  <body>
    <form>
      <input type="text" name="q"/>
    </form>
  </body>
</html>

The XML is pretty boring:


function xml() {
  global $actual_link;

  header('Content-Type: application/opensearchdescription+xml');
  echo('<?xml version="1.0" encoding="UTF-8"?>'."\n");
  echo('<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">'."\n");
  echo('  <ShortName>Better Search</ShortName>'."\n");
  echo('  <Description>Yet another search engine</Description>'."\n");
  echo('  <InputEncoding>UTF-8</InputEncoding>'."\n");
  echo('  <Url type="text/html" template="'.$actual_link.'?q={searchTerms}"/>'."\n");
  echo('  <Url type="application/x-suggestions+json" template="'.$actual_link.'?s=1&amp;q={searchTerms}"/>'."\n");
  echo('  <Url type="application/opensearchdescription+xml" rel="self" template="'.$actual_link.'?xml=xml" />'."\n");
  echo('</OpenSearchDescription>'."\n");
  exit();
}
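
The application/x-suggestions+json URL in that description is expected to answer with a JSON array: the query first, then a list of completions. The script simply redirects to DuckDuckGo for this, but as an illustration of the format, here is a sketch that completes bang names locally. The suggest_bangs helper and its behaviour are assumptions made for this example, not something the original script does:

```php
<?php
// Hypothetical helper (not in the original script): build an OpenSearch
// suggestions response that completes bang names for queries like "!se".
function suggest_bangs(string $q, array $bangs): string {
  $completions = [];
  if (strlen($q) > 0 && $q[0] === '!') {
    $prefix = substr($q, 1);
    foreach ($bangs as $bang) {
      // Keep the bangs that start with what has been typed so far.
      if (strncmp($bang, $prefix, strlen($prefix)) === 0) {
        $completions[] = '!' . $bang;
      }
    }
  }
  // Response shape: [query, [completion, completion, ...]]
  return json_encode([$q, $completions]);
}

echo suggest_bangs('!se', ['server', 'ser', 'crm']), "\n";
// ["!se",["!server","!ser"]]
```

A real endpoint would send this with a JSON Content-Type header; the sketch only builds the body.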

Tying it all together

And lastly, deciding whether to run the search, serve the XML, or just show the HTML page:


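function main() {
  // PHP has already parsed the query string into $_GET; default missing params.
  $q = $_GET['q'] ?? '';

  if ($q !== '') {
    // ?s=1 marks a suggestion request (see the OpenSearch description).
    $field_name = empty($_GET['s']) ? 'search_url' : 'suggestion_url';
    full_monty($q, $field_name);
  }

  if (!empty($_GET['xml'])) {
    xml();
  }
  // Otherwise fall through to the HTML page below.
}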

main();
?>
<!--html goes here-->

Works nicely and responds quickly enough to be useful. I’m happy.
