Location Search Without Elasticsearch: Combining Tsvector and Postgis

Search Without Elasticsearch: Combining Tsvector and Postgis

Geo-Ranked Full-Text Search with PostgreSQL tsvector and PostGIS

When users search for a location by name, they expect results that are both textually relevant and geographically close. We built a search system that combines PostgreSQL’s full-text search with PostGIS distance ranking — no Elasticsearch required.

Search Table

Rather than querying our main locations table with all of it’s metadata (millions of rows, dozens of columns), we maintain a dedicated location_searches table optimized for this single purpose:

  CREATE TABLE location_searches (
    location_id  integer NOT NULL,
    uuid         uuid NOT NULL,
    name         text,
    address      text,
    city         text,
    coordinates  geography(Point, 4326),
    search_tsvector tsvector NOT NULL
  );

The search_tsvector column stores pre-computed text search tokens. A database trigger or activerecord keeps it in sync whenever a location is inserted or updated — combining normalized name, address, city, postal code, and brand tokens into a single vector:

  combined_tsvector :=
    to_tsvector('simple', name_text || ' ' || address_text || ' ' || location_text)
    || COALESCE(brand_tsvector, ''::tsvector);

Indexing

Two indexes make this fast:

  CREATE INDEX ON location_searches USING gin (search_tsvector);
  CREATE INDEX ON location_searches USING gist (coordinates);

The GIN index accelerates the @@ text match operator. The GiST index accelerates ST_Distance calculations against the geography column.

Query: Text Relevance + Distance

The interesting part is the ranking formula. User input is converted to a prefix tsquery — searching for “Star” becomes Star:*, matching “Starbucks”, “Starlight Diner”, etc.

When a GeoJSON boundary is provided, we compute the centroid and blend text relevance with geographic proximity:

  SELECT uuid, name, address, city,
         ST_X(coordinates::geometry) AS longitude,
         ST_Y(coordinates::geometry) AS latitude,
         (
           ts_rank_cd(search_tsvector, to_tsquery('simple', :tsquery))
           + (1.0 / (1.0 + ST_Distance(
               coordinates,
               ST_SetSRID(ST_GeomFromGeoJSON(:centroid_geojson), 4326)::geography
             )))
         ) AS rank
  FROM location_searches
  WHERE search_tsvector @@ to_tsquery('simple', :tsquery)
  ORDER BY rank DESC
  LIMIT :limit

The distance component uses an inverse decay formula: 1 / (1 + distance_in_meters). A location 0 meters away scores a full 1.0 boost; one 1km away scores ~0.001. This naturally blends with ts_rank_cd scores, which also fall in a small positive range, so neither signal dominates.

Why This Works Well

  • Pre-computed tsvectors mean we never parse text at query time.
  • Trigger-based or activerecord sync means the search table is always current.
  • A single SQL query handles both text matching and geo-ranking, keeping the application layer simple.

The key tradeoff: we maintain a denormalized table. But for search, that’s a decent tradeoff. The GIN index on a focused table with only the columns we need is dramatically faster than searching the main table, especially for short prefix queries that match thousands of lexemes.