Advanced Active Record: Optimizing Performance and Robustness with Locking, UUIDs, Fulltext Search, Database Views, and Geospatial Data

Advanced Active Record: Optimizing Performance and Robustness with Locking, UUIDs, Fulltext Search, Database Views, and Geospatial Data

Active record advanced concepts

Now that you know how to avoid trouble with Active Record, it is time to take the next step. There are some more advanced database concepts that Rails developers should know (or at least be aware of) to, even more, improve the performance of the application and make its features more robust.

This article is an introduction to those concepts, so I will only barely touch the surface, and each topic deserves a separate article with a deeper dive into the code.

I tested all the presented examples against the PostgreSQL database, and not all concepts may be available on other database engines, or their implementations may vary. All my notes are based on years of hands-on experience at iRonin.IT - a top software development company, where we provide custom software development and IT staff augmentation services for a wide array of technologies.

Locking of the records

Imagine that you created a Rails application where managers can cooperate by managing important information about their company. There might be a case where two managers will attempt to edit the same information simultaneously.

If they save the changes nearly simultaneously, one of them won’t see the changes he introduced. If you want to avoid this or similar problems, you can introduce the locking of the records.

There are two types of locking: optimistic (positive) and pessimistic (negative). Let’s take a look at them.

Optimistic locking

In optimistic locking, you assume another person won’t update the same record simultaneously. If this is going to happen, the application should raise an error.

This behaviour is controlled with a column called lock_version. Add this column to your model, and Rails will automatically apply optimistic locking. This is what it looks like in practice when you have the lock_version column in your table:

company1 = Company.find(1)
company2 = Company.find(1)

company1.title = "Great company"
company1.save # => true

company2.title = "Awesome company"
company2.save # Attempted to update a stale object: Job. (ActiveRecord::StaleObjectError)

The second update will be rollbacked as someone has updated the record in the meantime. In this example, we process both records in the same console, but in real applications, it will control concurrent processes.

We name it optimistic locking because we allow reading the record, and we assume there will be no conflicts,s so there is no need to lock exclusively and disallow other people from dealing with this record.

Pessimistic locking

In the pessimistic mode, you eliminate the case where some other process will update or even fetch the record when the transaction is not committed. Let’s consider this case:

company = Company.find(1)

company.with_lock do
  sleep(10)
  company.update(title: 'Awesome')
end

The transaction will just hang for 10 seconds. You can open another terminal window and try to update the record:

company = Company.find(1)
company.update(title: 'Great')

You won’t be able to do this. The second transaction will hang until the locked one is finished. After the first transaction is finished, the second will finish, and both users will see the title as "Great".

If you would avoid fetching the record if the locked record is updated, you would have to replace Company.find(1) with Company.lock.find(1). There are a few ways you can lock records with Rails, but since this article is just an introduction, I won’t go into details about it.

Integer vs. UUID as the primary key

In a typical Rails application, integers are used as the primary keys for your tables. However, sometimes you may need to use UUID instead, especially if you plan to expose the primary key to the users of the application or external systems.

UUID stands for unique universal identifier, and it’s a unique string in a similar format to 28059680-d3d5-11ed-afa1-0242ac120002. Some use cases for UUIDs include:

  • Distributed databases - when data is stored on multiple servers, you can avoid primary key conflicts by using UUID instead.

  • Large-scale systems - in large systems, the generation of primary keys may be problematic, and when using UUID, you can make it more effortless.

  • Security and private - when you don’t want someone to guess the primary id of other records.

Using UUID brings advantages as well as disadvantages, but in this article, I want just to show you that it exists and how you can use it within the Rails application.

Configuration

There are two steps. One is to enable the pg crypto extension, and the second is to let Rails know that we would like to use uuid instead of id when generating migrations for new tables. In PostgreSQL 13 and later, you no longer need pgcrypto extension so you can skip the migration step.

Here is the migration:

class EnablePgCrypto < ActiveRecord::Migration[7.0]
 def change
   enable_extension 'pgcrypto'
 end
end

Now, create config/initializers/generators.rb file with the following contents:

Rails.application.config.generators do |g|
  g.orm :active_record, primary_key_type: :uuid
end

Usage

Let’s check if the configuration is working as expected:

rails g model Article title:string
rails db:migrate

Create a record and investigate the value of the id attribute:

article = Article.create!(title: 'Active Record good practices')
article.id 
# => "f72ca708-c9db-4f8f-b9e1-2706e2b02467"

Fulltext search

In PostgreSQL, the full-text search feature allows us to efficiently search with a given term through one or more related columns in the database. What’s more, there is an excellent gem for that!

Like in the case of other topics, this topic is also very broad, and for sure, there is enough information to write a series of articles dedicated only to the case of full-text search. This time I will just scratch the surface with the pg_search gem to arouse your curiosity.

Is like expression not enough?

No. If you want to perform a search using multiple words on multiple columns and eliminate spelling mistakes, the like expression won’t be enough. A simple example: an application with job offers and a search bar. User types “Rails” and you would like to find this keyword in the category, title, and description of the job offer.

Configuration

There are two steps: add a gem to the application and configuration in the model class for which you would like to use full-text search. The gem part:

bundle add pg_search

We can also generate some model and put data into it:

rails g model Job title:string description:text
rails db:migrate

Now, we can put the configuration inside the model:

class Job < ApplicationRecord
 include PgSearch::Model
 pg_search_scope :search_job,
                 against: { title: 'A', description: 'B' },
                 using: { tsearch: { dictionary: 'english' } }
end

With the above configuration, we will search using title and description columns (but the title has priority) and normalize the contents of those columns using the English dictionary. It may sound a little bit mysterious, but if you want to know more right now, there is an excellent article from which I took the configuration example.

Usage

The first argument of the pg_search_scope method is the class method we can use to perform the search:

Job.search_job("Rails developer")

The above call will generate some complex query, and if you have matching records in the database, it will return an ordinary Active Record collection. We can improve the performance of the query, but it’s beyond the article’s scope.

Database views

Simply put, a database view results from a SQL query stored inside the database management system. There are three types of such views: read-only, updatable, and materialized. Since we are touching on a broader topic, I will go with an example of the materialized view provided by the well-known gem called scenic.

Database views are used to get better performance, isolate a level of abstraction, and don’t repeat the same more or less complex SQL query definitions in the application’s code.

The idea behind the database view

Imagine that in your application, there is a table called activities. It contains millions of records related to the actions performed in the system. Often, you want to generate detailed reports for the specific actions performed in the specified period: week, month, or even year.

To generate the report data, you perform long and complex SQL query containing many conditions and joins. Wouldn’t it be easier if you could just use a model called FinancialDepartmentReport and pass simple where with the time frames? It’s possible with the database views. Let’s configure the scenic gem in our Rails application, and I will demonstrate the materialized view in a much simpler and quicker case.

Configuration

Start with adding the scenic gem to your application:

bundle add scenic

For the demonstration purpose, I will also generate the Person model with the column for storing the person’s name, location, and age:

rails g model Person name:string location:string age:integer
rails db:migrate

And generate some test data that we can operate on:

Person.create!(name: "John Doe", age: 30, location: "United States")
Person.create!(name: "Tina Doe", age: 30, location: "Germany")
Person.create!(name: "Tim Doe", age: 15, location: "Germany")
Person.create!(name: "Alex Doe", age: 14, location: "United States")

Our goal is to create a database view that will contain only adult people from the United States. Instead of doing Person.where(“location = ? AND age > 17”, “United States”) I would like to be able to call UnitedStatesAdult.all.

View creation

The scenic gem provides commands similar to the ones we use to create models:

rails generate scenic:model united_states_adult --materialized

The above command will generate the following files:

  • View file - SQL file where you need to put the SQL query that is going to be used to pull the information into view

  • Migration file - it will create the view based on the SQL query you provided in the view file

  • Model - a class that you can use as an ordinary model to query the data from the view

In our case, the SQL query for the view is simple:

SELECT "people".* FROM "people" WHERE (location = 'United States' AND age > 17)

After you save the file, you can run migrations to create the view:

rails db:migrate

Open the rails console and test your new view:

UnitedStatesAdult.pluck(:name)
# => ["John Doe"]

View update

The view is not updated automatically when you will change the information in the people table. In the UnitedStatesAdult, there is a method called referesh - you need to call it to regenerate the view so it contains information that is up to date.

There is a lot more

The code that I presented above is just a simple example of one of the database view types - materialized. There is a lot more to explore in terms of the database views, but it’s a topic for the next article.

Geospatial data

If your application is dealing with information regarding locations, then we can say that you are dealing with geospatial data. If we want to perform more advanced calculations on a database level, we need to familiarize ourselves with GIS.

GIS is a shortcut for Geographic Information System. The good news is that in Rails, we have a gem called activerecord-postgis-adapter which provides support for GIS in the PostgreSQL database. Let’s quickly configure it and demonstrate how powerful it is.

Configuration

First, you must install the PostGIS extension in your system - https://postgis.net/install/. I decided to set up everything with Docker and pull the dedicated PostgreSQL image to save some time. I followed a few steps to make everything work as expected:

  1. I created Dockerfile for a newly created Rails project

  2. I created a docker-compose.yml file to set up the database and the server

  3. I updated Gemfile with the activerecord-postgis-adapter gem

  4. I updated the config/database.yml file to reflect that we would like to use a custom adapter

Here is my Dockerfile:

FROM ruby:3.2.1-alpine
COPY Gemfile* /app/
WORKDIR /app
RUN apk update && apk add build-base git libpq-dev nodejs postgresql-client postgresql
RUN gem install bundler -v 2.3.17
RUN bundle install
COPY . .
EXPOSE 3000
CMD rm -f /tmp/server.pid && rails server -b 0.0.0.0 -P /tmp/server.pid

And corresponding docker-compose.yml:

version: '3.7'
services:
 db:
   image: mdillon/postgis:latest
   volumes:
     - 'postgres:/var/lib/postgresql/data'
   environment:
     POSTGRES_HOST_AUTH_METHOD: trust
 web:
   build:
     context: .
     dockerfile: Dockerfile
   tty: true
   ports:
     - "3000:3000"
   environment:
     DATABASE_HOST: db
     PGUSER: postgres
   depends_on:
     - db
   volumes:
     - '.:/app'

volumes:
 postgres:

The config/database.yml file:

default: &default
 adapter: postgis
 encoding: unicode
 pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
 host: <%= ENV.fetch("DATABASE_HOST") { 'localhost' } %>

development:
 <<: *default
 database: locator_development

Notice the adapter value; without this changed adapter, the application won’t work the way we expect. Now, you can run docker-compose up –build to install and run all components of the application.

Data preparation

We need to fill our database with information about locations so I can demonstrate how we can deal with the geospatial data. I decided to create Airport and City model. Each model contains name field and lonlat field for storing coordinates.

The migrations are the following:

enable_extension 'postgis'

create_table :cities do |t|
 t.string :name
 t.st_point :lonlat, geographic: true

 t.timestamps
end

create_table :airports do |t|
 t.string :name
 t.st_point :lonlat, geographic: true

 t.timestamps
end

Notice the st_point column type. It is one of the custom column types added by the gem we added to handle the PostGIS extension in our database. It allows us to store, query, and manage coordinates.

The next step is to insert some data that we can query later (I used airports and cities in Poland):

Airport.create!(name: 'Lech Wałęsa Airport', lonlat: "POINT (18.459664828 54.373165174)")
Airport.create!(name: 'Katowice Airport', lonlat: "POINT (19.074666368 50.471164782)")
City.create!(name: 'Warsaw', lonlat: "POINT (21.017532 52.237049)")
City.create!(name: 'Olsztyn', lonlat: "POINT (20.490189 53.770226)")
City.create!(name: 'Gdańsk', lonlat: "POINT (18.638306 54.372158)")

Querying the data

There is a lot to explore, but I will just barely touch the surface. Let’s say that we want to check which airport is the closest to Warsaw city:

city = City.find_by(name: 'Warsaw')
airport_table = Airport.arel_table
closest_airport = Airport.order(airport_table[:lonlat].st_distance(city.lonlat)).first
puts closest_airport.name
# => "Katowice Airport"

Thanks to jrochkind for pointing out that pgcrypto extension is not needed for UUID support in Postgres 13 and later.

Didn't get enough of Ruby?

Check out our free books about Ruby to level up your skills and become a better software developer.