Short, long and pretty URLs

URLs (Uniform Resource Locators) are fundamental to the Internet. They are the addresses used to find web pages or, more precisely, "resources", since they don't have to point to web pages: they can be anything from images and files to raw data. On most modern websites, URLs have come a long way since the days when they were illegible and exposed the technology behind the site, as in "/pages/show.php?page=15&tag=ruby".

Modern URLs are usually short, readable, and descriptive, such as "/pages/tagged/with/ruby". There has also been a significant increase in the use of short URLs from services such as Bit.ly. Another type of URL that is often overlooked is the long URL. Long URLs are used so that an address cannot easily be remembered or guessed. For example, all the pages of a site may be public, but you want people to reach only the URLs that were specifically sent to them.

To demonstrate these three URL schemes, I'm going to use a very simple Sinatra and DataMapper application that creates notes.

You can see a demo of this app using pretty, short and long URLs. The source code is available on GitHub.

Note Model

Each note has a title and some content, as you can see from the code for the model:

class Note
  include DataMapper::Resource
  property :id, Serial
  property :title, String, :required => true
  property :content, Text
end

Pretty URLs

A "pretty" URL is one that is human-readable and descriptive. There are some small SEO benefits to using them, but the main reason is that they give your URLs a much more professional look and make them more memorable.

Creating a pretty URL for each note is easy: add an additional property called pretty to the model and set its default value based on the title that was entered.

property :pretty, String, default: -> r,p { r.make_pretty }

This uses a proc to call the following make_pretty method on the newly created note object; the result is then saved to the database:

def make_pretty
  title.downcase.gsub(/\W/,'-').squeeze('-').chomp('-')
end

This method takes the string used for the title and chains four string methods together to create a pretty URL. Here's a breakdown of what each method does with the example title "Oops! It really hurts!":

downcase: converts all letters to lowercase – "oops! it really hurts!"
gsub(/\W/, '-'): replaces every character that is not a letter, digit or underscore with a hyphen – "oops--it-really-hurts-"
squeeze('-'): replaces any run of repeated hyphens with a single hyphen – "oops-it-really-hurts-"
chomp('-'): removes a trailing hyphen, which would look untidy – "oops-it-really-hurts"
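
The breakdown above can be checked outside the app with a stand-alone sketch. Here make_pretty takes the title as an argument instead of reading the note's property, which is an assumption made only so the snippet runs without DataMapper:

```ruby
# Stand-alone version of make_pretty (hypothetical signature: the article's
# method reads the title property of the note instead of taking an argument).
def make_pretty(title)
  lowered  = title.downcase            # lowercase every letter
  replaced = lowered.gsub(/\W/, '-')   # non-word characters become hyphens
  squeezed = replaced.squeeze('-')     # collapse runs of hyphens
  squeezed.chomp('-')                  # drop a single trailing hyphen
end

puts make_pretty("Oops! It really hurts!")  # prints oops-it-really-hurts
puts make_pretty("Ruby & Sinatra: URLs")    # prints ruby-sinatra-urls
```

Note that chomp only trims the end of the string, so a title that starts with punctuation would still produce a leading hyphen.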

Because this is stored as a property of the Note class, the database can be queried for a note by this property using the first method:

get '/pretty/:url' do
  @note = Note.first(:pretty => params[:url])
  slim :show
end

Long URLs

A long URL for each note can easily be created by hashing some values that are unique to the note. This needs an additional property called long, whose default uses a proc to call the make_long method and store the resulting string in the database:

property :long, String, default: -> r,p { r.make_long }

This uses Ruby's Digest library (loaded with require 'digest/sha1') to hash the string created by combining the time the note was created with its title and ID:

def make_long
  Digest::SHA1.hexdigest(Time.now.to_s + self.title + self.id.to_s)
end

The ID is used to help ensure that the string is unique, and the timestamp makes it harder to guess. I decided to use SHA1, which produces a 40-character hexadecimal string, but there are other algorithms such as MD5, SHA2 and bcrypt.

Notes can be found by their long URLs in much the same way as by their pretty URLs:
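
To get a feel for how the choice of algorithm affects the length of the URL, here is a small sketch using Ruby's standard Digest library. The hex_digests helper and the sample title and ID are made up for illustration and are not part of the app:

```ruby
require 'digest'

# Hypothetical helper: hash one seed string with three different algorithms.
def hex_digests(seed)
  {
    md5:    Digest::MD5.hexdigest(seed),
    sha1:   Digest::SHA1.hexdigest(seed),
    sha256: Digest::SHA256.hexdigest(seed)
  }
end

# Stand-ins for a note's creation time, title and ID.
seed = Time.now.to_s + "Oops! It really hurts!" + 3.to_s

hex_digests(seed).each do |name, digest|
  puts "#{name}: #{digest.length} characters"
end
```

MD5 gives 32 hexadecimal characters, SHA1 gives 40 and SHA256 (a member of the SHA2 family) gives 64, so a stronger hash means a longer URL.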

get '/long/:url' do
  @note = Note.first(:long => params[:url])
  slim :show
end

Short URLs

The easiest way to create a short URL for each note is simply to use the note's id property as the URL (for example, "/3" would be the URL for the note with ID 3). Unfortunately, this approach has at least two drawbacks. First, as the number of notes increases, so does the length of the URL: once you reach a million notes, the URLs are 7 or more digits long. Second, if you just use an auto-incrementing ID as the URL, a user may be tempted to change the value in the hope of finding another note that isn't meant for them (assuming there is no password protection).

The first problem can be solved by changing the base. By converting the ID to base 36, you significantly reduce the number of digits required. Base 36 uses the digits 0-9 and the lowercase letters a-z to represent numbers. For example, the number 1,000,000 in base 36 is lfls. Ruby has a neat built-in way of changing the base of a number: pass the base you want as an argument to the to_s method:

1000000.to_s(36) => "lfls"

To go back, use the to_i method:

"lfls".to_i(36) => 1000000

As you can see, this reduces the 7-digit number to a 4-character string.
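
As a quick check of the base conversion, the following sketch wraps Ruby's built-in Integer#to_s and String#to_i in two small helpers. The helper names are hypothetical and only there to make the direction of each conversion obvious:

```ruby
# Hypothetical wrappers around Ruby's built-in base conversion.
def to_base36(number)
  number.to_s(36)
end

def from_base36(string)
  string.to_i(36)
end

puts to_base36(35)         # prints z
puts to_base36(36)         # prints 10
puts to_base36(1_000_000)  # prints lfls
puts to_base36(1_000_001)  # prints lflt
puts from_base36("lfls")   # prints 1000000
```

Consecutive IDs give consecutive base 36 strings, which is why the conversion alone does nothing to stop people guessing neighbouring URLs.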

We still haven't solved the second problem: people trying to guess other URLs. For example, if the URL '/lfls' points to the one millionth note, then I could easily find the next note by typing '/lflt', which is the base 36 representation of 1000001. To mask these short URLs, we first need to create a random single-digit number that is stored in the database as a salt:

property :salt, String, default: -> r,p { (1+rand(8)).to_s }

The following method can then be used to create a random-looking short URL:

def short
  (id.to_s + salt.to_s).reverse.to_i.to_s(36)
end

This takes the ID, converts it to a string, appends the salt, reverses the digits, converts the result back to an integer and then converts that to base 36. As a result, notes with sequential IDs have very different looking short URLs. Take the earlier examples of 1000000 and 1000001 (the exact output depends on the random salt):

(1000000.to_s + (1+rand(8)).to_s).reverse.to_i.to_s(36) => "zq0ap"
(1000001.to_s + (1+rand(8)).to_s).reverse.to_i.to_s(36) => "1c8401"

You're sacrificing a bit of length here, as the salt makes the resulting URL longer, but I feel it's worth it to get short URLs that seem random. You can make the URLs even shorter by using base 62 (which also uses the capital letters A-Z), but you'll have to use an external library such as the base62 gem.
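
If you would rather not add a dependency, base 62 conversion is small enough to write by hand. The following is only a rough sketch, not the API of any gem: the alphabet order (digits, then lowercase, then uppercase) and the helper names are assumptions.

```ruby
# Assumed alphabet order: 0-9, then a-z, then A-Z (62 symbols in total).
BASE62_ALPHABET = [*'0'..'9', *'a'..'z', *'A'..'Z'].join.freeze

# Hypothetical helper: convert a non-negative integer to a base 62 string.
def to_base62(number)
  return BASE62_ALPHABET[0] if number.zero?
  digits = String.new
  while number > 0
    number, remainder = number.divmod(62)
    digits.prepend(BASE62_ALPHABET[remainder])
  end
  digits
end

# Hypothetical helper: convert a base 62 string back to an integer.
def from_base62(string)
  string.each_char.reduce(0) do |total, char|
    total * 62 + BASE62_ALPHABET.index(char)
  end
end

puts to_base62(1_000_000)  # prints 4c92
puts from_base62("4c92")   # prints 1000000
```

Bear in mind that base 62 URLs are case-sensitive, so "/4c92" and "/4C92" would refer to different notes.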

Short URLs don't actually have to be stored in the database, because there is a simple inverse function that can be applied to the short URL string to map it back to the original ID of the note:

url.to_i(36).to_s.reverse.chop

This converts the string back to a base 10 integer and then to a string, reverses it, and chops off the last digit (which is the random salt value). You can then use DataMapper's get method to find the note by its ID:

get '/short/:url' do
  @note = Note.get params[:url].to_i(36).to_s.reverse.chop
  slim :show
end
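
The encoding and its inverse can be tested as a round trip without the database. In this sketch encode_short and decode_short are hypothetical stand-alone helpers that mirror the expressions used in the examples with 1000000 and 1000001, with the salt passed in explicitly so the output is repeatable:

```ruby
# Hypothetical helper: append the salt, reverse the digits, convert to base 36.
def encode_short(id, salt)
  (id.to_s + salt.to_s).reverse.to_i.to_s(36)
end

# Hypothetical helper: undo the steps above and return the ID as an integer.
def decode_short(url)
  url.to_i(36).to_s.reverse.chop.to_i
end

puts encode_short(1_000_000, 6)  # prints zq0ap
puts encode_short(1_000_001, 8)  # prints 1c8401
puts decode_short("zq0ap")       # prints 1000000

# Every ID and salt combination should decode back to the original ID.
round_trip_ok = (1..2000).all? do |id|
  (1..8).all? { |salt| decode_short(encode_short(id, salt)) == id }
end
puts round_trip_ok               # prints true
```

The salt must never be 0: after the reversal it becomes the leading digit, and to_i would silently drop a leading zero, so the ID could no longer be recovered. Generating the salt with 1+rand(8) avoids this.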

I hope you found this helpful. Leave a comment about how you might use some of these techniques, or about other ways of building URLs.