Handling Phone Numbers: Best Practices

Originally posted on the Mojo Lingo blog.

When building real-time and telephony communication applications, you will inevitably need to store phone numbers. Whether the input comes from FreeSWITCH, Asterisk, or an API like Tropo or Twilio, phone numbers can be tricky to parse, verify, store, and display in your application.

Why Are Phone Numbers So Hard?

Phone numbers are hard to verify because their format differs dramatically from country to country. Length, allowed leading digits, reserved blocks, short codes, and more make it difficult to parse a number and confirm that it is valid. When you receive a phone number as input, does it include the country code, an international dialing prefix, a national dialing prefix, an extension, a special code like 411 or 911, or a special carrier command like *69 or 1157?

Just displaying phone numbers from around the world can be tricky because the digits are grouped differently, as in the US: (213) 555-1234 or the UK: (0)20 1234 5678. In addition, some countries have multiple formats! In Spain you can write a number as 123 456 789 or 123 45 67 89.

Even the term "country code" is a misnomer: 20 countries around North America share the same country code, +1 (thanks to the North American Numbering Plan).

How are we supposed to handle, verify, query, and work with this diverse pool of numbers that conforms to very few rules?

The E.164 Standard Format

You've probably run into a similar issue when working with dates and time zones in your career as a developer. Date input can be just as varied as phone number input, and time zones add a unique wrinkle when outputting and comparing dates. This has (arguably) been solved with a standardized format, ISO 8601, which unambiguously organizes dates and times with all the necessary localization information in an easily human-readable and machine-parseable format.

E.164 does that for phone numbers. It defines a simple format for unambiguously storing phone numbers in an easily readable string. The string starts with a + sign, followed by the country code and the "subscriber" number, which is the phone number without any context prefixes such as local dialing codes, international dialing codes, or formatting. The whole number, country code included, is at most 15 digits long.
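As a rough illustration of that shape, here is a minimal Python sketch that checks whether a string looks like E.164. The names E164_SHAPE and looks_like_e164 are made up for this example, and the check is purely about shape: it cannot tell you whether the country code exists or whether the number is assigned.

```python
import re

# Shape check only: "+" followed by 2 to 15 ASCII digits, the first one non-zero.
# E.164 caps the full number (country code included) at 15 digits.
E164_SHAPE = re.compile(r"\+[1-9][0-9]{1,14}")


def looks_like_e164(value: str) -> bool:
    """Return True if value has the E.164 shape; this does not prove the number exists."""
    return E164_SHAPE.fullmatch(value) is not None
```

A check like this is handy as a cheap guard at the database or API boundary, with real validation left to a dedicated library.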

Numbers stored as E.164 can easily be parsed, formatted and displayed in the appropriate context... since the context of a phone number can greatly affect its format. So, with a UK number stored as +442012345678 we can easily display it in the appropriate format for the various contexts:

  • +44 20 1234 5678 - UK International format.
  • (0)20 1234 5678 - UK National format.
  • 011 44 20 1234 5678 - Dialing from US to UK.
  • 020 1234 5678 - Dialing locally within the UK.
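To make the idea concrete, the following Python sketch renders the example number for each of those contexts. It is an illustration only: the function name uk_london_formats is invented here, and the 2-4-4 digit grouping is hard-coded for +44 20 numbers rather than derived from any real numbering-plan data, which is exactly the job a proper library takes off your hands.

```python
def uk_london_formats(e164: str) -> dict:
    """Render a stored +44 20 number for several display and dialing contexts.

    Illustration only: the 2-4-4 grouping is hard-coded for the example
    number and does not hold for UK numbers in general.
    """
    digits = e164[1:]
    is_plain_digits = digits.isascii() and digits.isdigit()
    if not e164.startswith("+4420") or len(e164) != 13 or not is_plain_digits:
        raise ValueError("this sketch only handles +44 20 xxxx xxxx numbers")

    # Slice the stored string: "+44" | area code | first block | last block.
    area, first, last = e164[3:5], e164[5:9], e164[9:13]
    return {
        "international": f"+44 {area} {first} {last}",
        "national": f"(0){area} {first} {last}",
        "from_us": f"011 44 {area} {first} {last}",
        "within_uk": f"0{area} {first} {last}",
    }
```

The stored value never changes; only the rendering does, depending on who is looking at it or dialing it.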

E.164 stores the important parts of the phone number that never change in an easily parseable string, which we can then format according to the context in which we are displaying it.

Now that you want to store your numbers as E.164, how do you parse and format them in your application?

Google's libphonenumber

There are lots of libraries out there that parse and format numbers into E.164, but I think that Google's open source libphonenumber is the best. Google's experience with international number support on their Android platform exposes them to a more complete and accurate list of phone numbers around the globe.

With libphonenumber you can parse, verify, and format phone number inputs quite easily, do as-you-type formatting, and even glean extra information about the number, like whether it is a mobile or landline number or what state or province it is from.

libphonenumber in its basic form consists of a set of rules and regular expressions in an XML file for breaking down and parsing a number. Google provides Java, JavaScript, and C++ versions of the library, and people have ported it to other languages like Ruby, PHP, and Python.

In addition, it can provide offline reverse geocoding and map numbers to specific carriers if the data is available.
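As a sketch of what this looks like in practice, here is a small helper built on the Python port, the third-party phonenumbers package (an assumption of this example: it has to be installed separately, for instance with pip install phonenumbers). The helper name normalize_to_e164 is made up; parse, is_possible_number, and format_number are the port's own functions.

```python
def normalize_to_e164(raw: str, default_region: str) -> str:
    """Parse user input and return it as an E.164 string.

    default_region is a two-letter region code such as "US" or "GB"; it is
    only consulted when raw does not start with a "+" and country code.
    """
    # Third-party dependency, imported lazily so this module loads without it.
    import phonenumbers

    # Raises phonenumbers.NumberParseException if raw cannot be parsed at all.
    parsed = phonenumbers.parse(raw, default_region)

    # Cheap length-based sanity check; is_valid_number() is the stricter test.
    if not phonenumbers.is_possible_number(parsed):
        raise ValueError(f"not a possible phone number: {raw!r}")

    return phonenumbers.format_number(parsed, phonenumbers.PhoneNumberFormat.E164)
```

Whether to reject on is_possible_number or on the stricter is_valid_number is a judgment call: the stricter check depends on how current the bundled numbering data is.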

Other Special Cases

E.164 describes a format for internationally routable numbers, that is, numbers that are reachable from many countries. Some special numbers do not meet this criterion, such as nationally specific numbers like 911. Special numbers, especially emergency numbers like 911 or 112, require specific and often regulated handling that depends on your country. If you have to deal with these numbers, ensure you are meeting any required regulations and handle them as special cases. They are not formattable as E.164 numbers.
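One way to keep these cases apart is to classify the input before any E.164 handling happens. The Python sketch below is a hypothetical illustration: the names EMERGENCY_NUMBERS and classify_input are invented, and the set of emergency numbers is deliberately incomplete, since the real list depends on the countries you serve and their regulations.

```python
import re

# Hypothetical and deliberately incomplete; fill in per country and regulation.
EMERGENCY_NUMBERS = frozenset({"911", "112"})

# Same E.164 shape as described earlier: "+" then 2 to 15 digits, no leading zero.
E164_SHAPE = re.compile(r"\+[1-9][0-9]{1,14}")


def classify_input(raw: str) -> str:
    """Return "emergency", "e164", or "other" for a dialed string."""
    # Drop common visual separators before comparing.
    dialed = re.sub(r"[\s\-().]", "", raw)
    if dialed in EMERGENCY_NUMBERS:
        return "emergency"  # route through dedicated, regulation-aware handling
    if E164_SHAPE.fullmatch(dialed):
        return "e164"
    return "other"  # short codes, carrier commands, national formats, garbage
```

Checking for emergency numbers first means they can never be rejected or rewritten by the normal parsing path.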

Extensions are another common piece of data when storing and collecting phone numbers. Think of extensions as extra information to send once you are connected. Extensions are not dialed when connecting to a phone number but are sent as extra instructions to the end system after you've connected to further direct your call... kind of like telephone NAT. They are not part of an E.164 number but are easily stored in a separate field and appended to any format for creating dialing strings or in the view.
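A minimal Python sketch of that separation might look like the following. The PhoneContact class and its method names are hypothetical; the ";ext=" parameter on the tel: URI comes from RFC 3966, and the "ext." display style is just one common convention.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PhoneContact:
    e164: str                        # e.g. "+442012345678", never includes the extension
    extension: Optional[str] = None  # digits to send after the call connects

    def tel_uri(self) -> str:
        """Build a tel: URI, appending the RFC 3966 ";ext=" parameter when present."""
        uri = f"tel:{self.e164}"
        if self.extension:
            uri += f";ext={self.extension}"
        return uri

    def display(self) -> str:
        """Human-readable form; the extension is appended, never merged into the number."""
        if self.extension:
            return f"{self.e164} ext. {self.extension}"
        return self.e164
```

Because the extension lives in its own field, comparing or querying on the e164 value keeps working no matter which extension a contact uses.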

TL;DR

  • Parse and store all your phone numbers as E.164. It is easy to compare and unambiguous to parse.
  • Use a library like libphonenumber to parse and format a phone number for output.
  • Ensure you handle emergency numbers like 911 and 112 as special cases and make sure you are meeting any of your country's regulations.
  • Extensions are not a part of a phone number but something to send after you've connected. They should be input and stored separately.