Phone Formatting

In many of my Identity Management (IdM) projects I face the predicament of "dirty data". The term "dirty data" describes incorrect or misleading data residing within a data source.

Self-service data sources (such as web portals, phone directories, etc.) are the biggest producers of inconsistently entered data. That is understandable when any user is allowed to modify his or her data manually with few guidelines and little data verification.

There are many different types of user-provided data that an Identity Management professional will face. One of the most common data types whose entry is "outsourced" to the end user is the user's phone number(s). In the end, every synchronized data source could consume that data, which leads to processing difficulties when an application expects a more consistent data format.

Here in North America we are lucky to have a uniform phone numbering plan (from a programmer's standpoint), known as the North American Numbering Plan (NANP). NANP makes parsing a phone number relatively easy. This article covers only the North American phone number format and does not attempt to parse formats used by any other phone system. Applying this custom format provider directly to other types of phone numbers could produce unpredictable results. However, you can extend the code to process other types of phone numbers by adding methods that recognize the formats specific to your local phone system. A good example would be the French phone system, which is consistent in its numbering rules and can therefore be handled by a format provider relatively easily.
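To give a feel for why NANP numbers are easy to parse, here is a minimal sketch in Python. It is not the code from the library mentioned below, and it is not a .NET format provider; the function name format_nanp, the extension markers it recognizes, and the "(NXX) NXX-XXXX" output style are all my own assumptions for illustration. It relies only on the NANP rule that a number has ten digits, with the area code and the exchange code each starting with a digit from 2 to 9, optionally preceded by the country code 1.

```python
import re
from typing import Optional

# NANP shape: NXX-NXX-XXXX, where N is 2-9 and X is 0-9.
_NANP = re.compile(r"^([2-9]\d{2})([2-9]\d{2})(\d{4})$")

# Assumed extension markers ("ext", "ext.", "x", "#"); adjust to your data.
_EXT = re.compile(r"(?:ext\.?|x|#)\s*(\d+)\s*$", re.IGNORECASE)


def format_nanp(raw: str) -> Optional[str]:
    """Return the number as '(NXX) NXX-XXXX', or None if it is not NANP."""
    text = raw.strip()

    # Split off a trailing extension so its digits are not counted.
    extension = None
    ext_match = _EXT.search(text)
    if ext_match:
        extension = ext_match.group(1)
        text = text[:ext_match.start()]

    # Throw away every separator the user typed: spaces, dots, dashes, etc.
    digits = re.sub(r"\D", "", text)

    # Drop the optional leading country code 1.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]

    match = _NANP.match(digits)
    if match is None:
        return None  # not a NANP number; leave it for another parser

    formatted = "({}) {}-{}".format(*match.groups())
    if extension:
        formatted += " x" + extension
    return formatted


if __name__ == "__main__":
    for sample in ["1-425.555.0123", "425 555 0123 ext. 45", "+33 1 42 68 53 00"]:
        print(repr(sample), "->", format_nanp(sample))
```

Returning None for anything that does not fit the plan, instead of guessing, keeps non-NANP numbers (such as the French one in the sample) untouched so that a separate method can deal with them.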

In the library that I've posted on Code Project, you can see how to clean up phone numbers that were not stored in a uniform manner.

You can find the Lost and Found Identity – North American Phone Formatter here.

