Free Address Normalization in PHP

Photo by Joel Moysuh on Unsplash


At work, an interesting problem came up. We let our clients import their contacts into our system in numerous ways, such as third-party integrations, the API, and user uploads. As more and more data arrived from all these different sources, we ran into the problem of figuring out which addresses were duplicates.

The Problem

The following four addresses could all be added to our system for the same contact, but our system had no way of knowing that they are exactly the same address:

1331 E Hashnode Ln

1331 East Hashnode Ln

1331 E Hashnode Lane

1331 East Hashnode Lane

All of the above addresses are correct, and snail mail sent to any of them would arrive at the same location.

The Solution

Normalize the address data. After lots of Googling, I stumbled upon a little repo, zerodahero/address-normalization, which currently has a little under 6k installs.
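To make "normalize" a bit more concrete, here is a minimal sketch of the idea in plain PHP. It is not the package's code or API: `canonicalStreetLine()` and its tiny abbreviation table are hypothetical and only cover the example addresses above, while a real library handles far more variants.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, NOT part of zerodahero/address-normalization.
 * Reduces a US street line to one canonical spelling so that common
 * variants compare as equal.
 */
function canonicalStreetLine(string $line): string
{
    // Tiny illustrative lookup table (an assumption for this example).
    $abbreviations = [
        'east' => 'e', 'west' => 'w', 'north' => 'n', 'south' => 's',
        'lane' => 'ln', 'street' => 'st', 'avenue' => 'ave', 'road' => 'rd',
    ];

    // Lowercase, drop punctuation, and split on runs of whitespace.
    $cleaned = preg_replace('/[^a-z0-9\s]/', '', strtolower(trim($line)));
    $words = preg_split('/\s+/', $cleaned, -1, PREG_SPLIT_NO_EMPTY);

    // Swap each long form for its abbreviation; leave other words alone.
    $words = array_map(
        fn (string $word): string => $abbreviations[$word] ?? $word,
        $words
    );

    return implode(' ', $words);
}

$variants = [
    '1331 E Hashnode Ln',
    '1331 East Hashnode Ln',
    '1331 E Hashnode Lane',
    '1331 East Hashnode Lane',
];

// Every variant collapses to the same canonical string.
$canonical = array_unique(array_map('canonicalStreetLine', $variants));
```

Once every variant maps to a single string, comparing or hashing that string is enough to spot duplicates.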

We've run it in production for a few months now and we like the results. The docs on the repo are great, and you should be able to add it to your existing PHP stack easily.

For our use case, we don't change the data that our clients upload or send us. Instead, we use the package's hash feature ($address1->getFullHash()) and store the result in a new column in our address table. If we see during import that the hash already exists, we merge any new data into the existing contact.
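Roughly, the import step looks like the sketch below. Everything in it is an assumption made for illustration: `importAddress()` is a hypothetical function, the `$table` array stands in for our address table keyed by the hash column, and `$standInHash` is a deliberately naive placeholder for the package's hash (it only ignores case and extra whitespace).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical import step. $table stands in for the address table, keyed
 * by the stored hash column. $hashFn stands in for whatever produces the
 * normalized hash (in production, the package's hash feature).
 */
function importAddress(array &$table, array $row, callable $hashFn): void
{
    $hash = $hashFn($row['address']);

    if (!isset($table[$hash])) {
        // First time we see this address: store the row untouched.
        $table[$hash] = $row;
        return;
    }

    // Hash already exists: merge, ignoring empty values on both sides.
    $notEmpty = fn ($value): bool => $value !== null && $value !== '';
    $existing = array_filter($table[$hash], $notEmpty);
    $incoming = array_filter($row, $notEmpty);

    // Array union keeps existing values and only adds missing fields,
    // so the address the client originally sent is never overwritten.
    $table[$hash] = $existing + $incoming;
}

// Naive stand-in hash (assumption): ignores only case and extra whitespace.
$standInHash = fn (string $address): string =>
    sha1(strtolower(preg_replace('/\s+/', ' ', trim($address))));

$table = [];
importAddress($table, ['address' => '1331 E Hashnode Ln', 'email' => 'a@example.com'], $standInHash);
importAddress($table, ['address' => '1331  e hashnode ln', 'phone' => '555-0100'], $standInHash);
// $table now holds a single row with the original address, the email, and the phone.
```

Letting existing values win the merge is just one possible policy; preferring the newest non-empty value would be an equally reasonable choice depending on which source you trust more.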

Thanks for reading! Please let me know if you found this helpful or have questions over on Twitter: twitter.com/guywarner801

Have a better free solution? Let me know in the comments.
