Designing multi-site RADIUS systems

A practical guide

Some organizations and ISPs can use a central RADIUS service for all of their RADIUS needs. This configuration works when the number of users is small or system load is low. However, when a large number of users are spread across a wide geographic region, a multi-site approach may be beneficial. As with all solutions, this approach has benefits and costs.

Allocate more than one primary directory service instance

It is already common practice to set up a primary directory service instance at the main data center for storing usernames and passwords. Additional data centers will usually have a secondary instance of this identity store that is simply a “clone” of the primary. Any changes to user information are made on the primary instance, and then copied to all the secondary ones.

Deploying a single primary directory service that updates secondary nodes is often sufficient for smaller ISPs that have only one or two data centers.

However, given how catastrophic it can be for an ISP if the primary directory service instance goes offline, we recommend adding another primary instance to provide more resilience and responsiveness in the network, particularly when:

Data centers are distributed over a wide geographic area, increasing the lag time to update the secondary instances.
There are more than three or four data centers.
The underlying telecom network infrastructure has known reliability issues. While this is a more obvious consideration for our clients in developing countries, wealthier countries are also vulnerable to network failures due to extreme weather events or freak power outages.

The diagram above shows how multiple primary instances work in a network design for four geographically dispersed data centers.

There are two sites that maintain a primary instance of the directory service (Site B and Site C).
The two primary instances regularly sync with each other.
All four sites also maintain a secondary instance of the directory service.
Both of the primary instances regularly update each of the secondary instances. This communication is one-way, with the content only coming from the primary nodes.
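The replication flows listed above can be sketched as a set of directed links. This is only an illustrative model, not a directory service configuration: the site names A and D, the `primary@`/`secondary@` labels, and the `replication_edges` helper are assumptions made for the example. The article itself names only Site B and Site C as the primaries.

```python
from itertools import permutations

# Hypothetical site names. Only B and C are named in the article (as primaries).
SITES = ["A", "B", "C", "D"]
PRIMARY_SITES = ["B", "C"]


def replication_edges(sites, primary_sites):
    """Return directed (source, target) replication links between instances."""
    edges = set()
    # The primary instances sync with each other, so links run both ways.
    for src, dst in permutations(primary_sites, 2):
        edges.add((f"primary@{src}", f"primary@{dst}"))
    # Every primary updates every secondary. Nothing flows back to a primary.
    for src in primary_sites:
        for site in sites:
            edges.add((f"primary@{src}", f"secondary@{site}"))
    return edges


edges = replication_edges(SITES, PRIMARY_SITES)
for src, dst in sorted(edges):
    print(f"{src} -> {dst}")
```

In this model no link ever starts at a secondary instance, which mirrors the one-way flow described above.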

With this design, the network gains additional redundancy, which dramatically decreases the risk of service outages. In this scenario, a service outage would occur only if both Site B and Site C were inaccessible at the same time.
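As a rough sketch of that outage condition, the check below treats the directory service as available while at least one primary site is reachable. The function names and the 1% figure are hypothetical, and the probability helper assumes that sites fail independently, which may not hold when sites share a power grid or an upstream carrier.

```python
# Sites holding a primary instance, as named in the article.
PRIMARY_SITES = ["B", "C"]


def updates_available(primary_sites, down_sites):
    """Return True while at least one primary site is still reachable."""
    return any(site not in down_sites for site in primary_sites)


def joint_outage_probability(p_site_down, n_primaries):
    """Chance that every primary is down at once.

    Assumes each site fails independently with the same probability.
    """
    return p_site_down ** n_primaries


# Losing one primary site does not cause an outage in this model.
print(updates_available(PRIMARY_SITES, down_sites={"B"}))
# Losing both primary sites does.
print(updates_available(PRIMARY_SITES, down_sites={"B", "C"}))
# Hypothetical figure: each site unreachable 1% of the time.
print(joint_outage_probability(0.01, n_primaries=2))
```

Under these assumptions, a site that is unreachable 1% of the time leaves both primaries down together roughly 0.01% of the time, which is the intuition behind adding the second primary.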

Bear in mind that not all databases support the deployment of multiple primary instances. We often recommend OpenLDAP as a directory service for ISPs, in part because it is one of the few databases with this capability.

Although this level of redundancy may be considered overzealous, our experience has repeatedly demonstrated the importance of planning for the worst-case scenario, particularly when it comes to critical network infrastructure.

Large ISPs provide critical network access to tens of millions of people. It is essential to design the network infrastructure to be responsive to users, while planning for rare but inevitable outages at any location.

Need more help?

These design principles are based on best practices developed over the last twenty years at NetworkRADIUS. We have seen pretty much every variation of FreeRADIUS deployment out there. If you have a complex network environment and need expertise from the people who wrote FreeRADIUS, contact us for a consultation.

More about ISP RADIUS design best practices:

ISP RADIUS design: a practical guide

Preventing fraudulent logins across multiple sites

Separating RADIUS Authentication and Accounting functionality

Published 2021-02-10 12:00:00 +0000
Categories: articles