The crash and instability of the HealthCare.gov site at its rollout in 2013 were the nightmare scenario that keeps every IT professional up at night. There was no single, smoking-gun issue; the failure came from a combination of many commonplace IT problems. There did, however, prove to be a key solution: Application Performance Monitoring (APM).
As has been widely reported, New Relic was brought in to provide APM support in the HealthCare.gov war room. The software sent data to New Relic’s own servers, which then generated analytics reports on metrics such as page load times, database errors and database response times. The information in the reports was presented as a color-coded dashboard accessible via the web. These reports helped baseline the user experience and pinpoint the key areas of concern.
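To make the idea of a color-coded dashboard concrete, here is a minimal sketch of how raw measurements might be mapped to a status color. It is an illustration only, not New Relic's implementation: the metric names, the threshold values and the `status_color` and `dashboard` helpers are all assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warn: float  # at or above this value the metric turns yellow
    crit: float  # at or above this value the metric turns red


# Hypothetical metrics and limits, chosen purely for illustration.
THRESHOLDS = {
    "page_load_seconds": Threshold(warn=2.0, crit=5.0),
    "db_response_ms": Threshold(warn=100.0, crit=500.0),
    "db_errors_per_min": Threshold(warn=1.0, crit=10.0),
}


def status_color(metric: str, value: float) -> str:
    """Map one measurement to a dashboard color."""
    limits = THRESHOLDS[metric]
    if value >= limits.crit:
        return "red"
    if value >= limits.warn:
        return "yellow"
    return "green"


def dashboard(samples: dict[str, float]) -> dict[str, str]:
    """Color-code every metric in a batch of samples."""
    return {name: status_color(name, value) for name, value in samples.items()}


if __name__ == "__main__":
    latest = {
        "page_load_seconds": 6.2,
        "db_response_ms": 120.0,
        "db_errors_per_min": 0.0,
    }
    for name, color in dashboard(latest).items():
        print(f"{name}: {color}")
```

In practice the thresholds would be derived from a measured baseline rather than hard-coded, which is why establishing that baseline was an early priority.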
When the site was stabilized and users were able to sign up for healthcare coverage by the mandated deadline, the war-room team breathed a sigh of relief, but the work was not over. The lessons learned and technology employed needed to be applied throughout the Department of Health and Human Services (HHS) and the government as a whole.
CMS and HealthCare.gov
Following the launch, the Centers for Medicare and Medicaid Services (CMS) took over the day-to-day running of the HealthCare.gov site along with existing sites that include Medicare.gov, MyMedicare.gov, Medicaid.gov and InsureKidsNow.gov. This gave the office a combined active user base of more than 50 million. The team knew it needed deeper insights into the performance and usage of its web environment, but it had no way of tracking performance in real time. Team members only found out about site issues when they got a call from their contact center.
Beyond monitoring, an effort was under way to move all of the sites onto a single platform that could be more easily managed. This called for performance visibility from the back end of the system through to the consumer’s view, delivered through a single, unified portal.
The New Relic software-analytics platform they deployed provided everything from dashboards and alerts to transaction traces and thread profilers, establishing an environment in which CMS could proactively identify problems in the code base and prioritize fixes. This confidence in visibility allowed CMS to become comfortable with a more agile approach to development; perfection no longer had to be the goal, because the team could monitor and quickly mitigate issues. With this new approach, the CMS team has been able to meet expectations for site functionality and stability despite large growth in its online environment.
Transforming the Future of CMS
In the past four years, the workload for the web and new media group at the CMS Office of Communications has more than doubled in terms of the number of websites, programs and active users being supported. Over the same period, the group has needed to grow only moderately, expanding by roughly 25 percent. The group has also been able to reduce the number of tools it uses, which in turn lowers costs and frees up resources.
In addition, CMS has achieved the following goals:
- Decreased response times to performance issues, including improving mean time to resolution by at least 75 percent.
- Moved from a quarterly to a biweekly release cycle for updates.
- Increased development speed by 80 percent.
- Significantly advanced the end-user experience for citizens.
For more information on how New Relic is helping CMS manage their online properties and functions, download the full case study here.