Nick's Blog

Personal opinions. Aggressively stated.

Working “As Smoothly As It Should:” Technical Challenges At Healthcare.gov

Earlier today the Obama administration held a press conference and update on the tortuous rollout of healthcare.gov. It released a previously embargoed document entitled HealthCare.gov Progress and Performance Report, which is available here.

Before I start, let me state some things about information security to date.

Frustration With Information Security

I have been told a few times that the site does not store data that is required to be protected under the Health Insurance Portability and Accountability Act of 1996 (HIPAA), so please allow me to explain briefly my concern with the lack of information security I have seen so far.

This nation passed a law effectively requiring me to entrust to healthcare.gov the personal details of myself and my family. The details that healthcare.gov requires of me are sufficient, by definition, to authenticate to systems guarding healthcare information. This means that, whether or not healthcare.gov is storing information that HIPAA defines as “protected data,” the fact remains that healthcare.gov is storing authentication data sufficient to obtain HIPAA-protected data from other sites.

By any other industry definition, this would give healthcare.gov a responsibility to protect all the data to a level at or above that of any other HIPAA provider. And it gives me the right as a citizen to demand of my government proof that the government’s website is fundamentally capable of providing basic safeguards of the data it compels me to provide. Which I am doing here.

The Report

Here are my thoughts, starting with the fact that the word “security” appears nowhere in this report.

I believe that, on its face, this report is a marketing document; a political talking-points memo. Speaking purely technically, and without politics, the HealthCare.gov Progress and Performance Report is a woefully inadequate articulation of technical progress. 

What follows are some actual, non-rhetorical questions that should be answered before anyone takes this report seriously.

Some Early Technical Questions

“The team has knocked more than 400 bug fixes and software improvements off the punch list.”

This is a raw number without context; like telling a cancer patient we have “removed 875 cancer cells,” it tells only a part of the story, but implies progress. How many bugs and software problems are on the “punch list” in total? Put another way, what percentage of the “punch list” did these 400 fixes make up? Who made and prioritized/triaged the “punch list”? Is this a functional bug fix list or a list of software vulnerabilities? Have any third parties like White Hat penetration-tested the application yet, and if they did, did the 400 bug fixes include any security patches? What was the average priority/severity of each of the “knocked off” fixes? If you look at the graphic provided by the report, you see that we only have the raw number, and the graph makes it appear that nearly all software bugs have been fixed. This is misleading, to say the least, but it is also non-responsive to the question everyone has been asking: how bad is this?
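To make the "raw number without context" point concrete, here is a minimal sketch of the arithmetic. The report gives no total for the punch list, so both totals below are hypothetical values I made up purely for illustration; only the figure of 400 fixes comes from the report.

```python
def percent_of_punch_list(fixed, total):
    """Return the share of the punch list that has been fixed, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * fixed / total


FIXES_REPORTED = 400  # the only number the report provides

# Hypothetical totals (NOT from the report): the same 400 fixes can mean
# very different things depending on how long the list really is.
for hypothetical_total in (500, 4000):
    share = percent_of_punch_list(FIXES_REPORTED, hypothetical_total)
    print(f"{FIXES_REPORTED} of {hypothetical_total} items: {share:.0f}% fixed")
```

Under these made-up totals the same 400 fixes are either most of the list or a small slice of it, which is why the denominator matters.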

“Per page system timeouts or failures…” is the metric given for “error rates”

Is this front-end, page-load or BACK END error? Is the system still sending inaccurate data to insurers? What percentage of 834-forms had errors in October, compared to the percentage of 834-form errors now? Are 834-form errors considered as part of the “Error rate” improvement metric given in this report? More specifically: does this “error rate” measure errors in front end availability and performance of pages of the application, or errors kicked off by the application itself, which would include back-end processing, authentication, database, and other errors?

“Uptime is consistently surpassing 90%*”

This section of the report deals with “System Stability”, and the asterisk points to a note saying “excluding scheduled maintenance,” as you can see here.

Does this “System Stability” metric of uptime include the unscheduled eleven-hour maintenance period during which the site was down yesterday?

If it does not, then the uptime is significantly below 90%.
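How much an eleven-hour outage moves the number depends on two things the report does not state: the measurement window, and whether the outage is counted or excluded as "maintenance." The sketch below is only my own illustration of that arithmetic; the function name and the 24-hour and 168-hour windows are assumptions, not anything taken from the report.

```python
def uptime_percent(window_hours, downtime_hours, excluded_hours=0.0):
    """Percentage of counted time during which the site was up.

    excluded_hours is downtime removed from the calculation entirely,
    for example an outage labelled as scheduled maintenance.
    """
    counted_window = window_hours - excluded_hours
    counted_downtime = downtime_hours - excluded_hours
    if counted_window <= 0:
        raise ValueError("window must be longer than the excluded time")
    if counted_downtime < 0:
        raise ValueError("cannot exclude more downtime than occurred")
    return 100.0 * (counted_window - counted_downtime) / counted_window


OUTAGE_HOURS = 11  # the outage described above

# Hypothetical windows (assumptions): a single day and a full week.
print(round(uptime_percent(24, OUTAGE_HOURS), 1))                # outage counted, one day
print(round(uptime_percent(24, OUTAGE_HOURS, OUTAGE_HOURS), 1))  # outage excluded, one day
print(round(uptime_percent(168, OUTAGE_HOURS), 1))               # outage counted, one week
```

Measured over the day of the outage, counting it gives roughly 54% uptime and excluding it gives 100%. Over a full week the same outage alone costs about six and a half points. That spread is why the report needs to say what window it used and what it excluded.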

“Dedicated team focused on site monitoring and instant incident response”

There is a lot wrong with this statement; the graphic is not explained and is meant to imply a team of professionals constantly viewing dashboards full of performance and success metrics that are more detailed than those proffered in this report.

Can we please see these screens and all of the data they aggregate and analyse in a resolution that allows viewing?

Perhaps most important in this age of targeted data theft: in the minds of the “dedicated team”, what constitutes an “Incident”?

Conclusion

I will repeat the observation I have been shouting from the rooftops for more than a month: even if today’s news that the site is more available to people is accurate (and it is interesting that a goal of 50,000 concurrent users is what passes for “available” today, at a time when large web applications routinely serve millions of concurrent connections), the technical team has offered nothing to address the fundamental confidentiality and integrity issues that have challenged and continue to plague healthcare.gov.

 
