  • - Some of the government's most important websites

  • are crashing when we need them the most.

  • More than 22 million people have filed

  • for unemployment in the last month,

  • an unprecedented number driven by

  • the global coronavirus outbreak.

  • Now Congress has put aside an extra 250 billion dollars

  • to handle the new applicants,

  • but as people go to the state level systems to file,

  • a lot of those websites are just timing out.

  • - Some say that applying for unemployment benefits

  • is nearly impossible.

  • - The state computer system is having some trouble.

  • - They need to fix the website.

  • - This isn't how the internet usually works.

  • Services like Netflix and Zoom have seen

  • a huge surge in traffic too, but aside from a few hiccups,

  • you'd never know the difference.

  • Most web engineers plan to be able to handle

  • ten times the regular traffic without breaking a sweat.

  • But government systems don't work that way.

  • And it's surprisingly hard to shift them over.

  • A lot of that is because of the backend programming,

  • most of which is written in a coding language called COBOL

  • that dates all the way back to the 50s.

  • But to understand why they're still using COBOL

  • and why it's such a problem, you have to see

  • how these sites were originally built.

  • And most importantly, you have to look at the big picture.

  • The story of COBOL starts in 1959,

  • way before personal computers or the internet.

  • A corporation or university might have a computer network,

  • but you were really only going to run programs

  • within your specific system.

  • So each network developed slightly different rules

  • and it became really hard to transfer programs or data

  • from one network to another.

  • So a group of engineers, including legendary

  • Navy programmer Grace Hopper, started working on

  • a common programming language that could

  • bridge those networks and be the main language

  • for businesses going forward.

  • They called it the Common Business

  • Oriented Language, or COBOL.

  • By the 70s, COBOL was the standard.

  • If you were managing a huge database system,

  • you wrote all your code in COBOL.

  • And that dominance is a big part of why

  • it's still in use today.

  • This is by no means a dead language.

  • It's something that certainly millions,

  • possibly billions of financial transactions

  • rely on COBOL on a daily basis.

  • - If you want to switch off COBOL,

  • you basically have to start from scratch.

  • So a lot of people just stuck with it.

  • It also locks you into a particular kind

  • of server architecture.

  • Running COBOL code meant you were running everything

  • off a handful of servers on your internal network.

  • When it was developed, that was the only option.

  • And even later there were real advantages to it.

  • You could teach your server special tricks

  • for handling your specific kind of data.

  • And deploy programs to the whole network

  • without having to install them on every specific machine.

  • But it was also putting a lot of weight

  • on that one server.

  • If that server goes down, the whole network goes down.

  • And if you try to bring in a replacement,

  • you'll need to teach it all those special tricks.

  • But when the internet happened,

  • you suddenly had to worry about keeping your service running

  • in the face of huge shifts in usage

  • and constant code updates.

  • That meant treating your servers

  • in a completely different way.

  • As engineers started to put it,

  • they're not pets anymore, now they're cattle.

  • When you've got 50 servers running,

  • it doesn't matter if one of them goes down.

  • You just bring in another one

  • and you make sure they're all so dumb

  • and interchangeable that you can cycle them

  • in and out without anyone noticing.

  • You don't train them, you just herd them.

  • And because these are global web services,

  • that also means you can distribute your herd

  • all around the world, scaling up or down

  • depending on how many people are visiting

  • the site that morning.

  • With cloud hosts like Amazon Web Services

  • or Microsoft Azure, you don't even need

  • to buy a whole server.

  • You can just rent one percent of a server

  • for a few hours, just to make it through

  • that morning's spike in demand.

  • Name any online service that's launched

  • in the last 20 years.

  • They basically all work on the cattle model.

  • That means lots of basically disposable servers

  • cycling in and out.

  • But a lot of these state unemployment systems

  • have been running continuously for 40 years,

  • processing thousands of applications every week,

  • all on COBOL.

  • They never switched over to disposable servers.

  • Which makes it hard to process the kind of traffic surge

  • that YouTube or Netflix would take in stride.

  • It's not that COBOL is a bad programming language,

  • but it locks you into a bad way of managing your network.

  • It forces you to treat your servers like pets.

  • And because switching off of COBOL is so much work,

  • a lot of government systems have never been able

  • to make the leap to the cattle model.

  • - It's incredibly difficult to even find workers

  • who know COBOL.

  • The language is old and some of the people

  • still fluent in it are even older,

  • with many approaching retirement age.

  • This has become a recipe for disaster

  • in states that still operate under COBOL.

  • Governors like New Jersey's Phil Murphy

  • have called for programmers to come out of retirement

  • to help maintain their overwhelmed systems.

  • - You can't really move a COBOL program to the AWS cloud.

  • So it just sits there getting older

  • and a little harder to maintain each year.

  • Programmers call this technical debt.

  • And if you aren't spending money on upgrades every year,

  • it piles up fast.

  • - For more than 10 years, the federal government

  • has been pressuring state Medicaid programs

  • to update their aging systems.

  • They've been handing them large sums of money

  • to modernize, but it's still an enormous lift.

  • - Before these folks retired, many of them

  • had been fired, they'd been laid off.

  • And then they'd actually been brought back in

  • in crisis moments to fix and upgrade the COBOL systems,

  • which ideally they should have just been kept on

  • to maintain the entire time.

  • - The real problem is, we just haven't been

  • spending money maintaining these systems.

  • We haven't wanted to or we thought

  • we could skate by without it.

  • And then when millions of people suddenly need

  • unemployment checks, the entire system

  • is buried in technical debt.

  • It's a hard lesson, but if we want the reliability

  • that we expect from web services,

  • we're gonna have to pay for it.

  • Thanks for watching.

  • If you want to know more about COBOL

  • and this whole saga, my colleague Makena Kelly

  • wrote a great article in the description.

  • And let us know in the comments

  • if there's anything else you think we should be covering.
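
The "cattle" model described in the transcript can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how any state system or cloud provider actually works: the names `Server`, `Herd`, `scale_to`, and `replace_unhealthy`, and the per-server capacity figure, are all hypothetical. It only shows the two ideas from the video: servers are interchangeable, so a failed one is simply swapped out, and the size of the herd follows demand.

```python
import math
from dataclasses import dataclass, field
from itertools import count

# Hypothetical ID generator: servers have no identity beyond a number.
_ids = count(1)


@dataclass
class Server:
    """An interchangeable ("cattle") server with no special configuration."""
    id: int = field(default_factory=lambda: next(_ids))
    healthy: bool = True


class Herd:
    """A pool of identical servers sized to current demand (illustrative only)."""

    def __init__(self, capacity_per_server: int):
        # Assumed figure: requests per second one server can handle.
        self.capacity_per_server = capacity_per_server
        self.servers: list[Server] = []

    def scale_to(self, requests_per_second: int) -> None:
        """Add or drop servers so the pool just covers the given load."""
        needed = max(1, math.ceil(requests_per_second / self.capacity_per_server))
        while len(self.servers) < needed:
            self.servers.append(Server())
        # Drop any surplus servers; this is a no-op when none are surplus.
        del self.servers[needed:]

    def replace_unhealthy(self) -> int:
        """Swap every failed server for a fresh one; return how many were replaced."""
        failed = [s for s in self.servers if not s.healthy]
        self.servers = [s for s in self.servers if s.healthy]
        self.servers.extend(Server() for _ in failed)
        return len(failed)
```

With `Herd(capacity_per_server=100)`, calling `scale_to(5000)` grows the pool to 50 servers and `scale_to(300)` shrinks it back to 3. Marking one server unhealthy and calling `replace_unhealthy()` leaves the pool the same size. A "pet" server, by contrast, carries configuration that a fresh replacement would not have.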
