Category Archives: CS / Science / Tech / Coding

National AI Strategy

Some of my co-workers published a sponsored piece in the Atlantic calling for a national AI strategy, which was tied in to some discussions at the Washington Ideas event.

I'm 100% on board with the US having a strategy, but I want to offer one caveat: "comprehensive national strategies" are susceptible to becoming top-down, centralized plans, which I think is dangerous.

I'm generally disinclined to centralized planning, for both efficiency and philosophical reasons. I'm not going to take the time now to explain why; I doubt anything I could scratch out here would shift people very much along any kind of Keynes-Hayek spectrum.

So why am I bothering to bring this up? Mostly because I think it would be especially ill-conceived to adopt central planning when it comes to AI. The recent progress in AI has been largely a result of abandoning top-down techniques in favor of bottom-up ones. We've abandoned hand-coded visual feature detectors for convolutional neural networks. We've abandoned human-engineered grammar models for statistical machine translation. In one discipline after another, emergent behavior has outpaced decades' worth of expert-designed techniques. To layer top-down policy-making on a field built of bottom-up science would be a waste, and an ironic one at that.

PS Having spoken to two of the three authors of this piece, I don't mean to imply that they support centralized planning of the AI industry. This is just something I would be on guard against.

Posted in Business / Economics, CS / Science / Tech / Coding

Will AI steal our jobs?

As an AI researcher, I think I am required to have an opinion about this. Here's what I have to say to the various tribes.

AI-pessimists: please remember that the Luddites have been wrong about technology causing economic cataclysm every time so far. We're talking about several consecutive centuries of wrongness.1 Please revise your confidence estimates downwards.

AI-optimists: please remember that just because the pessimists have always been wrong in the past does not mean that they must always be wrong in the future. It is not a natural law that the optimists must be right. That labor markets have adapted in the long term does not mean that they must adapt, to say nothing of short-term dislocations. Please revise your confidence estimates downwards.

Everyone: many forms of technology are substitutes for labor. Many forms of technology are complements to labor. Often a single form of technology is both simultaneously. It is impossible to determine a priori which effect will dominate.2 This is true of everything from the mouldboard plough to a convolutional neural network. Don't casually assert AI/ML/robots are qualitatively different. (For example, why does Bill Gates think we need a special tax on robots that is distinct from a tax on any other capital equipment?)

As always, please exercise cognitive and epistemic humility.

  1. I am aware of the work of Gregory Clark and others related to Industrial Revolution era wage and consumption stagnation. If a disaster requires complicated statistical models to provide evidence it exists, I say its scale cannot have been that disastrous.
  2. Who correctly predicted that the introduction of ATMs would coincide with an increase in employment of bank tellers? Anyone? Anyone? Bueller?
Posted in Business / Economics, CS / Science / Tech / Coding

Marketing to Algorithms?

Toby Gunton :: Computer says no – why brands might end up marketing to algorithms

I know plenty about algorithms, and enough about marketing.1 And despite that, I'm not sure what this headline actually means. It's eye-catching, to be sure, but what would marketing to an algorithm look like?

When you get down to it, marketing is applied psychology. Algorithms don't have psyches. Whatever "marketing to algorithms" means, I don't think it's going to be recognizable as marketing.

Would you call what spammers do to slip past your filters "marketing"? (That's not rhetorical.) Because that's pretty much what Gunton seems to be describing.

Setting aside the intriguing possibility of falling in love with an artificial intelligence, the film [Spike Jonze's Her] raises a potentially terrifying possibility for the marketing industry.

It suggests a world where an automated guardian manages our lives, taking away the awkward detail; the boring tasks of daily existence, leaving us with the bits we enjoy, or where we make a contribution. In this world our virtual assistants would quite naturally act as barriers between us and some brands and services.

Great swathes of brand relationships could become automated. Your energy bills and contracts, water, gas, car insurance, home insurance, bank, pension, life assurance, supermarket, home maintenance, transport solutions, IT and entertainment packages; all of these relationships could be managed by your beautiful personal OS.

If you're an electric company whose customers all interact with you via software daemons, do you even have a brand identity any more? Aren't we discussing a world in which more things will be commoditized? And isn't that a good thing for most of the categories listed?

What do we really care about: getting goods and services, or expressing ourselves through the brands we identify with? Both, to an extent. But if we can no longer do that through our supermarkets or banking, won't we simply shift that focus to other sectors: clothes, music, etc.?

Arnold Kling :: Another Proto-Libertarian

2. Consider that legislation may be an inferior form of law not just recently, or occasionally, but usually. Instead, consider the ideas of Bruno Leoni, which suggest that common law that emerges from individual cases represents a spontaneous order, while legislation represents an attempt at top-down control that works less well.

I'd draw a parallel to Paul Graham's writing on dealing with spam. Bayesian filtering is the bottom-up solution; blacklists and rule sets are the top-down.

Both of these stories remind me of a couple of scenes in Greg Egan's excellent Permutation City. Egan describes a situation where people have daemons to answer their video phones, daemons that have learned (bottom-up) how to mimic their owners' reactions well enough to separate personal calls from automated messages. In turn, marketers have software that learns how to recognize whether it is talking to a real person or one of these filtering systems. The two have entered an evolutionary race to the point that people's filters are almost full-scale neurocognitive models of their personalities.
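To make the top-down versus bottom-up contrast in spam filtering concrete, here is a minimal Ruby sketch. It is not Paul Graham's actual algorithm, just a toy naive Bayes classifier with add-one smoothing set next to a hand-written blacklist. The names BLACKLIST, blacklist_spam?, and TinyBayes, and all the sample messages, are hypothetical and exist only for illustration.

```ruby
# Top-down: a human decides in advance which words mean "spam".
BLACKLIST = %w[viagra lottery].freeze # hypothetical rule set

def blacklist_spam?(text)
  text.downcase.scan(/[a-z']+/).any? { |w| BLACKLIST.include?(w) }
end

# Bottom-up: word statistics are learned from labeled examples.
class TinyBayes
  def initialize
    @counts = { spam: Hash.new(0), ham: Hash.new(0) } # per-label word counts
    @totals = { spam: 0, ham: 0 }                     # per-label token totals
    @docs   = { spam: 0, ham: 0 }                     # per-label message counts
  end

  def train(label, text)
    @docs[label] += 1
    tokenize(text).each do |w|
      @counts[label][w] += 1
      @totals[label] += 1
    end
  end

  def spam?(text)
    log_score(:spam, text) > log_score(:ham, text)
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end

  # Log prior plus summed log likelihoods, with add-one smoothing so
  # unseen words do not zero out the whole score.
  def log_score(label, text)
    vocab = (@counts[:spam].keys | @counts[:ham].keys).size
    prior = Math.log((@docs[label] + 1.0) / (@docs.values.sum + 2.0))
    tokenize(text).reduce(prior) do |sum, w|
      sum + Math.log((@counts[label][w] + 1.0) / (@totals[label] + vocab))
    end
  end
end

filter = TinyBayes.new
filter.train(:spam, "win free money now")
filter.train(:spam, "free prize claim now")
filter.train(:ham,  "lunch meeting at noon")
filter.train(:ham,  "see you at the meeting")

message = "claim your free prize"
puts "blacklist: #{blacklist_spam?(message)}, learned: #{filter.spam?(message)}"
```

The blacklist only catches what its author anticipated, while the learned filter picks up on words nobody wrote a rule for. That is the whole of the analogy; a real filter would need far more training data than four messages.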

  1. Enough to draw a paycheck from a department of marketing for a few years, at least.
Posted in Business / Economics, CS / Science / Tech / Coding

Latitude-Longitude Distance

I thought I would post some of the bite-sized coding pieces I've done recently. To lead off, here's a Ruby function to find the distance between two points given their latitude and longitude.

Latitude is given in degrees north of the equator (use negatives for the Southern Hemisphere) and longitude is given in degrees east of the Prime Meridian (optionally use negatives for the Western Hemisphere).

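include Math
DEG2RAD = PI/180.0

# Great-circle distance in miles between two points given in degrees.
def lldist(lat1, lon1, lat2, lon2)
  rho = 3960.0                      # radius of the Earth in miles
  theta1 = lon1*DEG2RAD             # azimuthal angle (longitude)
  phi1 = (90.0-lat1)*DEG2RAD        # polar angle (colatitude)
  theta2 = lon2*DEG2RAD
  phi2 = (90.0-lat2)*DEG2RAD
  # cosine of the central angle between the two points
  val = sin(phi1)*sin(phi2)*cos(theta1-theta2)+cos(phi1)*cos(phi2)
  val = [-1.0, val].max             # clamp to [-1, 1] so acos stays in range
  val = [ val, 1.0].min
  psi = acos(val)                   # central angle in radians
  return psi*rho
end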

A couple of notes:

  1. Everything with val at the bottom is to deal with an edge case that can crop up when you try to get the distance between a point and itself. In that case val should be equal to 1.0, but on my systems some floating-point errors creep in and I get 1.0000000000000002, which is out of range for the acos() function.
  2. This returns the distance in miles. If you want some other unit, redefine rho with the appropriate value for the radius of the earth in your desired unit (6371 km, 1137 leagues, 4304730 passus, or what have you).
  3. This assumes the Earth is spherical, which is a decent first approximation, but is still just that: a first approximation.1
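For comparison, here is a sketch of the same calculation using the haversine formula, a standard alternative that is better behaved for very small distances and returns exactly zero for a point and itself. This is not the function above, just a variant under the same spherical-Earth assumption; haversine_dist and EARTH_RADIUS_MILES are names introduced here for illustration.

```ruby
include Math

EARTH_RADIUS_MILES = 3960.0 # same radius as rho in lldist

# Haversine great-circle distance in miles; all inputs are in degrees.
def haversine_dist(lat1, lon1, lat2, lon2)
  to_rad = PI / 180.0
  dlat = (lat2 - lat1) * to_rad
  dlon = (lon2 - lon1) * to_rad
  a = sin(dlat / 2)**2 +
      cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)**2
  # Rounding can push a slightly outside [0, 1] for near-antipodal
  # points, so clamp it before taking the square root and asin.
  a = a.clamp(0.0, 1.0)
  2.0 * EARTH_RADIUS_MILES * asin(sqrt(a))
end

# A point and itself, then a quarter of the way around the equator.
puts haversine_dist(40.0, -75.0, 40.0, -75.0)
puts haversine_dist(0.0, 0.0, 0.0, 90.0)
```

The trade-off is that the haversine form loses some precision for points that are almost exactly antipodal, which is why the clamp is still there.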

I am currently writing a second version to account for the difference between geographic and geocentric latitude which should do a good job of accounting for the Earth's eccentricity. The math is not hard, but finding ground truth to validate my results against is, since the online calculators I've tried to check against do not make their assumptions clear. I did find a promising suite of tools for pilots, and I'd hope if you're doing something as fraught with consequences as flying that you've accounted for these sorts of things.

xkcd #1318 — "Protip: You can win every exchange just by being one level more precise than whoever talked last. Eventually, you'll defeat all conversational opponents and stand alone."

  1. As far as I'm concerned, this is my canonical example of the difference between a first and second approximation. The Earth isn't really an oblate spheroid either, but that makes a very good second approximation, good to about 100 m. (See John Cook here and here.)
Posted in CS / Science / Tech / Coding

Writing software is not a political process

Let's put aside how we personally feel about ObamaCare for a moment. Ignore for the time being any considerations of the politics, economics, efficiency, justice, equity, etc. of the law.1

Let's steer far clear of the Knowledge Problem and the Calculation Debate and other perennial political-economic anlages.

Let's not refer to the NHS, or contemplate citations to the Oregon Medicaid Study or the World Health Report 2000.

Let's certainly not contemplate an alethiological analysis of the utterance "If you like your doctor, you will be able to keep your doctor. Period."

I'm going to leave all that stuff off the account while I explain to you why I'm brimming with epicaricacy at the failure of the exchange launches.

I do feel a bit guilty about my joy. The rational part of me knows this is a major pain for people who had been eagerly awaiting signing up, and further may put the entire system in a death spiral by limiting the enrollment of the "young invincibles." (On the other hand, a system which works from the users' point-of-view but still spits gibberish out the back end would be even worse.)

However, the sub-rational side of me is loving this. Not for any partisan reasons — I'm an "a plague a' both your houses" sort of guy — but rather because it is so satisfying to this geek to see the President,2 his cabinet secretaries, senators, and all the other high and mighty mandarins and viziers of the Beltway brought low before the intransigent reality of Code.

Compilers are even more stubborn than the tide.

All these powerful people are learning (one hopes) the painful lesson that so many powerful people before them have learned when confronting technical problems. It does not matter how many laws you can create with the stroke of your pen, nor how many regiments you can order about, nor how many sheriffs or tax collectors or wardens you direct: you can't give orders to Computers. It is nice to see such mighty people forced to acknowledge, as thousands of hapless executives and others have in the past, that things are not as simple as commanding geeky worker bees to make it so. No number of fiats, from however august an authority, can summon software into being: It must be made.

As my father — a former legislative assistant on the Hill — said, "passing a law requiring the exchanges to be open is like passing a law forbidding people from being sick, and just as effective."

Compilers don't care about oratory or rhetoric. Political capital can't find bugs. Segfaults aren't fixed at whistle-stops or town-halls or photo-ops. No quantity of arm-bending or tongue-wagging or log-rolling or back-scratching can plug memory leaks. You can't hand-shake or baby-kiss your way into working code.

I tend to see two different mistaken attitudes among non-geeks when it comes to how software is actually made. Some people think it's complete magic, which is flattering but utterly wrong. Others see it as "just pressing buttons," which is wrong but utterly arrogant.

Programmers sit at computers, stare at monitors, and type. Which is exactly what J. Random Whitecollar does, so how hard can it be? It is, after all, "just typing" — although in the same way that surgery is just cutting and stitching.3

I have become accustomed, as every CS grad student becomes, to getting emails from founders seeking technical expertise for their start-ups. The majority of these are complete rubbish, written by two troglodytes who imagine that coming up with an idea plus a clever name for a website constitutes the bulk of the work. These emails typically include a line about "just needing someone to create the site/app/program for us." This is a dead giveaway that these people will make terrible partners. Just create it? Just? You might as well tell a writer that you have an idea for a novel, and could he please just write the book for you?

Begala's "Stroke of the pen; law of the land; kinda cool" is fine for politics. But when it comes to software this doesn't fly.

This is the same attitude I see from the White House. Not only did they start off the process with the general suits-vs-geeks attitude, they continued at every turn to place precedence on political desires over engineering realities: failing to set realistic deadlines from the start, leaving all details up to the numerous "the secretary shall determine" clauses in the legislation, delaying the date that states must decide if they would run their own exchanges, delaying finalizing what the rules on the back end would be for insurers, HHS insisting on doing the general contracting itself,4 the head-in-the-sand "brisk management" they engaged in when it became clear the deadline would be slipped, etc., etc. Over and over again the political establishment prioritized their own wants over the engineering needs.

This whole situation is a great example of Arnold Kling's "Suits vs. Geeks" divide.

Suits imagine they have the hard job, because that's the only job they know how to do. Yes, the political wrangling is difficult. But we geeks have sat in those frustrating meetings, attempting to get disparate parties on the same page. We've drafted those memos, and written those reports, and had those conference calls. We have to do all that too. When's the last time the suits tried our job? When's the last time they wrestled with a memory allocation bug in ad hoc dynamic data structures nested four deep? When have they puzzled out a floating point underflow error? When have they deciphered an undocumented API?

The psychologically easiest response, when confronted with something you have no clue how to do, is to assert that it's simple, and you would easily do it if only you had the inclination and time denied you by having to deal with more rigorous matters.

I don't want to fall into the opposite trap here of assuming the other guy's job, i.e. the political, non-engineering one, is easy. But let me ask you some questions. How many people in the executive branch have the jobs they do because they donated to a campaign or ran a solid get-out-the-vote drive in a swing state, or did something else politically advantageous to the current occupant of the Oval Office but otherwise entirely unrelated to the department/bureau/administration they now give orders to? And how many political appointees are where they are because they've mastered their craft over tens of thousands of hours of practice?

Now answer those same questions, but substitute "software engineering firm" for "executive branch." What's the ratio of people who get ahead by who-they-know to those who are promoted for what-they-know there? Silicon Valley isn't exactly known for sinecures and benefices. On the other hand OPM has entire explicit classes of senior-level officials who are where they are for no other reason than POTUS's say-so. And this isn't some kind of sub-rosa, wink-wink-nudge-nudge thing: this is exactly how the administration is supposed to function.

Let's shift gears and take a look at Charette's (soon-to-be-) classic article, "Why Software Fails." I don't expect the politicians and bureaucrats in charge of this thing to have read K&R or SICP backwards and forwards or have a whole menagerie of O'Reilly books on their shelf. But they at least ought to be familiar with this sort of thing before embarking on a complete demolition and remodel of a sixth of the US economy that was critically dependent on a website.

Here's Charette's list of common factors:

  1. Unrealistic or unarticulated project goals
  2. Inaccurate estimates of needed resources
  3. Badly defined system requirements
  4. Poor reporting of the project's status
  5. Unmanaged risks
  6. Poor communication among customers, developers, and users
  7. Use of immature technology
  8. Inability to handle the project's complexity
  9. Sloppy development practices
  10. Poor project management
  11. Stakeholder politics
  12. Commercial pressures

Let's assume the final one doesn't apply (although I'm sure there were still budget constraints, since I remember multiple proposals all summer and autumn to fix this by throwing more money at it). Other than that, I could find a news story to back up the ObamaCare site making every one of these mistakes other than #7. You're looking at 10 out of 12 failure indicators. Even granting a very generous interpretation of events there's no way the Exchanges weren't dealing with at the very least #1, 3, 4, and 6.

I've seen plenty of people on the Right gleefully jeer that this is what happens when you don't have market incentives to guide you. They're right.

I've also seen plenty of people on the Left retort that history is littered with private enterprises that have wasted billions on poorly-executed ambitious IT projects. They're right too.

Of course, they're both wrong as well.

The people on the Right are engaging in a huge amount of survivorship bias. All those companies that screwed up an IT rollout like this aren't around for us to notice anymore. Maybe they aren't bankrupt, but they're not as salient as their successful competitors either. Failures are obscure, successes are obvious.

The people on the Left are misunderstanding how distributed, complex systems like a market work. Yes, individual agents will fail. That's part of the plan, just like it is in evolution. You can't have survival-of-the-fittest without also having the contrapositive.5 We don't have the freedom to develop via the distributed market-driven exploration process. All of our eggs are in this one basket.

I don't think the people making either claim really grok how a market is supposed to operate. It wouldn't help to give the developers an equity stake or pay them lavish bonuses. You might get more effort from them or a better group of programmers, but you've still only got a single attempt at getting this right. And the people pointing out all the wasted private-sector IT spending are also missing the point. Yes, there are failures, but the entire system relies on failures to find the successes. The ACA does the opposite of that by forcing everyone to adopt the same approach and continuing to disallow purchasing across state lines. That's a recipe for catastrophic loss of diversity and dampening of feedback signals.

I've seen people on all sides suggest that what we really needed to do was go to Silicon Valley and hire some hotshot programmers and give them big paychecks, and they could build this for us lickety-split. Instead, we're stuck paying people mediocre GS salaries (or the equivalent via contractors), so we get mediocre programmers who deliver mediocre product. I don't think this reasoning holds up. Another common observation, which I also think is flawed, is almost the opposite: there was never a way to make this work since good coders in the Valley expect to get equity stakes when they create big, ambitious software products, and no such compensation is possible for a federal contract.

At the margin more money will obviously help. Ceteris paribus, you will get more talented people. But that's not the whole story, by a long shot.

1. There are several orders of magnitude between the best programmers and the median programmers. You can't even quantify the difference in quality between the best and the worst, because the worst have negative productivity: they introduce more bugs into the code than they fix. Paying marginally more may get you marginally better coders, but there's a qualitative difference between the marginally-above-average and the All Stars.

2. The way you get the best is not often by offering more money. It helps, but it's only a piece of the whole story. The way you get the best is by giving them interesting problems to work on.

3. The exchange is not an interesting problem. In fact it's quite the opposite. It's almost entirely what Eric S Raymond calls "glue": it pastes a bunch of other systems together, but doesn't do anything very interesting on its own. ESR cautions programmers (quite rightly!) to use as little glue as possible. Glue is where errors (and madness) insinuate themselves into a project.6

This is related to all of the discussion I've heard about how Obama got geeks to volunteer to help him create various tools for his campaigns. If hotshot programmers would do that, the thinking goes, why wouldn't they pitch in to build an awesome exchange?

Simple: because the exchange is boring. It's as bureaucratic as it comes. Working on it would require a massive amount of interfacing with non-technical managers in order to comply with non-trivial, difficult-to-interpret legislative/regulatory rules. Do you know how many lawyers a coder would have to talk to in order to manage a project like this?! Coders are almost as allergic to lawyers as the Nac Mac Feegle are.

All that managerial overhead is no fun at all, especially compared to the warm-and-fuzzies some people feel when they get to participate in the tribal activity of a big election.7

Not only is building the exchanges not fun compared to building a campaign website, but it comes with all sorts of deadlines and responsibilities too. If you think up some little GPS app to point people toward their polling place, but it doesn't work the way you want, or handle a large enough load... no sweat. It was a hobby thing anyway. If it works then you feel good about helping to get your guy elected. If it doesn't then you just move on to the next hobby that strikes your fancy.

Was there a way for the Obama administration to harness some of that energy from the tech community? Yeah. Could they have used open source development to make some of the load lighter? Yeah. But it's no cure-all. At the end of the day there was a lot of fiddly, boring, thankless, unsexy government work to be done.

Let's take a slight detour and discuss Twitter. A professor I know claims that several bad months of performance is no big deal. After all, Twitter used to be plagued by the Fail Whale but it's a very successful enterprise now.


First of all, they're the exception. People remember the Fail Whale specifically because Twitter is the opposite of a failure now.

Secondly, tweeting is entertaining. Buying insurance isn't. People will put up with more hurdles being put between them and free fun than between them and expensive drudgery.

Thirdly, Twitter never had to worry about its delicate actuarial calculus being thrown off by a non-random sample of users pushing their way through a clogged system.

Fourthly, if Twitter screwed something up all its users were free to walk away, either until things were fixed or forever. We don't have that option here.

The administration's responses in the last few weeks to the ongoing troubles have been characterized as "legislation by press release." Let's put aside the constitutional/philosophical issue of whether the President is merely tweaking the way a law is executed, as is his wont, or is re-writing the law of the land by presidential motu proprio.

I want to point out that this is another area where comparison to Amazon, Netflix, etc. falls short. If Twitter finds out that some part of their design is unimplementable they have complete prerogative to change the design of their service in any way. They can re-write their ToS or feature list or pricing structure however they want, whenever they want. The State utterly lacks such range of motion and nimbleness. There is thus even less point in people on either the Red Team or Blue Team saying "well the private sector builds massive IT projects all the time." They aren't playing the same game.

Jay Carney et al. have been insisting all along that everything is working (or will be working, or should have been working, or whatever the line is today), and the only problem is it's a bit slow, as if this is a trivial matter. I don't think people realize how relentlessly commerce websites are engineered to remove all the slowness. And I mean all the slowness. Every millisecond of delay costs you sales. Every slowdown lowers your conversion rate. Tens of milliseconds are a big deal.8 Having delays measured in minutes is unspeakably bad. Delays in the hours are no longer "delays"; they mean the system doesn't work.

(If you don't believe me then you can do a little experiment. If you're using Chrome, open up a new tab, then go to View > Developer > Developer Tools and click on the Network tab. [I know other browsers have a similar function, but I don't remember what they call it off the top of my head.] Once you see that, go back to the tab you just opened and load a page. You'll see all the various files needed to display that page listed in the timeline. Note that the "latency" column is measured in milliseconds. If delays of several minutes were just part of doing business, this isn't how developers would want something reported.)

(Update: here's a look at what the exchange sales funnel actually looks like. Not good, especially for the unsubsidized consumers. And considering this is a product we're required to buy. [How well would Amazon do if they had the IRS requiring you to buy books every year in the name of increased national literacy?] Oh, and considering we don't know who will actually end up paying their bills. And is anyone else a little suspicious at how hard it is to get these numbers? What happened to all the promises of freely shared government data from "the most transparent administration ever"? How does that mesh with not releasing how many people have actually purchased a plan?)

These lengthy delays are actually worse than the exchanges not working at all. We'd be better off if they never opened. The healthy kid who's buying a policy because he's told he has to is going to be put off by these delays, but the sick old-timer with diabetes and a bad hip isn't. So rather than not getting any customers, you're getting just the expensive ones. (I feel like we need a sound effect or musical theme to play when the Death Spiral is about to come on stage. Maybe something from "Mars, Bringer of War"?)
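The arithmetic behind that selection effect can be sketched in a few lines of Ruby. Every number here is invented purely for illustration (the two cost constants, the 800/200 split, and the "patience" fractions), and average_cost is a hypothetical helper, not anything from an actual actuarial model.

```ruby
# All figures below are made up for illustration only.
HEALTHY_COST = 2_000.0   # hypothetical annual claims, healthy enrollee
SICK_COST    = 20_000.0  # hypothetical annual claims, sick enrollee

# Average claims cost per enrollee, given how many of each group sign up.
def average_cost(healthy, sick)
  total = healthy + sick
  return 0.0 if total.zero?
  (healthy * HEALTHY_COST + sick * SICK_COST) / total
end

# Suppose 800 healthy and 200 sick people want coverage. The sick all
# push through the delays; only a fraction of the healthy bother.
[1.0, 0.5, 0.1].each do |patience|
  healthy = (800 * patience).round
  puts format("%3d%% of healthy enroll -> average cost $%.0f",
              patience * 100, average_cost(healthy, 200))
end
```

The fewer healthy people who make it through, the higher the average cost per enrollee climbs, which is the mechanism that pushes premiums up and drives still more healthy people away.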

This all leads us to Brisk Management and Failing on Time. These are very important engineering management concepts. This post is already dragging on much too long, so I'll summarize in one sentence: it is a huge mistake to take on extra risks just to hit an arbitrary calendar deadline. Or if you'd prefer a sentence with more imagery: it's better that a building take longer to finish than have it done on time but collapse later. The health exchanges look like a textbook case of Failing on Time. Obama was reassuring everyone that signing up was going to be just like shopping on Amazon or Kayak a week (one week!) before the missed launch date. The first missed launch date, that is.

Many of the problems of the exchange implementation were apparent even on paper, in the planning stages. For instance: just how is the site supposed to calculate subsidies? That will require a real-time verification of your income. From whence will this information come? The IRS knows a scary amount about us, but it doesn't know until deep into next year how much you made this month. They don't have some server with an API standing by to answer queries like getCurrentIncome(<SSN>).9 So it was pretty inevitable that this feature would be abandoned in favor of the honor system. Which is unfortunate, because I remember ObamaCare supporters swearing up and down that it was completely absurd for their opponents to raise concerns about people hustling the system for subsidies they didn't qualify for. (Not to mention the equivalent assumptions the CBO was forced to make.)

This post is already orders of magnitude longer than I expected, so I'm going to toss in a handful of links to a couple of other people's posts without comment. There were many, many more I could put here, but keeping track of all the ink spilled on this is impossible.

The last four are by Megan McArdle, who is not only one of the most cogent econ-bloggers out there, she also worked as an IT consultant, so she has had a lot of valuable perspective to contribute.

I'll close with this, from Ellen Ullman's excellent memoir Close to the Machine. Ullman was (is?) a card-carrying communist. I mention that so you know she's no anti-government right-wing Tea Party ideologue. This passage describes her experience in the early 90s as the lead developer on a San Francisco project building a computer system to unify all the city's AIDS-related efforts. She started the project overjoyed to be working for "the good guys" instead of some profit-maximizing robber barons, but very quickly it turns into this:

Next came the budget and scheduling wrangles. Could the second phase be done in December? At first I tried what may be the oldest joke known to programming managers—"Sure you can have it in December! Of What year?"—but my client was in deadly earnest. "There is a political deadline," they said,"and we can't change it." It did no good to explain that writing software was not a political process. The deed was done. They had gone around mentioning various dates—dates chosen almost at random, imagined times, wishes—and the mentioned dates soon took on an air of reality. To all the world, to city departments and planning bureaus, to task forces and advisory boards, the dates had become expectations, commitments. Now there was no way back. The date existed and the software would be "late." Of course, this is the way all software projects become "late"—in relation to someone's fantasy that is somehow adopted as real—but I didn't expect it so soon at the AIDS project, place of "helping people," province of "good."

I asked, "What part of the system would you like me not to do?"

"You tell us," they said.

"This one. This piece here can't be done on time."

"But we must ace that one! It's a political requirement."

Round and round: the same as every software project, any software project. The same place.

(Ellen Ullman, Close to the Machine, pp. 82–83.)

  1. After all, you, dear readers, are strangers to me, and I find it slightly uncivilized to discuss politics, religion or sex with strangers.
  2. This president, no less, so lovingly and recently hailed as "the iPod President."
  3. One should be as loath to hire inexperienced coders for mission critical software as one would be to hire a butcher and a tailor to collaborate on an appendectomy.
  4. Yes, really. Because they have so much experience with this sort of thing.
  5. ahem *bailouts* ahem!
  6. See Eric S Raymond, "The Art of Unix Programming," 2003. This may be a little advanced for a legislator or administrator to read, but is it that much to ask that the people governing these critical systems learn a little bit about how they work?
  7. Is it Robin Hanson who has the theory about political engagement being another form of team sport and spectation?
  8. And for context, conscious thought is best described on a scale of hundreds of milliseconds, so delays that are nearly too brief to perceive lower your chance of completing a transaction by a noticeable amount.
  9. If you don't believe me then you should have been around when I was trying to convince Sallie Mae and the Department of Education what my family's correct income was so they could calculate our loan repayments. It took about nine months to convince them that my wife, a teacher, is paid 10 months a year and as a result you can't just multiply her biweekly wages by 26 to get annual income. There are four million teachers in the US, so it's not exactly like this was some rare exception they had to cope with. I wish there was some IRS system for quickly verifying income, because it would have saved me most of a year of mailing in pay stubs and 1040s and W-2s and offer letters and triplicate forms, by which point, of course, the information was out of date and we had to start over.

    Sorry to get off on a tangent here, but the federal government is so bad at technology I just can't let this go. And actually, it's not much of a tangent when you consider it was the PPACA that spearheaded the semi-nationalization of the student loan industry. (Drat. I need a footnote for this footnote. A couple of the very important concepts you can learn from "The Art of Unix Programming" (note 6 supra) are the principles of Compactness and Orthogonality. Both of these, and particularly the latter, should be rules for legislation as well. Folding student loan reform into the PPACA in order to game the CBO scoring is a pretty clear violation of both of these principles.)

    Compared to health insurance, a student loan is a pretty simple thing. Have you had to deal with It's atrocious. Recently they changed the repayment plan that my wife was on without notifying us. That's bad enough. The ugly part is that when they do that, they don't change the displayed label on your account that tells you which plan you're in, so even if you proactively check for changes you won't find out. And the truly hideous thing is that they don't change the label on the info screens that their own representatives can see either, so if you call to verify you still won't find out! It's true that dealing with the banks before was a complete mess, but I chalk that up to the absence of a right-of-exit for consumers. That was bad enough then, but post-nationalization I'm really over a barrel.

    Getting people signed up is only the first skirmish for All these sorts of ongoing problems, such as the ones I've experienced with student loans, will constitute the bulk of the IT battle, and they have not even begun to show up yet.

Posted in Business / Economics, CS / Science / Tech / Coding

Kaggle Black Box

This is the second machine learning competition hosted at Kaggle that I've gotten serious about entering and sunk time into only to be derailed by a paper deadline. I'm pretty frustrated. Since I didn't get a chance to submit anything for the contest itself, I'm going to outline the approach I was trying here.

First a bit of background on this particular contest. The data is very high dimensional (1875 features) and multicategorical (9 classes). You get 1000 labeled training points, which isn't nearly enough to learn a good classifier on this data. In addition you get ~130000 unlabeled points. The goal is to leverage all the unlabeled data to be able to build a decent classifier out of the labeled data. To top it off you have no idea what the data represents, so it's impossible to use any domain knowledge.

I saw this contest a couple of weeks ago shortly after hearing a colleague's PhD proposal. His topic is building networks of Kohonen Self-Organizing Maps for time series data, so SOMs are where my mind went first. SOMs are a good fit for this task: they can learn on labeled or unlabeled data, and they're excellent at dimensionality reduction.

An SOM of macroeconomic features. From Sarlin, "Exploiting the self-organizing financial stability map," 2013.

My approach was to use the unlabeled training data to learn a SOM, since they lend themselves well to unsupervised learning. Then I passed the labeled data to the SOM. The maximally active node (i.e. the node whose weight vector best matches the input vector, aka the "best matching unit" or BMU) got tagged with the class of that training sample. Then I could repeat with the test data, and read out the class(es) tagged to the BMU for each data point.
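To make the tag-the-BMU scheme concrete, here is a minimal NumPy sketch. It is not the code used for the contest: the function names, the grid size, the Gaussian neighborhood, and the linear decay schedules are all assumptions chosen for illustration.

```python
import numpy as np

def bmu_index(weights, x):
    """Flat index of the node whose weight vector is closest to x (the BMU)."""
    return int(np.argmin(((weights - x) ** 2).sum(axis=1)))

def train_som(data, rows=6, cols=6, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rows x cols SOM on unlabeled data. Returns (weights, grid)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of every node, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Assumption: initialize node weights from randomly chosen data points.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(data)):
            frac = step / total
            lr = lr0 * (1.0 - frac)               # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 0.5   # neighborhood radius shrinks
            b = bmu_index(weights, data[i])
            d2 = ((grid - grid[b]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood around the BMU
            weights += lr * h[:, None] * (data[i] - weights)
            step += 1
    return weights, grid

def tag_nodes(weights, X_labeled, y_labeled, n_classes):
    """Count, for every node, how many labeled samples of each class chose it as BMU."""
    counts = np.zeros((len(weights), n_classes), dtype=int)
    for x, y in zip(X_labeled, y_labeled):
        counts[bmu_index(weights, x), y] += 1
    return counts

def predict(weights, counts, X):
    """Predict the plurality class tagged to each sample's BMU.

    Note: a BMU that was never tagged has all-zero counts, so argmax
    falls back to class 0 here; a real system would handle that case.
    """
    return np.array([counts[bmu_index(weights, x)].argmax() for x in X])
```

Usage would be `weights, grid = train_som(X_unlabeled)`, then `counts = tag_nodes(weights, X_labeled, y_labeled, 9)`, then `predict(weights, counts, X_test)`, where the three arrays are hypothetical names for the contest data.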

So far that's simple enough, but there is far too much data to learn a SOM on efficiently,1 so I turned to my old ensemble methods.

[1] SOM bagging. The most obvious approach in many ways. Train each network on only a random subset of the data. The problem here is that any reasonable fraction of the data is still too big to get into memory. (IIRC Breiman's original Bagging paper used full bootstraps, i.e. resamples the same size as the original set, and even tested using resamples larger than the original data. That's not an option for me.) I could only manage 4096 data points (a paltry 3% of the data set) in each sample without page faulting. (Keep in mind again that a big chunk of this machine's memory was being used on my actual work.)

[2] SOM random dendrites. Like random forests, use the whole data set but only select a subset of the features for each SOM to learn from. I could use 64 of 1875 features at a time. This is also about 3%; the standard is IIRC more like 20%.

In order to add a bit more diversity to ensemble members I trained each for a random number of epochs between 100 and 200. There are a lot of other parameters that could have been adjusted to add diversity: smoothing, distance function and size of neighborhoods, size of network, network topology, ...
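The two sampling schemes and the random epoch count can be sketched as a small spec generator. This is an illustration under assumptions, not the original code: `make_member_specs` and its parameters are hypothetical names, and the row subsets are drawn without replacement (a plain random subset rather than a Breiman-style bootstrap).

```python
import numpy as np

def make_member_specs(n_points, n_features, n_members, mode,
                      sample_size=4096, n_sub_features=64,
                      min_epochs=100, max_epochs=200, seed=0):
    """Describe which rows, columns, and epoch count each ensemble member uses.

    mode="bagging":  a random subset of rows, all features.
    mode="features": all rows, a random subset of features.
    """
    rng = np.random.default_rng(seed)
    specs = []
    for _ in range(n_members):
        if mode == "bagging":
            # Assumption: subset without replacement, not a full bootstrap.
            rows = rng.choice(n_points, size=sample_size, replace=False)
            cols = np.arange(n_features)
        elif mode == "features":
            rows = np.arange(n_points)
            cols = np.sort(rng.choice(n_features, size=n_sub_features, replace=False))
        else:
            raise ValueError("mode must be 'bagging' or 'features'")
        # Random training length in [min_epochs, max_epochs] for extra diversity.
        epochs = int(rng.integers(min_epochs, max_epochs + 1))
        specs.append({"rows": rows, "cols": cols, "epochs": epochs})
    return specs
```

Each member would then train on `X[np.ix_(spec["rows"], spec["cols"])]` for `spec["epochs"]` epochs, and in the feature-subset case the same `cols` have to be applied to every sample at prediction time.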

This is all pretty basic. The tricky part is combining the individual SOM predictions. For starters, how should you make a prediction with a single SOM? The BMU often had several different classes associated with it. You can pick whichever class has a plurality, and give that network's vote to that class. You can assign fractions of its vote in proportion to the class ratio of the BMU. You can take into account the distance between the sample and the BMU, and incorporate the BMU's neighbors. You can use a softmax or other probabilistic process. You can weight nodes individually or weight the votes of each SOM. This weighting can be done the traditional way (e.g. based on accuracy on a validation set) or in a way that is unique to the SOM's competitive learning process (e.g. how many times was this node the BMU? what is the distance in weight-space between this node and its neighbors? how much has this node moved in the final training epochs?).
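The first two of those options, plurality voting and proportional voting, are easy to compare in code. The sketch below assumes each ensemble member has already reported the class tallies of its BMU for one test sample; the function names and the optional `member_weights` argument are hypothetical.

```python
import numpy as np

def plurality_vote(bmu_counts):
    """Each member casts one whole vote for the plurality class of its BMU.

    bmu_counts: (n_members, n_classes) array of class tallies per member.
    Members whose BMU was never tagged (all zeros) abstain.
    """
    counts = np.asarray(bmu_counts)
    votes = np.zeros(counts.shape[1])
    for row in counts:
        if row.sum() > 0:
            votes[row.argmax()] += 1.0
    return votes

def proportional_vote(bmu_counts, member_weights=None):
    """Each member splits its vote in proportion to its BMU's class ratio.

    member_weights optionally scales each member's vote, e.g. by
    validation accuracy; by default every member counts equally.
    """
    counts = np.asarray(bmu_counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    # Untagged BMUs have total 0; leave their fractions at zero.
    fractions = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
    if member_weights is None:
        member_weights = np.ones(len(counts))
    return (fractions * np.asarray(member_weights, dtype=float)[:, None]).sum(axis=0)

# Three tagged members and one untagged member voting on a 3-class sample.
example = np.array([[5, 4, 0],
                    [5, 4, 0],
                    [0, 9, 0],
                    [0, 0, 0]])
print(plurality_vote(example).argmax())     # class chosen by plurality voting
print(proportional_vote(example).argmax())  # class chosen by proportional voting
```

In this made-up example the two rules disagree: two members lean narrowly toward class 0 while one is unanimous for class 1, so plurality voting picks class 0 and proportional voting picks class 1. That is the kind of difference the choice of combining rule can make.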

At some point I'm going to come back to this. I have no idea if Kaggle keeps the infrastructure set up to allow post-deadline submissions, but I hope they do. I'd like to get my score on this just to satisfy my own curiosity.

This blackbox prediction concept kept cropping up in my mind while reading Nate Silver's The Signal and the Noise. We've got all these Big Questions where we're theoretically using scientific methods to reach conclusions, and yet new evidence rarely seems to change anyone's mind.

Does Medicaid improve health outcomes? Does the minimum wage increase unemployment? Did the ARRA stimulus spending work? In theory the Baicker et al. Oregon study, Card & Krueger, and the OMB's modeling ought to cause people to update beliefs but they rarely do. Let's not even get started on the IPCC, Mann's hockey stick, etc.

So here's what I'd like to do for each of these supposedly-evidence-based-and-numerical-but-not-really issues. Assemble an expert group of econometricians, modelers, quants and so on. Give them a bunch of unlabeled data. They won't know what problem they're working on or what any of the features are. Ask them to come up with the best predictors they can.

If they determine minimum wages drive unemployment without knowing they're looking at economic data then that's good evidence the two are linked. If their solution uses Stanley Cup winners but not atmospheric CO2 levels to predict tornado frequency then that's good evidence CO2 isn't a driver of tornadoes.

I don't expect this to settle any of these questions once-and-for-all — I don't expect anything at all will do that. There are too many problems (who decides what goes in the data set or how it's cleaned or scaled or lagged?). But I think doing things double-blind like this would create a lot more confidence in econometric-style results. In a way it even lessens the data-trawling problem by stepping into the issue head-on: no more doubting how much the researchers just went fishing for any correlation they could find, because we know that's exactly what they did, so we can be fully skeptical of their results.

  1. I also ran up against computational constraints here. I'm using almost every CPU cycle (and most of the RAM) I can get my hands on to run some last-minute analysis for the aforementioned paper submission, so I didn't have a lot of resources left over to throw at this. To top it off there's a bunch of end-of-semester server maintenance going on which both took processors out of the rotation and prevented me from parallelizing this the way I wanted.
Posted in Business / Economics, CS / Science / Tech / Coding