Some Long Overdue Book Reviews


More Money Than God, Sebastian Mallaby

Excellent. Far too many general audience finance books are written at what I think of as a newspaper reading level. (Defining even the most basic terms, assuming the reader is intimidated by any math as complicated as calculating a percentage, feeling the need to frame everything in a protagonist/antagonist arrangement, etc.) This is way, way better than that. There's an appropriate mix of the human element in there. The more factual stuff is covered well without resorting to lots of technicalities. (I have other books for that.)

It's hard to draw large conclusions from this book. One that comes to mind is that hedge funds seem to fail (either catastrophically, or in the more prosaic sense of failing to deliver the expected alpha) when they stop being hedge funds, where "hedge" is the operative word.

This book also made me start thinking more about the connection between financial risk and ecological monoculture. Trading strategies seem to have a discrete lifespan. Traders seem to underestimate how their strategies will be affected by being in a crowded field of other people with the same strategy. LTCM is a good example. Their system worked very well for a while and then crashed and burned. Is it that the system was actually bad all along, or was it great when they were the only ones doing it, and terrible when everyone else in the market was copying them and putting on the same trades? I don't think you can judge a strategy in isolation; you need to consider its utility in both crowded and sparse niches.



Panic: The Story of Modern Financial Insanity, Michael Lewis

Very well curated. File under: "nihil sub sole novum," "the more things change," etc.

I think the only piece I would have left out, IIRC, was the Paul Krugman one. But that has more to do with being utterly exhausted at trying to reconcile vintage 1990s Krugman-the-Scholar with late model Krugman-the-Demagogue.



Saga, Volume Two, Brian K. Vaughan + Fiona Staples

Saga is still absolutely brilliant. The story and art are both outstanding. Comics needs more Space Opera. The genre cries out for a visual medium, but the budget required to do something like this in film would be off the charts. Only James Cameron gets the opportunity to try something like that. (Although after Pacific Rim maybe del Toro will get the chance too. Or perhaps Neill Blomkamp if Elysium rakes in enough. Sign me up for some widescreen baroque space opera directed by either of them.)



Hackers and Painters, Paul Graham

I read most of these essays back in undergrad but it's great to revisit them. It's interesting how the things that have stuck in my mind aren't the major theses of the essays, but little asides and trivialities. Every CS student and programmer should read this. I think it would also make a good read for the family members, managers, etc. of those people too: anyone who wants to understand how we think and see would benefit. Even when I read Graham discussing completely non-technical subjects (e.g. adolescents and popularity) there's something in his method of analysis which resonates with me as distinctly hackerish. On the flip side, it's nice to have someone else in the computing community who is interested in Art. I would need a whole Paul Graham-level essay to unpack this, but I think there's an unfortunate degree of antagonism between the geek and art tribes.


Unseen Academicals, Terry Pratchett

This is one of my favorite Discworld books so far. I didn't realize going in that the focus of Pratchett's satire here is not just academia but also soccer/football culture.


The War of Art, Steven Pressfield

Too superstitious and mystical, but I think there are a lot of overlaps between the way scientists (and especially doctoral students) work and the way writers and artists work. Learning about how various writers (e.g. Neal Stephenson, DFW) work has helped me to be a better researcher.


A Red Mass for Mars, Jonathan Hickman + Ryan Bodenheim

A little hard to follow the plot, but absolutely gorgeous. Hickman consistently turns out books that are so visually different from most comics. Here there's a great contrast, similar to what he did in Pax Romana, between the stark black inking and the luminous aquarelle of the backgrounds.

Posted in Business / Economics, Reviews

Reading List for 23 September 2013

Arnold Kling :: Big Gods

Here is a question to think about. If religions help to create social capital by allowing people to signal conscientiousness, conformity, and trustworthiness [as Norenzayan claims], how does this relate to Bryan Caplan’s view that obtaining a college degree performs that function?

That might explain why the credentialist societies of Han China were relatively irreligious. Kling likes to use the Vickies/Thetes metaphor from Neal Stephenson's Diamond Age, and I think this dichotomy could play well with that. Wouldn't the tests required by the Reformed Distributed Republic fill this role, for instance?

Ariel Procaccia :: Alien journals

Steve Landsburg :: RIP, Ronald Coase

This is by far the best, simplest explanation of Coase's insights that I have read. Having read plenty of Landsburg, that should not — indeed does not — surprise me.

His final 'graph is a digression, but a good point:

Coase’s Nobel Prize winning paper is surely one of the landmark papers of 20th century economics. It’s also entirely non-technical (which is fine), and (in my opinion) ridiculously verbose (which is annoying). It’s littered with numerical examples intended to illustrate several different but related points, but the points and the examples are so jumbled together that it’s often difficult to tell what point is being illustrated... Pioneering work is rarely presented cleanly, and Coase was a true pioneer.

And this is why I put little stock in "primary sources" when it comes to STEM. The intersection between people/publications who originate profound ideas and people/publications which explain profound ideas well is a narrow one. If what you want is the latter, don't automatically mistake it for the former. The best researchers are not the best teachers, and this is true as much for papers as it is for people.

That said, sometimes the originals are very good. Here are two other opinions on this, from Federico Pereiro and John Cook.

Prosthetic Knowledge :: Prototypo.io

Start a font by tweaking all glyphs at once. With more than twenty parameters, design custom classical or experimental shapes. Once prototyping of the font is done, each point and curve of a glyph can be easily modified. Explore, modify, compare, export with infinite variations.

I liked this better when it was called Metafont.

Sorry, I couldn't resist some snark. I actually do like this project. I love both Processing and typography, so why wouldn't I? Speaking of which...

Hoefler & Frere-Jones :: Pilcrow & Capitulum

Some sample pilcrows from the H&FJ foundry.

Eric Pement :: Using SED to make indexes for books

That's some impressive SED-fu.

Mike Duncan :: Revolutions Podcast

(Okay, so technically this may not belong on a "reading list.") Duncan previously created The History of Rome podcast, which is one of my favorites. Revolutions is his new project, and it just launched. Get on board now.

Kenneth Moreland :: Diverging Color Maps for Scientific Visualization [pdf]

Ardi, Tan & Yim :: Color Palette Generation for Nominal Encodings [pdf]

These two have been really helpful in the new visualization project I'm working on.

Andrew Shikiar :: Predicting Kiva Loan Defaults

Bret Victor :: Up and Down the Ladder of Abstraction: A Systematic Approach to Interactive Visualization

This would be a great starting place for high-school or freshman STEM curricula. As a bonus, it has this nice epigraph from Richard Hamming:

"In science, if you know what you are doing, you should not be doing it. In engineering, if you do not know what you are doing, you should not be doing it. Of course, you seldom, if ever, see either pure state."

Megan McArdle :: 13 Tips for Jobless Grads on Surviving the Basement Years

I'm at the tail end of a doctoral program and going on the job market. This is good advice. What's disappointing is that this would have been equally good and applicable advice for people going on the job market back when I started grad school. The fact that we're five years (!!) down the road and we still have need of these sorts of "surviving in horrid job markets" pieces is bleak.

Posted in Reading Lists

Groceries

Arnold Kling :: The Costco Business Model

Eventually, I could imagine an equilibrium in which a store like Giant pares back on the number of items it sells in the store, keeping only the most popular items available. You would have to order less-popular items on line. That way, they could cut back on those restocking costs.

Of course, for all I know, Giant’s business model is to charge a big markup on stuff, and they figure once they get you into the store they make a profit. And if stocking a great variety of items gets you into the store, then that is the right strategy.

I don't think it can work that way. The grocery business is a lot like an old rule of thumb we have in software: 80% of users only use 20% of features. The trick is that everyone uses a different 20%. You can't implement 20% of the features from your software and expect to retain 80% of the users.
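To make the "different 20%" point concrete, here is a toy simulation. It is only a sketch under a loud assumption: every user picks their 20 features uniformly at random out of 100, which is the extreme case (real usage is lumpier, so real numbers would be less dire). The function name and all parameters are hypothetical.

```python
import random

def fraction_fully_served(n_users=10_000, n_features=100, used_per_user=20,
                          kept=20, seed=0):
    """Share of users whose every used feature survives a cut to `kept` features."""
    rng = random.Random(seed)
    # ASSUMPTION: each user relies on their own uniformly random set of features.
    users = [set(rng.sample(range(n_features), used_per_user))
             for _ in range(n_users)]
    # Count how many users rely on each feature.
    counts = [0] * n_features
    for used in users:
        for feature in used:
            counts[feature] += 1
    # Keep only the most popular features, as the "pare back" plan would.
    by_popularity = sorted(range(n_features), key=lambda f: -counts[f])
    kept_features = set(by_popularity[:kept])
    # A user is fully served only if nothing they use was cut.
    served = sum(1 for used in users if used <= kept_features)
    return served / n_users
```

Under that uniform assumption essentially nobody is fully served by the surviving 20 features, even though each individual user only ever touched 20. Swap "features" for "SKUs" and you have the grocer's problem.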

Similarly a grocer is forced to stock everything a customer might expect to get for their weekly shop. If there are one or two items out of 100 that I can't get, then I have to go to another store. And once I'm there, I'm getting everything I need there. When you stopped stocking that $4 jar of olives you haven't just lost $4 in revenue from me, you've lost several thousand dollars annually because I've taken all of my custom elsewhere.

I believe a few stores (Aldi, IIRC) have managed to pare down the selection and have customers put up with it, but that's because the customers understand what they gain is exceptionally low prices. There's an explicit deal being made between them. That's not a model a median store like Giant could go to. Similarly Trader Joe's has managed to carve out a niche of stocking very few SKUs, but virtually no one expects it to be a comprehensive shopping trip. They can get away with that specifically because they're the exception to the rule. There's not room in the market for everyone to play that niche game.

The idea of having people order in advance online is interesting. It could be nice to have a system a little like a comic book shop, where you go in and the store already has a sack of the things you've pre-ordered waiting for you. Ultimately I don't think it will work for two reasons: (1) most people want to pick their produce out themselves, which has hampered PeaPod and all the other online grocery delivery businesses; (2) the vast majority of people will not put any effort into planning their shopping trip in advance.

Almost no one actually writes out a comprehensive grocery list, to say nothing of firing up a computer and pre-purchasing. I saw some stats on grocery planning on a personal finance blog at some point, but I can't find them. Luckily we don't need them: just keep an eye out for how many people in your local store have anything resembling a list. In my experience the intersection of the sets "people who have a grocery list" and "people without babies/toddlers with them" is usually just me. And even I've only got four or five items scrawled on an index card, not my whole list.

I could be quite wrong about this. Most of what I understand of the grocery business comes from extending micro-econ first principles and reading Management in Ten Words, by Terry Leahy, the ex-CEO of Tesco. Still, the margins on grocery stores are famously thin (about 2% IIRC), so we know the market is very competitive. The adoption of a near-universal strategy in an environment like that means that we should at least start with the assumption that that strategy is very efficient, if not near optimal for current conditions.


The McArdle post which Kling quotes in the post above had this very good line:

If you want Wal-Mart to have a labor force like Trader Joe’s and Costco, you probably want them to have a business model like Trader Joe’s and Costco — which is to say that you want them to have a customer demographic like Trader Joe’s and Costco.

It's good to keep in mind these things are not mutually orthogonal.


See also this week's EconTalk with Mike Munger, which is largely about grocery stores and consumer sovereignty. (And, incidentally, how naive Michael Pollan is about business.)

Posted in Business / Economics

Reading List for 16 July 2013

Evan Miller :: Winkel Tripel Warping Trouble or "How I Found a Bug in the Journal of Surveying Engineering"

All programming blogs need at least one post unofficially titled “Indisputable Proof That I Am Awesome.” These are usually my favorite kind of read, as the protagonist starts out with a head full of hubris, becomes mired in self-doubt, struggles on when others would have quit, and then ultimately triumphs over evil (that is to say, slow or buggy computer code), often at the expense of personal hygiene and/or sanity.

I'm a fan of the debugging narrative, and this is a fine example of the genre. I've been wrestling with code for mapping projections recently, so I feel Miller's pain specifically. In my opinion the Winkel Tripel is mathematically gross, but aesthetically unsurpassed. Hopefully I'll find some time in the next week or so to put up a post about my mapping project.

A screenshot of a project I've been working on to map geotagged tweets.

Kevin Grier :: Breaking down the higher ed wage premium

Wage premium and popularity of majors

File under "all college degrees are not created equal" or perhaps "no, junior, you may not borrow enough to buy a decent house in order to get a BA in psych."

Aleatha Parker-Wood :: One Shot vs Iterated Games

Social cohesion can be thought of as a manifestation of how "iterated" people feel their interactions are, how likely they are to interact with the same people again and again and have to deal with long term consequences of locally optimal choices, or whether they feel they can "opt out" of consequences of interacting with some set of people in a poor way.

Mike Munger :: Grade Inflation? Some data

Munger links to some very good analysis but it occurs to me that what is really needed is the variance of grades over time and not just the mean. (Obviously these two things are related since the distribution is bounded by [0, 4]. A mean which has gone from 2.25 to 3.44 will almost certainly result in less variance here.)
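The parenthetical about the bounded scale can be made precise with the Bhatia-Davis inequality: for any distribution confined to [m, M] with mean mu, the variance is at most (M - mu)(mu - m). The sketch below just evaluates that ceiling for the two means quoted above. It says nothing about the actual variance of grades, only about the most variance that is mathematically possible; the function name is hypothetical.

```python
def max_variance(mean, low=0.0, high=4.0):
    """Bhatia-Davis upper bound on the variance of a distribution on [low, high]."""
    if not low <= mean <= high:
        raise ValueError("mean must lie inside [low, high]")
    return (high - mean) * (mean - low)

# Ceiling on the variance (and standard deviation) at each of the two mean GPAs.
for gpa in (2.25, 3.44):
    bound = max_variance(gpa)
    print(f"mean {gpa:.2f}: variance <= {bound:.4f}, std dev <= {bound ** 0.5:.2f}")
```

The ceiling drops from about 3.94 at a mean of 2.25 to about 1.93 at a mean of 3.44, so the room available for spreading students apart roughly halves even before looking at the data.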

I don't much care where the distribution is centered. I care how wide the distribution is — that's what lets observers distinguish one student from another. Rankings need inequality. Without it they convey no information.

Marginal Revolution :: Alex Tabarrok :: The Battle over Junk DNA

I share Graur's and Tabarrok's wariness over "high impact false positives" in science. This is a big problem with no clear solutions.

The Graur et al. paper that Tabarrok discusses is entertaining in its incivility. Sometimes civility is not the correct response to falsehoods. It's refreshing to see scientists being so brutally honest with their opinions. Some might say they are too brutal, but at least they've got the honest part.

Peter McCaffrey :: 5 reasons price gouging should be legal: Especially during disasters

McCaffrey is completely right. But good luck to him reasoning people out of an opinion they were never reasoned into in the first place.

I do like the neologism "sustainable pricing" that he introduces. Bravo for that.

I would add a sixth reason to his list: accusations of "price gouging" are one rhetorical prong in an inescapable triple bind. A seller has three MECE choices: price goods higher than is common, the same as is common, or lower than is common. These choices will result in accusations of price gouging, collusion, and anti-competitive pricing, respectively. Since there is no way to win when dealing with people who level accusations of gouging, the only sensible thing to do is ignore them.

Shawn Regan :: Everyone calm down, there is no “bee-pocalypse”

Executive summary: apiarists have agency, and the world isn't static. If the death rate of colonies increases, they respond by creating more colonies. Crisis averted.

Eliezer Yudkowsky :: Betting Therapy

"Betting Therapy" should be a thing. You go to a betting therapist and describe your fears — everything you're afraid will happen if you do X — and then the therapist offers to bet money on whether it actually happens to you or not. After you lose enough money, you stop being afraid.

Sign me up.

Posted in Reading Lists

Some recent, brief book reviews

Fairy Tales from the Brothers Grimm, Philip Pullman


I knew these were darker than Disney (and everyone else in the 20th C.) would have children believe, but wow. I think there was a stretch of seven stories in a row in which at least one person was casually executed. Cinderella's avian helpers not only dress her up nice and pretty before the soirees, they peck out her step-sisters' eyes! For the sake of professionalism I won't discuss what wakes up Briar Rose or turns the Frog Prince into a man, but let's just say they're a bit more intimate than the chaste smooches that are typically depicted.

Pullman does a first-rate job editing, especially since the various versions the Grimms published are a self-contradicting mess. He's managed to whip some of the stories into shape without losing their pre-modern, fever-dream, nonsensical character. He includes a brief analysis at the end of each story which I would have liked to have even more of. There are also lists of similar stories from other cultures, which given a large windfall of time I would like to track down. Pullman deserves credit for really editing this, not "editing" it in the way that David Foster Wallace discusses editing the Best American Essays 2007. (See his foreword, Deciderization 2007: A Special Report, [fulltext pdf] included in his posthumous 2012 collection Both Flesh and Not.)


Our Tragic Universe, Scarlett Thomas

Very disappointing. I previously read Thomas' PopCo, which I loved, and whose cleverness and vitality only further overshadow OTU. I'll save you a lot of trouble: the main character of OTU is a struggling author who is debating writing a novel in which nothing happens, a "story-less story." OTU is, eo ipso, a novel in which nothing happens. The end.


Special Topics in Calamity Physics, Marisha Pessl

Would have been twice as good if it was half as long. ("Don't worry about getting to your point; I am going to live forever.")


The conceit of using citations to reference works as similes was clever, especially since the narrator was an over-achieving college freshman, but it wore out quickly.


Proust and the Squid: The Story and Science of the Reading Brain, Maryanne Wolf


Wolf did a sterling job balancing the history, neurobiology and pedagogy of reading and writing. I was (am?) dyslexic, so this was of special autobiographical interest to me. If I am reading her correctly, my difficulty learning to read when young, my continued ineptitude at foreign language and music, and my visuospatial and pattern recognition interests and skills are actually all rooted in the same cause and not independent as I had casually assumed.

Wolf has also informally tracked which sub-careers dyslexics end up in. Dyslexic doctors are more likely to be great radiologists, for instance, since it requires more visuospatial cognition. I was fascinated to learn that dyslexics in business gravitate to finance, and those in computing towards AI/ML/Pattern Recognition and Graphics/Vision. Those interests fit me perfectly. Score one for being neuro-atypical, I guess.


The Night Circus, Erin Morgenstern

It would be a ton of fun to do visual effects for a film adaptation of this. The illusions are brilliantly described. Actually all of the visual imagery is very well done. These characters would make good fodder for 15-minute sketch exercises like Chris Schweizer used to post.



Some Remarks, Neal Stephenson

Be advised the plurality of this is a single essay from the mid 90s about laying undersea fiberoptic cable. Stephenson manages to make that more interesting than I would have thought possible, but I picked this up looking forward to multiple, bite-sized essays, so a single 125-page piece was tough to swallow.

(Side note: a recent BusinessWeek had a map of current undersea cables, and FLAG, which is the focus of Stephenson's essay and was bleeding edge in 1996, dwarfing previous cables' capacity by orders of magnitude, was just barely big enough to even make it onto the map 17 years later.)

It was also a little odd reading all these interviews and essays which revolve around the progression of Stephenson's career and how it relates to his "Baroque Cycle" since he's published three novels between that and Some Remarks and all of them are very different. The commentary has been left behind by events. It feels like picking up a book written in 1985 that's full of interviews with Reagan about transitioning from actor to SAG president to GE-backed orator but completely ignores his becoming governor and then president.


The Art Forger, B.A. Shapiro

Not my usual kind of book, but still a lot of fun. It was a nice coincidence that I started reading this right as I began watercolor classes. You can tell Shapiro has a real passion for art; various passages really got me fired up to work on my own stuff (both digital and aqueous). She does a good job describing the artistic process in terms of both physical and internal manifestations.



Stardust, Neil Gaiman

Gaiman described this as "a fairy tale for adults," which is extremely apt. It diverges pretty significantly from the film version. Unsurprisingly I like the book version better, but this is one of those rare times when I don't find the movie to be drastically inferior.

Posted in Reviews

Kaggle Black Box

This is the second machine learning competition hosted at Kaggle that I've gotten serious about entering and sunk time into only to be derailed by a paper deadline. I'm pretty frustrated. Since I didn't get a chance to submit anything for the contest itself, I'm going to outline the approach I was trying here.

First a bit of background on this particular contest. The data is very high dimensional (1875 features) and multicategorical (9 classes). You get 1000 labeled training points, which isn't nearly enough to learn a good classifier on this data. In addition you get ~130000 unlabeled points. The goal is to leverage all the unlabeled data to be able to build a decent classifier out of the labeled data. To top it off you have no idea what the data represents, so it's impossible to use any domain knowledge.

I saw this contest a couple of weeks ago shortly after hearing a colleague's PhD proposal. His topic is building networks of Kohonen Self-Organizing Maps for time series data, so SOMs are where my mind went first. SOMs are a good fit for this task: they can learn on labeled or unlabeled data, and they're excellent at dimensionality reduction.

An SOM of macroeconomic features. From Sarlin, "Exploiting the self-organizing financial stability map," 2013.

My approach was to use the unlabeled training data to learn a SOM, since they lend themselves well to unsupervised learning. Then I passed the labeled data to the SOM. The maximally active node (i.e. the node whose weight vector best matches the input vector, aka the "best matching unit" or BMU) got tagged with the class of that training sample. Then I could repeat with the test data, and read out the class(es) tagged to the BMU for each data point.
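A minimal, pure-Python sketch of that train / tag / read-out loop follows. It is not the contest code: the Gaussian neighborhood, the linear decay schedules, the grid size and every function name here are assumptions chosen to keep the example short, and it would be far too slow for 130,000 points with 1875 features.

```python
import math
import random
from collections import Counter, defaultdict

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bmu_index(weights, x):
    """Index of the node whose weight vector is closest to x (the BMU)."""
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j], x))

def train_som(data, rows=4, cols=4, epochs=10, lr0=0.5, sigma0=2.0, seed=0):
    """Unsupervised step: fit a rows x cols SOM to unlabeled vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in grid]
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)               # ASSUMPTION: linear decay
            sigma = sigma0 * (1.0 - frac) + 0.5   # neighborhood shrinks over time
            x = data[i]
            br, bc = grid[bmu_index(weights, x)]
            for j, (r, c) in enumerate(grid):
                # Nodes near the BMU on the grid get pulled harder toward x.
                h = lr * math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                w = weights[j]
                for k in range(dim):
                    w[k] += h * (x[k] - w[k])
            step += 1
    return weights

def tag_nodes(weights, labeled_x, labels):
    """Supervised step: tag each labeled sample's BMU with that sample's class."""
    tags = defaultdict(Counter)
    for x, y in zip(labeled_x, labels):
        tags[bmu_index(weights, x)][y] += 1
    return tags

def predict(weights, tags, x):
    """Read out the most common class tagged on x's BMU (None if untagged)."""
    node = bmu_index(weights, x)
    if node not in tags:
        return None
    return tags[node].most_common(1)[0][0]
```

Note that `tags` keeps the full class counts for each node rather than a single label, since a BMU can easily collect several classes.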

So far that's simple enough, but there is far too much data to learn a SOM on efficiently, so I turned to my old ensemble methods. (I also ran up against computational constraints here. I'm using almost every CPU cycle, and most of the RAM, I can get my hands on to run some last-minute analysis for the aforementioned paper submission, so I didn't have a lot of resources left over to throw at this. To top it off there's a bunch of end-of-semester server maintenance going on which both took processors out of the rotation and prevented me from parallelizing this the way I wanted.)

[1] SOM bagging. The most obvious approach in many ways. Train each network on only a random subset of the data. The problem here is that any reasonable fraction of the data is still too big to get into memory. (IIRC Breiman's original Bagging paper used full bootstraps, i.e. resamples the same size as the original set, and even tested using resamples larger than the original data. That's not an option for me.) I could only manage 4096 data points (a paltry 3% of the data set) in each sample without page faulting. (Keep in mind again that a big chunk of this machine's memory was being used on my actual work.)

[2] SOM random dendrites. Like random forests, use the whole data set but only select a subset of the features for each SOM to learn from. I could use 64 of 1875 features at a time. This is also about 3%; the standard is IIRC more like 20%.

In order to add a bit more diversity to ensemble members I trained each for a random number of epochs between 100 and 200. There are a lot of other parameters that could have been adjusted to add diversity: smoothing, distance function and size of neighborhoods, size of network, network topology, ...
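The two ensemble flavors and the random epoch count can be sketched as a function that only decides what each member gets to see, leaving the training itself aside. The function name and the dict layout are hypothetical, and sampling rows without replacement is an assumption (a true bootstrap would sample with replacement).

```python
import random

def make_member_specs(n_points, n_features, n_members, mode,
                      sample_size=4096, feature_count=64,
                      min_epochs=100, max_epochs=200, seed=0):
    """Describe the rows, features and epoch budget for each ensemble member."""
    rng = random.Random(seed)
    specs = []
    for _ in range(n_members):
        if mode == "bagging":
            # [1] A random subset of rows, all features.
            # ASSUMPTION: drawn without replacement, unlike a full bootstrap.
            rows = sorted(rng.sample(range(n_points), sample_size))
            features = list(range(n_features))
        elif mode == "random_features":
            # [2] All rows, a random subset of features (random-forest style).
            rows = list(range(n_points))
            features = sorted(rng.sample(range(n_features), feature_count))
        else:
            raise ValueError("mode must be 'bagging' or 'random_features'")
        specs.append({
            "rows": rows,
            "features": features,
            # Extra diversity: a random training length for each member.
            "epochs": rng.randint(min_epochs, max_epochs),
        })
    return specs
```

Each spec can then be used to slice the data before handing it to whatever SOM trainer is available.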

This is all pretty basic. The tricky part is combining the individual SOM predictions. For starters, how should you make a prediction with a single SOM? The BMU often had several different classes associated with it. You can pick whichever class has a plurality, and give that network's vote to that class. You can assign fractions of its vote in proportion to the class ratio of the BMU. You can take into account the distance between the sample and the BMU, and incorporate the BMU's neighbors. You can use a softmax or other probabilistic process. You can weight nodes individually or weight the votes of each SOM. This weighting can be done the traditional way (e.g. based on accuracy on a validation set) or in a way that is unique to the SOM's competitive learning process (e.g. how many times was this node the BMU? what is the distance in weight-space between this node and its neighbors? how much has this node moved in the final training epochs?).
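Here is a small sketch of just the first two options, plurality versus proportional voting, plus optional per-SOM weights. It assumes each SOM reports its BMU's tags as a dict of class counts; that representation and all function names are illustrative, not anything from the contest.

```python
from collections import Counter

def plurality_vote(bmu_counts):
    """The SOM's whole vote goes to the most common class on its BMU."""
    if not bmu_counts:
        return {}
    winner = Counter(bmu_counts).most_common(1)[0][0]
    return {winner: 1.0}

def proportional_vote(bmu_counts):
    """The SOM's vote is split in proportion to the class mix on its BMU."""
    total = sum(bmu_counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in bmu_counts.items()}

def combine(per_som_counts, vote_fn, som_weights=None):
    """Tally (optionally weighted) votes across SOMs; return the winning class."""
    if som_weights is None:
        som_weights = [1.0] * len(per_som_counts)
    tally = Counter()
    for counts, weight in zip(per_som_counts, som_weights):
        for label, share in vote_fn(counts).items():
            tally[label] += weight * share
    if not tally:
        return None  # no SOM had a tagged BMU for this sample
    return max(tally, key=tally.get)

# Three SOMs: two narrowly favor class 3, one is unanimous for class 7.
example = [{3: 5, 7: 4}, {3: 5, 7: 4}, {7: 9}]
```

The two rules can disagree. On `example`, `combine(example, plurality_vote)` picks class 3 (two votes to one), while `combine(example, proportional_vote)` picks class 7, because the two narrow majorities for class 3 are outweighed once the split votes are counted.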

At some point I'm going to come back to this. I have no idea if Kaggle keeps the infrastructure set up to allow post-deadline submissions, but I hope they do. I'd like to get my score on this just to satisfy my own curiosity.


This blackbox prediction concept kept cropping up in my mind while reading Nate Silver's The Signal and the Noise. We've got all these Big Questions where we're theoretically using scientific methods to reach conclusions, and yet new evidence rarely seems to change anyone's mind.

Does Medicaid improve health outcomes? Does the minimum wage increase unemployment? Did the ARRA stimulus spending work? In theory the Baicker et al. Oregon study, Card & Krueger, and the OMB's modeling ought to cause people to update beliefs but they rarely do. Let's not even get started on the IPCC, Mann's hockey stick, etc.

So here's what I'd like to do for each of these supposedly-evidence-based-and-numerical-but-not-really issues. Assemble an expert group of econometricians, modelers, quants and so on. Give them a bunch of unlabeled data. They won't know what problem they're working on or what any of the features are. Ask them to come up with the best predictors they can.
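As a rough sketch of what the "unlabeled" hand-off might look like: shuffle the column order, replace the names with opaque aliases, and standardize each column so units don't give the game away. Everything here is an assumption about how one could do the blinding (z-scoring in particular is just one choice), and the function name is hypothetical.

```python
import random
import statistics

def blind_columns(table, seed=0):
    """table maps column name -> list of floats. Returns (blinded, key)."""
    rng = random.Random(seed)
    names = list(table)
    rng.shuffle(names)  # hide any meaning carried by column order
    blinded, key = {}, {}
    for i, name in enumerate(names):
        values = table[name]
        mean = statistics.fmean(values)
        # Guard against constant columns, whose standard deviation is zero.
        spread = statistics.pstdev(values) or 1.0
        alias = f"x{i:03d}"
        # ASSUMPTION: z-scoring is enough to hide units and scale.
        blinded[alias] = [(v - mean) / spread for v in values]
        key[alias] = name  # kept by the organizers, never sent to the modelers
    return blinded, key
```

The `key` mapping stays with whoever assembled the data, so the modelers' results can be translated back only after their predictors are locked in.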

If they determine minimum wages drive unemployment without knowing they're looking at economic data then that's good evidence the two are linked. If their solution uses Stanley Cup winners but not atmospheric CO2 levels to predict tornado frequency then that's good evidence CO2 isn't a driver of tornadoes.

I don't expect this to settle any of these questions once-and-for-all — I don't expect anything at all will do that. There are too many problems (who decides what goes in the data set or how it's cleaned or scaled or lagged?). But I think doing things double-blind like this would create a lot more confidence in econometric-style results. In a way it even lessens the data-trawling problem by stepping into the issue head-on: no more doubting how much the researchers just went fishing for any correlation they could find, because we know that's exactly what they did, so we can be fully skeptical of their results.

Posted in Business / Economics, CS / Science / Tech / Coding