Amazon’s open call for bids for its new headquarters, HQ2, closed last month, but in the run-up to the final decision in 2018, analysts will continue to flood the internet with detailed studies evaluating who they believe should win. In other words, the mirror-mirror-on-the-wall game for cities is just starting to warm up.
Earlier, ArchDaily reported on the data-driven approach adopted by Moody’s Analytics, which projected Austin, TX as the winner. But another study, by IT education company Thinkful, now points towards Washington DC as the city most likely to make the cut. So what makes Washington DC the fairest of them all? Read on to see how data science techniques helped analysts at Thinkful arrive at this prediction, what kind of approach they adopted, and how it differed from that of Moody’s Analytics.
Bearing in mind the requirements listed in Amazon’s Request for Proposals (RFP), both companies first short-listed cities based on Amazon's desire for a city with a population of over 1 million. But while the list by Moody’s had 65 candidates, Thinkful’s had 35, because the latter filtered according to both population and proximity to an international airport, as per Amazon’s RFP.
From here onwards, it was pretty simple for Moody’s: looking at five basic categories (business environment, human capital, cost, quality of life, transportation), they rated each of their 65 cities on a scale of 1–5, calculated the average for each, and Austin, TX won with the highest average score.
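Moody’s averaging step can be sketched in a few lines of Python. The category names come from the article; the city ratings below are made-up placeholders for illustration, not Moody’s actual scores.

```python
# Sketch of Moody's approach: rate each candidate city 1-5 in five
# categories, average the ratings, and pick the city with the top average.
# The ratings here are hypothetical placeholders, not Moody's data.
categories = ["business environment", "human capital", "cost",
              "quality of life", "transportation"]

ratings = {
    "Austin, TX":  [5, 4, 4, 5, 3],
    "Atlanta, GA": [4, 4, 4, 3, 4],
    "Raleigh, NC": [4, 3, 5, 4, 2],
}

# Average score per city, then the city with the highest average wins.
averages = {city: sum(scores) / len(scores) for city, scores in ratings.items()}
winner = max(averages, key=averages.get)
print(winner, averages[winner])  # with these placeholder numbers: Austin, TX 4.2
```

With real data this loop would simply run over all 65 candidate cities instead of three.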
Thinkful, however, used a slightly more complicated technique—a data science method known as “recommendation systems”—the very same method Amazon uses to suggest products, or Netflix uses to suggest shows, to their users. Here’s how it worked: after finding existing datasets that could be correlated with 9 of the requirements listed in the Amazon RFP—for example, using U.S. News' "100 Best Places to Live in the USA" report to calculate quality of life for each city—Thinkful ended up with a total of 9 datasets. These were then standardized so that they all corresponded to the same scale, and the top score in each category was used to construct the score for a “perfect city,” or “best possible scenario.” Next, “similarity scores” were derived by comparing the “perfect” score with the real score for each of the 35 cities. And of course, the city closest to the “perfect” score, in this case, was "#obviouslyDC."
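The pipeline described above—standardize the datasets, build a "perfect city" from the top score in each category, then rank real cities by closeness to it—can be sketched in Python. Thinkful's exact scaling method and similarity metric aren't specified, so min-max scaling and Euclidean distance are assumptions here, and the numbers are illustrative placeholders rather than the real datasets.

```python
import math

# Hypothetical raw scores for three example categories (Thinkful used 9).
raw = {
    "Washington DC": [88, 120, 9.1],
    "Austin, TX":    [92,  60, 8.7],
    "Denver, CO":    [75,  80, 8.9],
}

def standardize(data):
    """Min-max scale each category (column) to [0, 1] so all datasets
    correspond to the same scale. (Scaling choice is an assumption.)"""
    cols = list(zip(*data.values()))
    return {
        city: [(s - min(c)) / (max(c) - min(c)) for s, c in zip(scores, cols)]
        for city, scores in data.items()
    }

scaled = standardize(raw)

# The "perfect city" takes the top (scaled) score in every category.
perfect = [max(col) for col in zip(*scaled.values())]

def distance(a, b):
    """Euclidean distance; a smaller distance means a higher similarity."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Rank cities by closeness to the perfect city.
ranked = sorted(scaled, key=lambda city: distance(scaled[city], perfect))
print(ranked[0])  # the city most similar to the "best possible scenario"
```

Other similarity measures (cosine similarity, for instance) would work the same way; only the `distance` function and the sort direction would change.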
Data science can be confusing, but also fascinating, since there’s often no single right or wrong way to develop a method for arriving at a conclusion. In this example, we’ve seen two dissimilar methods applied to the same question, each yielding a different result, and both making sense. But it’s also worth noting that for HQ2, Amazon might have particular priorities within their RFP list which they haven't disclosed, making the math more complex! For now though, the debate around Amazon's HQ2 will continue, with cities (and data scientists) all over the country offering their analysis.
To read more about HQ2, check out our previous coverage: