Model Wars: Data Science and Politics
The last few days have brought something unusual to the news. The major news channels are talking about a battle of the data scientists. Yep, that’s right—data science people fighting about the validity of models for polling predictions.
I find this very unusual. I work in a world with a lot of data scientists. In fact, the team at my own company, Eleventy, builds models that compete against each other in prediction “bake-offs.” Each team member has a unique way of building models—and loves to pit one model against another—but there is never an argument about what is empirically correct and valid.
This weekend, an all-out war started between The Huffington Post and Nate Silver of FiveThirtyEight. Ryan Grim, a Huffington Post writer, implied that Silver was not creating valid, empirically correct models. My favorite headline about this was from Mashable: “There Is a 100 Percent Chance That Nate Silver is F**king Furious.”
I'm not here to debate the politics of all this, or which candidate is predicted to win or by how much. I am here to talk about the data-science aspect. That said, I am a pretty big Nate Silver fan, but only because I believe he is a seasoned modeler. The major issue here seems to be that Silver makes adjustments to what the raw poll numbers say.
Let me be clear: modeling is both science and art.
Modeling is used a lot in nonprofit marketing. The best models are not plug and play, and they are not just about lining up data and reporting it out. If that were the case, wouldn’t that mean there was no human element to the approach? This is where the art of modeling lives.
Custom models are the best-performing models for prediction. Grim's primary issue with Silver seems to be a belief that the art side of modeling should play no part.
Said Grim in his article:
The short version is that Silver is changing the results of polls to fit where he thinks the polls truly are, rather than simply entering the poll numbers into his model and crunching them. Silver calls this unskewing a “trend line adjustment.” He compares a poll to previous polls conducted by the same polling firm, makes a series of assumptions, runs a regression analysis, and gets a new poll number. That’s the number he sticks in his model―not the original number.
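To make the mechanics concrete, here is a minimal, hypothetical sketch of the kind of trend-line and house-effect adjustment the quote describes. This is not Silver's actual model; the firm, the dates and every number are invented purely for illustration.

```python
# Hypothetical sketch of a "trend line adjustment" -- invented numbers,
# not FiveThirtyEight's actual method.
import numpy as np

# Days on which one polling firm released results (candidate's share, %).
days       = np.array([0.0, 7.0, 14.0, 21.0])
firm_polls = np.array([44.0, 45.0, 43.5, 44.5])

# Industry-wide polling average observed on the same days.
industry   = np.array([46.0, 46.5, 45.5, 46.0])

# Fit a simple linear trend line to the industry average over time.
slope, intercept = np.polyfit(days, industry, 1)

# Estimate this firm's "house effect": how far its polls typically sit
# from the overall trend line.
house_effect = np.mean(firm_polls - (slope * days + intercept))

# A new poll arrives from the same firm; adjust the raw number before it
# goes into the aggregate model.
new_poll = 44.0
adjusted_poll = new_poll - house_effect
print(f"house effect: {house_effect:+.2f} pts, adjusted poll: {adjusted_poll:.1f}%")
```

The point of the sketch is simply that the adjusted number, not the raw one, is what feeds the model, and the size of that adjustment rests on assumptions the analyst has to choose.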
Here’s my issue: Grim implies that there is no professional element, no element of expertise, no human thought or assumptions necessary to build a model. I don’t care if it is a poll prediction, a lapsed-donor prediction or a widget-selling prediction—to imply that it is just about the plug-and-play data goes against everything I believe to be true in the world of marketing analysis and prediction.
“The reality of modeling is that numbers in and of themselves are dumb,” Jim Moran, vice president of marketing intelligence and analytics at Eleventy, told me. “It is the context in which they are used, which is informed through experience and a history of observation, that makes them smart. This is why successful models require science and art.”
The process of modeling at a custom level requires development of assumptions, assessment of errors and ultimate adjustments based on them. If you think modeling is any different, you don’t understand modeling. The real sweet spot exists where the art and science are working together.
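For a toy illustration of that loop, here is a short, hypothetical sketch of an assumption-error-adjustment cycle for the kind of lapsed-donor prediction mentioned above. The synthetic data, the candidate recency cutoffs and the simple decision rule are all invented; a real custom model would be far richer.

```python
# Hypothetical assumption -> error -> adjustment loop for a toy
# lapsed-donor prediction. All data and cutoffs are invented.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic history: months since a donor's last gift, and whether they gave again.
months_since_gift = rng.integers(1, 36, size=500)
gave_again = (rng.random(500) < np.clip(1.2 - months_since_gift / 24, 0.05, 0.95)).astype(int)

# Split into a training half (to choose assumptions) and a holdout half (to assess error).
train_m, test_m = months_since_gift[:250], months_since_gift[250:]
train_y, test_y = gave_again[:250], gave_again[250:]

best_cutoff, best_error = None, np.inf
for cutoff in (6, 12, 18, 24):                       # candidate assumptions
    pred = (train_m <= cutoff).astype(int)           # predict "will give again" if recent enough
    err = float(np.mean(pred != train_y))            # assess the error of that assumption
    if err < best_error:
        best_cutoff, best_error = cutoff, err        # adjust the assumption accordingly

holdout_error = np.mean((test_m <= best_cutoff).astype(int) != test_y)
print(f"chosen cutoff: {best_cutoff} months, holdout error: {holdout_error:.2%}")
```

The error measurement on held-out data is the science; deciding which assumptions are worth testing, and noticing when the chosen one stops matching reality, is the art.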
Don’t believe me? Consider this: Application Development Trends magazine published an article earlier this year on how "data scientist" was the top job in America for 2016, according to Glassdoor.
Here are some interesting lines from that article:
- “A successful data scientist is curious, creative, a skilled technologist and a clear communicator.”
- “It's so hard finding these elusive creatures that many enterprise Big Data initiatives are being held back, depriving companies of what could be a crucial competitive advantage.”
- "A data scientist spends most of his or her working day communicating. Only about 10 percent of the time is spent on what most people think of when they think 'data science'—the model selection and training. Data scientists will spend lots of time tuning their opinions, their stories and their models to match reality."
You tell me, but it seems like the art part of this is pretty clear. I’m with you, Nate Silver.
Vice President, Strategy & Development
Eleventy Marketing Group
Angie is ridiculously passionate about EVERYTHING she’s involved in — including the future and success of our nonprofit industry.
Angie is a senior exec with 25 years of experience in direct and relationship marketing. She is a C-suite consultant with experience over the years at both nonprofits and agencies. She currently leads strategy and development for marketing intelligence agency Eleventy Marketing Group. Previously she worked at the innovative startup DonorVoice and as general manager of Merkle’s Nonprofit Group, as well as serving as that firm’s CRM officer charged with driving change within the industry. She also spent more than 14 years leading the marketing, fundraising and CRM areas for two nationwide charities, The Arthritis Foundation and the American Cancer Society. Angie is a thought leader in the industry, a frequent speaker at events, and the author of articles and whitepapers on the nonprofit industry. She has also received recognition for innovation and influence over the years.