Popularized by Keith Poole and Howard Rosenthal, ideal point modeling is a powerful way to extract the relative ideologies of politicians based solely on their voting records. A lot has been written on ideal point models, so I'm not going to add anything new, but I wanted to give a brief overview of the Bayesian perspective.

First, some results. The following plot shows the ideal points (essentially inferred ideologies) of US senators based solely on roll call voting from 2013-2015 (hover over the points to see names):

More extreme scores (i.e., farther from zero) represent more extreme political views. While the liberal-conservative spectrum is not explicitly encoded into the model, the model picks it up naturally from voting patterns. On the far left are some of the most liberal members of the US Senate, such as Brian Schatz, while the far right has some of the most conservative members, such as Jim Risch and Ted Cruz. In the middle are senators sometimes referred to as DINOs and RINOs, such as Joe Manchin, Susan Collins, and Lisa Murkowski.

The basic model is as follows. Consider a legislator \(u\) and a particular bill \(d\). The vote \(u\) places on \(d\) is denoted as a binary variable, \(v_{ud} = 1\) for Yea and \(v_{ud} = 0\) for Nay. Each legislator has an *ideal point* \(x_u\); a value of 0 is political neutrality, whereas large values in either direction indicate more political extremism in the respective direction. Every bill has its own *discrimination* \(b_d\), which is on the same scale as the ideal points for legislators. If \(x_u b_d\) is high, the legislator is likely to vote for the bill, and if the value is low, the legislator is less likely to vote for it. Finally, each bill also has an offset \(a_d\) that indicates how popular the bill is overall, regardless of political affiliation. Formally, the model is as follows:

\[
P(v_{ud} = 1) = \sigma(a_d + b_d x_u),
\]
where \(\sigma(\cdot)\) is some sigmoidal function, such as the inverse-logit or the standard normal CDF. If a senator did not vote on a particular bill, that vote is treated as missing at random.
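As a concrete illustration, here is a minimal sketch of the vote probability in Python. The function name and arguments are my own for exposition; the model itself is exactly the formula above, with either link function:

```python
import numpy as np
from scipy.stats import norm

def vote_probability(x_u, a_d, b_d, link="probit"):
    """P(v_ud = 1): probability that a legislator with ideal point x_u
    votes Yea on a bill with offset a_d and discrimination b_d."""
    eta = a_d + b_d * x_u
    if link == "probit":                 # standard normal CDF
        return norm.cdf(eta)
    return 1.0 / (1.0 + np.exp(-eta))    # inverse-logit

# A conservative legislator (x_u = 2) facing a bill whose discrimination
# points in the conservative direction (b_d = 1.5) and a neutral offset:
p = vote_probability(2.0, 0.0, 1.5)      # close to 1
```

Note how the offset \(a_d\) shifts every legislator's probability uniformly (a universally popular bill has large \(a_d\)), while \(b_d\) controls how sharply the probability varies with ideology.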

Inference requires learning the vectors \(X, B\), and \(A\). I took a Bayesian approach and put (independent) normal priors on each variable. I then used an EM algorithm derived by Kosuke Imai et al. The E-Step and M-Step are described in full detail in the paper, and I followed their setup, except I removed senators with fewer than 50 votes, and I stopped after 500 iterations.
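To give a feel for the shape of the algorithm, here is a rough sketch of an EM loop for the probit version, not the exact derivation from the paper. Under a probit link there is a latent utility \(y^*_{ud} = a_d + b_d x_u + \varepsilon_{ud}\) with the vote being its sign; the E-step fills in the truncated-normal mean of \(y^*\) given the observed vote, and the M-steps are ridge regressions (the normal priors act as the ridge penalty). All names and the identification step are my own:

```python
import numpy as np
from scipy.stats import norm

def em_ideal_points(V, n_iter=500, prior_var=1.0, seed=0):
    """Sketch of EM for a probit ideal point model.
    V: (legislators x bills) array of 1.0/0.0 votes, np.nan where missing.
    Returns ideal points x, bill offsets a, bill discriminations b."""
    rng = np.random.default_rng(seed)
    U, D = V.shape
    obs = ~np.isnan(V)
    x = rng.normal(size=U)
    a = np.zeros(D)
    b = rng.normal(size=D)
    tau = 1.0 / prior_var            # precision of the N(0, prior_var) priors
    for _ in range(n_iter):
        # E-step: E[y* | vote] is a truncated-normal mean around mu.
        mu = a[None, :] + b[None, :] * x[:, None]
        yea = mu + norm.pdf(mu) / np.clip(norm.cdf(mu), 1e-10, None)
        nay = mu - norm.pdf(mu) / np.clip(norm.cdf(-mu), 1e-10, None)
        z = np.where(V == 1, yea, nay)
        # M-step for each bill: ridge regression of z on [1, x].
        for d in range(D):
            m = obs[:, d]
            X = np.column_stack([np.ones(m.sum()), x[m]])
            a[d], b[d] = np.linalg.solve(X.T @ X + tau * np.eye(2),
                                         X.T @ z[m, d])
        # M-step for each legislator: ridge regression of (z - a) on b.
        for u in range(U):
            m = obs[u, :]
            x[u] = (b[m] @ (z[u, m] - a[m])) / (b[m] @ b[m] + tau)
        # Crude identification: fix the location and scale of the x's
        # (the likelihood is invariant to shifting/rescaling them).
        x = (x - x.mean()) / x.std()
    return x, a, b
```

Missing votes are simply excluded from each regression, which is what "missing at random" buys us. The sign of the recovered axis is arbitrary, so whether liberals land on the left or the right of zero depends on the random initialization.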

All my code is available here.