Most election forecasters in India agree that the trickiest part of forecasting an election is converting vote shares (obtained through a survey) to seats.

Given that we follow a first-past-the-post system, sometimes you can win an election with only 21% of the popular vote (Buxar, Bihar in 2009), while at other times even 49.3% of the vote is not enough to win (Anand, Gujarat in 1999). So how would you interpret the finding that a particular party is expected to get, say, 33% of the vote in a state that normally sees a four-cornered contest?

In this edition of Election Metrics, we will use data from the last four parliamentary elections to get insights on how we can convert vote shares into seats.

We will start by comparing the votes required for victory with the number of candidates in each constituency. It is not prudent to look at the total number of candidates, for that can include candidates who get only a handful of votes.

In an earlier edition of Election Metrics, where we looked at corners of contests, we used the number of candidates who retained their deposits (that is, got over one-sixth of the total votes polled).

For this analysis, however, we will use a metric called the effective number of parties by votes (ENPV). This is derived from the vote shares of each candidate in the contest. For the more technically minded reader, if $V_1, V_2, \ldots, V_n$ are the vote shares of the different candidates for a particular seat, then

$$\mathrm{ENPV} = \frac{1}{V_1^2 + V_2^2 + \cdots + V_n^2}$$

To explain in English, we take the vote shares of each candidate in a constituency (represented as a decimal), square them and add them up. The reciprocal of this gives the effective number of parties.

For example, suppose two candidates get 51% and 49% of the votes respectively. The ENPV is given by

$$\mathrm{ENPV} = \frac{1}{0.51^2 + 0.49^2} = \frac{1}{0.5002}$$

which equals 1.9992, or just under 2. If the top two candidates get 41% and 39% of the votes, and there are 10 other candidates who get 2% of the vote each, the ENPV comes out to be 3.08 (using the above formula). It is as if the 10 candidates who got only 2% of the vote each were equivalent to one candidate.
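The arithmetic above is easy to check in a few lines of Python. This is only a minimal sketch of the formula as described in the text; the function name `enpv` is our own choice for illustration.

```python
def enpv(vote_shares):
    """Effective number of parties by votes.

    vote_shares: vote shares as decimals (0.51 for 51%).
    Returns the reciprocal of the sum of squared shares.
    """
    sum_of_squares = sum(share ** 2 for share in vote_shares)
    if sum_of_squares == 0:
        raise ValueError("at least one candidate needs a non-zero vote share")
    return 1.0 / sum_of_squares


# The two examples from the text.
two_way = enpv([0.51, 0.49])
crowded = enpv([0.41, 0.39] + [0.02] * 10)

print(round(two_way, 4))  # 1.9992
print(round(crowded, 2))  # 3.08
```

Because the shares are squared, small candidates add very little to the denominator, which is why ten candidates with 2% each count for roughly one extra "effective" candidate.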

We will start off with some naïve analysis. Let us simply compare the ENPV of a particular constituency with the vote share of the winning candidate. The hypothesis is that you can win with a lower score in constituencies with a larger number of effective candidates. The scatter plot in figure 1 (on the X-axis we have the number of candidates in a constituency as given by ENPV, on the Y-axis we have the vote share of the winner) helps us test this. Here, we see a clear trend between the number of candidates and vote share of the winner.

The above math is not robust, however. The more mathematically minded readers might notice that the vote share of the winner is itself an input in calculating ENPV, so some relationship between the two is built in. To get around this, let us compare the number of candidates with the vote share of the runner-up. How does this help? What we want is to compare the number of candidates to the votes required to win, and all the winner requires is one vote more than the runner-up! We see that in figure 2, which is similar to figure 1, except that the Y-axis has the vote share of the runner-up.
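For readers who want to build the points behind such scatter plots themselves, the sketch below turns one constituency's vote shares into the three numbers used so far: ENPV, the winner's share and the runner-up's share. The function name `contest_summary` is hypothetical, and the inputs are the two illustrative contests described earlier, not real constituency data.

```python
def contest_summary(vote_shares):
    """Return ENPV, winner share and runner-up share for one constituency.

    vote_shares: vote shares as decimals, in any order.
    """
    ranked = sorted(vote_shares, reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two candidates for a runner-up")
    enpv = 1.0 / sum(share ** 2 for share in ranked)
    return {"enpv": enpv, "winner": ranked[0], "runner_up": ranked[1]}


# The two illustrative contests from the text.
tight_two_way = contest_summary([0.49, 0.51])
crowded_field = contest_summary([0.41, 0.39] + [0.02] * 10)

for label, point in [("two-way", tight_two_way), ("crowded", crowded_field)]:
    print(label, round(point["enpv"], 2), point["runner_up"])
```

In these two examples the contest with the higher ENPV also has the lower runner-up share, which is the pattern the regression lines are meant to pick up across all constituencies.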

The relationship here is less clear, but the red regression lines show us that the vote share required to win decreases as the number of candidates increases. The question, however, is how do we predict seats based on vote share projections from a survey?

While different states have different numbers of significant political parties, we can assume that the number of significant political parties is roughly constant within a state. We can test this by regressing the ENPV across constituencies in a particular election with state as the explanatory variable. The R squared of the regression tells us how much of the variance in ENPV is explained by state. From the regression we find that for each of the last four elections, this R squared is around 50%, which is high enough to support this assumption.
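When the only explanatory variable is a category such as state, the R squared of the regression is simply one minus the ratio of within-state variation to total variation. The sketch below computes it that way in plain Python. The function name `r_squared_by_group` is our own, and the ENPV values for states "A" and "B" are invented purely to show the mechanics; they are not election data.

```python
from collections import defaultdict


def r_squared_by_group(values, groups):
    """R squared from regressing values on group membership alone."""
    overall_mean = sum(values) / len(values)
    ss_total = sum((v - overall_mean) ** 2 for v in values)
    if ss_total == 0:
        raise ValueError("values have no variance to explain")

    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    # Variation left over once each group is replaced by its own mean.
    ss_within = 0.0
    for members in by_group.values():
        group_mean = sum(members) / len(members)
        ss_within += sum((v - group_mean) ** 2 for v in members)

    return 1.0 - ss_within / ss_total


# Made-up constituency ENPVs for two hypothetical states.
toy_enpv = [2.0, 2.2, 1.8, 3.0, 3.4, 2.6]
toy_state = ["A", "A", "A", "B", "B", "B"]

r2 = r_squared_by_group(toy_enpv, toy_state)
print(round(r2, 2))  # 0.79
```

A value near 1 would mean constituencies within a state look alike; a value near 0 would mean state tells us little about how crowded a contest is.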

What we will do is to calculate the median ENPV by state (median and not mean because we don't want the odd constituency with a large number of candidates to mess up the state's average), and then round it to the nearest integer. This way, states can be classified by the number of corners of contests. Then, for each state and each political party, we will calculate the vote share and the seat share of the party in the state. Since we need meaningful numbers for seat share of a party, we will only take into consideration states with 10 or more Lok Sabha seats.

First, we will look at the corners of contests in each state in the last four elections (figure 3).
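The classification step can be sketched as follows. This assumes the input has one (state, ENPV) pair per constituency, so that the number of rows for a state equals its seat count. The function name `corners_by_state` and the states "X" and "Y" with their values are hypothetical, chosen only to show why the median is preferred.

```python
from collections import defaultdict
from statistics import median


def corners_by_state(constituencies, min_seats=10):
    """Map each sufficiently large state to its number of contest corners.

    constituencies: list of (state, enpv) pairs, one per seat.
    """
    by_state = defaultdict(list)
    for state, value in constituencies:
        by_state[state].append(value)

    corners = {}
    for state, values in by_state.items():
        if len(values) < min_seats:
            continue  # too few seats for a meaningful seat share
        # Round half up; Python's built-in round() rounds halves to even.
        corners[state] = int(median(values) + 0.5)
    return corners


# Hypothetical state X: nine ordinary seats and one very crowded one.
toy = [("X", 2.1)] * 9 + [("X", 7.5)] + [("Y", 3.0)] * 3

result = corners_by_state(toy)
mean_x = sum(v for s, v in toy if s == "X") / 10

print(result)            # {'X': 2}
print(round(mean_x, 2))  # 2.64
```

The single crowded seat pulls the mean up to 2.64, which would round to a three-cornered state, while the median keeps the state at two corners. The small state Y is dropped by the 10-seat filter.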

Next, for each state with a two-cornered contest, we will look at the vote share and seat share of each party in the state in that particular election, and plot them in a scatter plot. We will do the same for states with three- and four-cornered contests.

What should strike you from this figure is the randomness, which is a consequence of our first past the post system. For example, in a two-cornered contest in Andhra Pradesh in 1999, the Congress got 43% of the vote, but ended up with only 12% of the seats. As the number of corners of contests increases, however, the randomness reduces, thus increasing the predictive power of the model.

An opinion poll conducted by CNN-IBN and CSDS whose results were published last week predicted that in Uttar Pradesh, the Bharatiya Janata Party is likely to get 38% of the vote. The survey reported that this will translate to about 41-49 seats for the BJP. What does our model above say?

The survey predicted 17% of votes for both the Samajwadi Party and the Bahujan Samaj Party, 16% for the Congress and 5% for the Aam Aadmi Party. Based on the ENPV formula, this tells us that Uttar Pradesh is a four-party state. What does our model above say for a party getting 38% of the votes in a four-cornered contest? If you look closely at the graph for four-cornered contests (figure 4), you will notice that a 38% vote share literally falls off the chart. Only once before has a party secured over 30% of the vote in a four-cornered contest (Congress in relatively tiny Haryana in 2004, with 42%), and on that occasion it went on to get 90% of the seats (nine out of 10).
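The four-party classification can be checked directly from the survey numbers quoted above. The listed shares add up to 93%, and the text does not say how the remaining 7% is split. The sketch below therefore computes two bounds under our own assumption: one where the remainder is scattered across many tiny candidates (so it adds almost nothing to the sum of squares), and one where it all goes to a single party.

```python
# Survey vote shares for Uttar Pradesh as quoted in the text.
survey_shares = {
    "BJP": 0.38,
    "SP": 0.17,
    "BSP": 0.17,
    "Congress": 0.16,
    "AAP": 0.05,
}

listed_squares = sum(share ** 2 for share in survey_shares.values())
unlisted = 1.0 - sum(survey_shares.values())  # about 0.07, split unknown

# Upper bound: the remainder is spread so thinly it contributes ~0.
enpv_upper = 1.0 / listed_squares
# Lower bound: the remainder is one single party (an assumption).
enpv_lower = 1.0 / (listed_squares + unlisted ** 2)

print(round(enpv_lower, 2), round(enpv_upper, 2))  # 4.25 4.34
print(round(enpv_lower), round(enpv_upper))        # 4 4
```

Either way the figure rounds to four, which is consistent with treating Uttar Pradesh as a four-cornered contest for this survey.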

Given that this number (38%) falls outside the range we have seen historically for a four-cornered contest, our model cannot offer a reliable seat prediction for it. What we can say, however, is that if a party can manage to get 38% of the votes in a four-cornered state such as Uttar Pradesh, it will go on to win a lot of seats.

Just to put this in perspective, in 1998, the BJP got 36% of the popular vote in a three-cornered contest in the then undivided Uttar Pradesh, which resulted in victory in 57 out of 85 seats in the state.