If you closely followed the counting of the assembly elections held in November-December, you might have noticed the fluctuating fortunes in Chhattisgarh. At 11am, by which time leads for all 90 constituencies were in, the Congress was leading in 50 seats and the Bharatiya Janata Party (BJP) in 38. By noon, it was 43-43. The BJP then pulled ahead marginally (it was 47-43 in favour of the BJP at 1pm), but at 4pm it was back to a close contest at 44-43.

Finally, when all the votes had been counted, the BJP ended up with 49 seats, a comfortable majority. The Congress ended with 39 seats.

Something similar happened in Madhya Pradesh, where the Congress led in 71 seats at noon but ended up with only 58 seats, and in Rajasthan, where the Congress initially led in 32 seats but ended at 21. Thus, in each of these three states, early leads turned out to be a poor predictor of the final seat tally.

The question we seek to answer in this edition of Election Metrics is why such fluctuations happen, and under what circumstances early leads are good predictors of final seat tallies.

First up, let us look at the margins of victory in the four big states that went to polls. For each state, we will look at the proportion of the seats that were decided by a margin of less than 1% and of less than 2%. This should give us an idea of how close the contests were in each state.

From the table, we can see that Delhi (where leads didn’t swing as much as in the other three states) had more close contests than the other three states. So why is it that, despite having fewer close contests, those three states gave inconsistent results through the day of counting?

Let us examine this using an analogy. Suppose you have 100 brinjals, 60 of which are purple and 40 are green. Let us say that you place these brinjals in 10 piles of 10 brinjals each. What is the colour distribution of each pile? At one extreme (call it Case 1), you can have six piles that contain only purple brinjals and four piles with only green brinjals. At the other extreme (Case 2), each of the 10 piles will have six purple and four green brinjals. Most of the time, we will be somewhere between these two cases.

Now, let us say you start counting the brinjals pile by pile, and you use that to estimate the total number of purple and green brinjals. How well you can predict and how much your predictions fluctuate depend on how the brinjals of different colours are distributed among the 10 piles. If they are distributed as in Case 2, your forecast of the colour distribution of all the brinjals put together will be accurate from the first pile onwards. As you go through the piles, your estimate will also be consistent (since each pile has the same mix of green and purple).

If the brinjals are distributed as in Case 1, however, you might notice that you are going to be in trouble. If you look at an all-green pile first, you will declare that green is the winner by a landslide. And next, when you see a purple pile, you declare it is a close contest. Notice that when brinjals are distributed like this, it is impossible for you to make an accurate estimate of the final distribution until all the piles have been counted. Also notice that your estimate will fluctuate wildly.
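The pile-by-pile counting described above can be sketched in a few lines of Python. This is only an illustration of the analogy: the function name `running_estimates` is hypothetical, and the order in which the Case 1 piles are counted (all green piles first) is an assumption chosen to show the most misleading start.

```python
def running_estimates(piles):
    """Return the estimated purple share after each pile is counted."""
    purple = 0
    total = 0
    estimates = []
    for pile in piles:
        purple += pile.count("P")  # "P" = purple brinjal, "G" = green brinjal
        total += len(pile)
        estimates.append(purple / total)
    return estimates

# Case 1: fully segregated piles (assumed order: green piles counted first).
case1 = ["G" * 10] * 4 + ["P" * 10] * 6
# Case 2: every pile has six purple and four green brinjals.
case2 = ["P" * 6 + "G" * 4] * 10

print("Case 1:", [round(e, 2) for e in running_estimates(case1)])
print("Case 2:", [round(e, 2) for e in running_estimates(case2)])
```

In Case 2 the estimate sits at the true purple share from the first pile onwards, while in Case 1 it only reaches the true share once the last pile has been counted.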

If the “purples” and “greens” (continuing with the brinjal analogy) are evenly distributed across a constituency, the leads in the constituency are likely to be consistent across rounds. If the constituency is segregated, however (as in Case 1 of the brinjals), with all the purples in some parts and all the greens in others, the leads are likely to swing wildly and early leads are not good predictors of final results.

So how can we predict the final results in a constituency based on early results? In order to answer this, we will need to know the history and geography of a constituency, and look at how politically segregated it is. If we assume that voting preferences of large populations don’t change dramatically, then looking at whether early leads were good predictors of final results in a constituency in earlier elections can tell us whether the constituency is closer to Case 1 or Case 2. If it is closer to Case 2, then we can extrapolate early leads to predict final results. If the constituency has traditionally been a Case 1 constituency, however, it is impossible to predict based on early leads.

Now you may ask why it is impossible to predict based on early leads, which reveal preferences of nearly 10% of the constituency’s population, while it is possible to predict based on an opinion poll with a much smaller sample. The answer to this lies in randomness. In a well-designed opinion poll, the respondents are chosen at random. If you were to pick up 10 brinjals at random (with your eyes closed) it is highly likely that the colour distribution of the brinjals you have picked up closely mirrors the colour distribution of the population of brinjals. However, if you are given a pile of brinjals (which you are not allowed to choose), its composition may not mirror that of the population. This is exactly the case with counting rounds, since ballots that are put together in each round of counting are not chosen at random.
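To make the contrast between a random handful and a given pile concrete, here is a rough sketch that computes, for the 100-brinjal example, how likely a random draw of 10 is to contain each possible number of purple brinjals. It uses the standard formula for sampling without replacement; the constant and function names are illustrative only, and "within one brinjal of the true mix" is an arbitrary threshold chosen for the example.

```python
from math import comb

TOTAL, PURPLE, SAMPLE = 100, 60, 10  # figures from the brinjal example

def prob_purple_in_sample(k):
    """Probability that a random draw of SAMPLE brinjals has exactly k purple."""
    green = TOTAL - PURPLE
    return comb(PURPLE, k) * comb(green, SAMPLE - k) / comb(TOTAL, SAMPLE)

distribution = [prob_purple_in_sample(k) for k in range(SAMPLE + 1)]

# Chance that the random draw lands within one brinjal of the true mix (6 purple).
near_truth = sum(distribution[5:8])

for k, p in enumerate(distribution):
    print(f"{k:2d} purple: {p:.3f}")
print(f"5 to 7 purple: {near_truth:.3f}")
```

A pre-assembled pile in Case 1, by contrast, always contains either 0 or 10 purple brinjals, the two outcomes a random draw almost never produces. A draw of 10 is also far smaller than a real opinion poll sample, so a real poll would cluster more tightly around the true mix than this toy calculation suggests.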

To summarize, the fact that the leads in Chhattisgarh swung wildly shows us that a number of constituencies in that state are geographically polarized. It also indicates a high level of geographical group voting (where a large number of people in a geographical area vote similarly). The broadly consistent leads in Delhi, on the other hand, indicate that political vote banks there are not geographically contiguous.
