The Beat Report: How I measured the sleeping patterns of 450,000 Indians

Cleaning up the kitchen is a common pre-bedtime activity for many women even as more men look at the TV or phone screen. The story was accompanied by this powerful illustration by my colleague, Tarun Kumar Sahu. Our design team deliberated the page layout over several days, with the goal to bring the story alive. (Tarun Kumar Sahu/Mint)
Summary

In this edition of The Beat Report, Mint's Tanay Sukumar describes how he navigated a data maze to produce a massive analysis that highlights the acute gender gap in sleeping habits in India.

In The Beat Report newsletter, Mint's journalists bring you unique perspectives on their beats, breaking down new trends and developments, and sharing behind-the-scenes stories from their reporting.

Good morning!

One day in early June, as I sat across his desk in the middle of the afternoon, my boss asked me, “You look sleep-deprived. What happened?” Now, that’s not a question that usually comes my way, since I rarely mess around with my sleep. But the irony struck me: for 10 days straight, I had indeed been working overtime, and on a story about sleep itself.

The irony of losing sleep… over sleep

For a good part of the three months that have passed since then, a Bollywood song from my childhood has been making a haunting return in my head: “Neend churayi meri, kisne o sanam” (“Who stole my sleep, darling?”). Cringeworthy as the lyrics might sound now, the song added some music to my long-drawn effort on a special data story that we published in Mint last week. The story, for the first time ever, reliably quantifies India’s sleeping habits: not just how much we sleep, but also when we go to bed and wake up, what we do before bedtime, and how often our sleep gets disrupted and by what.

I didn’t set out to make any eye-popping revelations, but by the end, I think I did. You know what was unmissable? The gendered nature of sleep, or let’s call it, the sleep gender gap. Whatever story angle I innocuously touched, the gender impact startled me: nothing could have prepared me for the gulf of rest time that I found between men and women (especially in certain age groups).

And today, if you look at social media and WhatsApp groups discussing this story for the past week, you’ll see India’s women feeling vindicated as the story of their lives finally gets proven through numbers—with no one to dismiss their experience.

Outside of the world of heavy data users, the process of making such findings is little understood—because that’s the point, right, that we make it accessible to you in simple terms? But this story was the result of a weeks-long analysis, and here’s the story of how I did it (without making it all math-y, I promise!).

India finally gets a bedtime diary

From the “early to bed, early to rise” proverb to enduring mentions in the Ramayana (Kumbhakarna) and Macbeth (“the murder of sleep”), and from the siesta (Spain) to the inemuri (Japan), sleep is an integral part of our lives from childhood. Eight hours of sleep has attained the status of the daily apple in the world of medical must-dos. Yet, as I found while working on this story, there’s little credible socio-cultural research on sleep patterns in India.

Thanks to India’s national time-use survey, first held in 2019 and then in 2024, such research is now possible. The survey captures a 24-hour activity log of all respondents. Since the statistics ministry conducts it, it has the best possible sample, representative and whatnot. Once the data is out, it’s up to researchers, journalists and civil society to think of exciting ideas for using it to extract rich insights.

I used the 2024 survey. Media reporting around it has looked at the differential gender roles in households: women spend an eternity in household work; men stay out of it. But when I noticed in the survey’s official report that sleep takes up over eight hours of the day on average—one-third of our lifetime—I wanted to check if it’s the same for all groups.

Of course, it wasn’t. But I didn’t want to stop there and end up merely stating the obvious.

For such surveys, the government does release a detailed report with massive tables. But it’s impossible to cover everything in even a 1,000-page PDF. Much more can be discovered by analysing the raw “microdata”, a large dataset that stores how each respondent answered every single question in the survey.

Analysing that raw data can get you numbers for any group you want. How much do Delhiites or Mumbaikars or residents of a small town in Kerala sleep? How much time do those aged 20-25 years spend gardening? How much do married Assamese women in the 40-48 age group and having two children read as compared to unmarried women in Goa? You get the drift. You name it, you have it (not so easy though!).

A real example: the main report may give you a particular metric for age groups 6-14, 15-59 and 60+, but the raw data can help us with an age-wise progression for any age group. The following striking chart I made is a result of that: see how the sleep gender gap widens in one’s 30s!
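For readers who like to see the idea in code, slicing microdata by any group boils down to a group-wise average. The sketch below is in Python and is only an illustration, not the script used for the story. The record layout (age, gender, minutes of sleep) and every number in it are invented toy values, and a real analysis of survey microdata would likely also apply the survey's sampling weights, which are left out here.

```python
from collections import defaultdict

# Invented toy records, NOT survey data: (age, gender, minutes of sleep in the day)
records = [
    (25, "F", 500), (25, "M", 510),
    (35, "F", 470), (35, "M", 500),
    (35, "F", 480), (35, "M", 510),
]

def mean_sleep_by(rows, key):
    """Average sleep minutes for each group returned by key(age, gender)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of minutes, number of people]
    for age, gender, minutes in rows:
        group = key(age, gender)
        totals[group][0] += minutes
        totals[group][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}

# Any grouping works: single years of age, gender, or both together.
by_age_and_gender = mean_sleep_by(records, lambda age, gender: (age, gender))
gap_at_35 = by_age_and_gender[(35, "M")] - by_age_and_gender[(35, "F")]
print(by_age_and_gender)
print("Gap at age 35 (minutes, toy data):", gap_at_35)
```

Swapping the `key` function is all it takes to move from broad age bands to single years of age, which is what makes an age-wise progression possible.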

Behind the charts: the grind of sleep math

The raw data could fetch me gold, but mining it can take hours, even days, a luxury that is rarely available (and may even be unwelcome) in busy newsrooms trying to bring you the news of the day. I needed to extract and analyse 10.2 million rows of data, often working over weekends and after I was done with my priority tasks of the day.

Okay, I’ll briefly get into the geek zone, so humour me for a bit (or jump straight to the next segment, or even better, the full story!). I’ve been a data journalist for over five years, but I hadn’t attempted anything like this yet. I don’t know how to code, but with generative AI, that’s no longer a limitation. I tried “vibe coding”: genAI can write the code, but you still need to plan, replan and rewrite the full algorithm for days, and vet the data that you get, which is a massive task.

My final cleaned prompts were around 4,000 words, but that was just a fraction of all the novellas I wrote and rewrote for days together in the course of getting it right. The data had one row for each activity performed by each respondent (454,192 of them) over 24 hours; that’s how it became 10.2 million rows (for context, a Microsoft Excel sheet can hold a maximum of about 1.05 million rows). To work on my ideas, I needed convoluted algorithms (made extra challenging by the cyclical nature of time) to identify all of each person’s sleep episodes, their start and end times (i.e. bedtime and wake-up time), the activity preceding them, and any sleep disruptions.
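To give a flavour of what "identify the sleep episodes" means in practice, here is a minimal Python sketch. It is an illustration under stated assumptions, not the R code that Gemini wrote for the story. It assumes each row of a person's log is a (start, end, activity code) triple with times counted in minutes after 4am, uses 911 as the night-time sleep code, and uses 101 and 102 as placeholder codes that are not real survey codes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

SLEEP = 911  # night-time sleep, the one activity code named in the article

@dataclass
class SleepEpisode:
    start: int                  # minutes after 04:00, when the survey day begins
    end: int                    # minutes after 04:00
    preceded_by: Optional[int]  # activity code just before; None if the log opens asleep

def find_sleep_episodes(log: List[Tuple[int, int, int]]) -> List[SleepEpisode]:
    """Collapse one person's (start, end, code) rows, sorted by start, into sleep episodes."""
    episodes: List[SleepEpisode] = []
    previous_code: Optional[int] = None
    for start, end, code in log:
        if code == SLEEP:
            if previous_code == SLEEP:
                episodes[-1].end = end  # back-to-back sleep rows belong to one episode
            else:
                episodes.append(SleepEpisode(start, end, previous_code))
        previous_code = code
    return episodes

def clock(minutes: int) -> str:
    """Convert minutes after 04:00 into an HH:MM clock time."""
    total = (minutes + 4 * 60) % (24 * 60)
    return f"{total // 60:02d}:{total % 60:02d}"

# Toy log for one person; 101 and 102 are placeholder codes, not real survey codes.
toy_log = [
    (0, 120, SLEEP),      # 04:00-06:00
    (120, 1020, 101),     # 06:00-21:00
    (1020, 1080, 102),    # 21:00-22:00
    (1080, 1260, SLEEP),  # 22:00-01:00
    (1260, 1440, SLEEP),  # 01:00-04:00, logged as a separate row
]
episodes = find_sleep_episodes(toy_log)
for ep in episodes:
    print(clock(ep.start), clock(ep.end), ep.preceded_by)
```

In this sketch, the activity stored in `preceded_by` is what a "pre-bedtime activity" analysis would count, and a waking gap between two episodes is what could be read as a disruption.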

A snapshot of the data from RStudio: this is one person's partial activity log, from 4am to 4.30pm, with the V36 column denoting various activities (each activity has a three-digit code, with 911 being night-time sleeping.)

I used Google Docs to plan and write my algorithms (my biggest time-guzzler), Gemini to get the R code written, R to extract what I needed as a spreadsheet, and Excel for the final analysis. Gemini was surprisingly accurate in its coding.

Here’s one example of the data challenges I faced. Since each person’s log ran from 4am to 4am, for most people a single stretch of sleep (e.g. 10pm-6am) was listed as two separate rows (the first activity of the day, 4-6am, and the last, 10pm-4am), which I had to collapse into one row. Then, around day 7, I realised there were more possibilities I needed to account for: a person could be going to bed as late as 4am or 4.30am (I had first assumed everyone was asleep at that time), and some could sleep the full 24 hours, presumably because they were sick (which, due to a flaw in my method, was showing up as zero). There were also those who didn’t sleep at all!
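The 4am seam and its edge cases can be sketched in a few lines of Python. This is a hedged illustration of the logic described above, not the actual script behind the story. It assumes sleep spans are given as (start, end) pairs in minutes after 4am, sorted by start, and the function name `merge_wraparound` is made up for this example.

```python
from typing import List, Tuple

DAY = 24 * 60  # minutes in the 04:00-to-04:00 survey day

def merge_wraparound(spans: List[Tuple[int, int]]) -> List[Tuple[int, int, int]]:
    """Stitch sleep that straddles the 04:00 seam of the log.

    spans: sorted (start, end) sleep spans, in minutes after 04:00.
    Returns (bedtime, wake_time, minutes_slept) tuples; for the stitched
    night, wake_time is smaller than bedtime because it falls after the seam.
    """
    plain = [(start, end, end - start) for start, end in spans]
    if len(spans) < 2:
        # No sleep at all (empty list), or a single span: this covers someone
        # going to bed at 04:00 and someone asleep for the full 24 hours.
        # Nothing is stitched, so a 24-hour sleeper is not turned into zero.
        return plain
    first, last = spans[0], spans[-1]
    if first[0] == 0 and last[1] == DAY:
        # Asleep when the log opens and when it closes: treat as one night.
        night = (last[0], first[1], (DAY - last[0]) + first[1])
        return plain[1:-1] + [night]
    return plain

print(merge_wraparound([(0, 120), (1080, 1440)]))  # 22:00-06:00 split across the seam
print(merge_wraparound([(0, 1440)]))               # slept the full 24 hours
print(merge_wraparound([(0, 480)]))                # went to bed at 04:00
print(merge_wraparound([]))                        # did not sleep at all
```

The design choice worth noting is that the seam is stitched only when the person is asleep both when the log opens and when it closes; a lone span that starts at 4am is kept as a genuine 4am bedtime.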

Here's a step-by-step breakdown of how I went about it:

1. Understand the data structure and plan what is to be done, with in-built verification steps at various stages (the first draft went into almost 35 detailed steps; by the end it was over 50).

2. Convert my plan to crystal clear prompts for Gemini and ask it to write the R code.

3. Run the script in R and perform multiple data tests to check the accuracy of the results. Identify whether my prompt wasn't clear enough or Gemini got it wrong; then identify respondents whose data was causing inaccuracies, and account for new possibilities that I hadn't imagined.

4. Rectify the plan, and repeat the above steps until the extracted data passed all tests.

5. Move the data into Excel and analyse for each story angle—this took me another fortnight, replicating the full analysis using three distinct approaches to ensure I got the same numbers every time.

6. Research, plan the story flow and charts.

7. Write it.

A special discovery for math geeks: ever wondered how to get the average bedtime for a group of people? If two people go to bed at 11pm and at 1am (23:00 and 01:00), is the average 12pm or 12am? It’s 12am, right? But both midnight and noon are exactly midway between the two, depending on whether you see 1am or 11pm as coming first in the day. How do you decide which one to pick? Welcome to the cyclical nature of time, which puzzled me no end. You may find it interesting how I finally did it (check the methodology note in our full story). The methodology note also has a step-by-step summary of my full algorithm.
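One standard way to average clock times is the circular mean: place each time on a 24-hour dial, average the positions as points on a circle, and read the result back as a time. The Python sketch below shows that technique as an illustration. It is not necessarily the method used in the story, which is described in the methodology note, and the function names are made up for this example.

```python
import math
from typing import List

DAY = 24 * 60  # minutes on the 24-hour dial

def circular_mean_time(minutes_after_midnight: List[int]) -> int:
    """Average clock times as angles on a 24-hour dial (the circular mean)."""
    angles = [2 * math.pi * m / DAY for m in minutes_after_midnight]
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    if math.hypot(x, y) < 1e-9:
        # Times that cancel out exactly (e.g. 06:00 and 18:00) have no meaningful average.
        raise ValueError("times are evenly spread around the clock")
    mean_angle = math.atan2(y, x)
    return round(mean_angle / (2 * math.pi) * DAY) % DAY

def as_clock(minutes: int) -> str:
    return f"{minutes // 60:02d}:{minutes % 60:02d}"

bedtimes = [23 * 60, 1 * 60]               # 23:00 and 01:00
naive = sum(bedtimes) // len(bedtimes)     # plain arithmetic mean of the minutes
print("naive mean:   ", as_clock(naive))
print("circular mean:", as_clock(circular_mean_time(bedtimes)))
```

With 11pm and 1am, the plain arithmetic mean lands on noon, while the circular mean lands on midnight, because the two points sit close together on the dial on either side of 12am.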

The sleep gender gap is real — and wide

At the end, let me recount three of the discoveries that struck me the most. One, the sleep gender gap is the highest in the 30s. Two, cleaning up the kitchen is a common pre-bedtime activity for many women even as more men look at the TV or phone screen. Three, childcare is a big disruptor of sleep for women; it’s not so for men.

Of course, we should have known this all along (and most women have felt this); it’s just that now, thanks to the Time Use Survey data, we have some undeniably robust evidence.

I hope you enjoy reading the full story, on which I spent many sleepless nights. Our brilliant design team also took several days to think of a page design that would do justice to the topic and the gravity of the findings. It’s a pretty long read (nearly 4,000 words), but I promise you, it will be worth your time this weekend. Do write back to share how it made you feel, and do share with friends and family.
