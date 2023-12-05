Zerodha founder Nikhil Kamath, on Tuesday, posted a note on social media platform X, in which he apologized for the inconvenience caused to its users due to consecutive tech issues that happened on November 6 and December 4.

“In the business updates post I shared in August this year, I mentioned how we hadn’t had any large tech issues for a couple of years. Unfortunately, we have had two episodes in quick succession in the last two months, affecting between 5 and 20% of our active customers,” Kamath said in his post on X.

As a broker, we have multiple external dependencies. To name a few:

Exchanges and depositories

Data centres: physical & cloud

Leased lines for connectivity between exchanges and data centres

Our Execution Management System (EMS) vendor

Cloudflare for SSL

The issues on Nov 6th and Dec 4th were triggered due to edge cases with our external dependencies. This is no excuse, and I understand that, as a platform, we are responsible for all the issues you face. But I wanted to share with you what went wrong and what we are doing about it.

The Nov 6th issue was due to an unscheduled update in the anti-malware monitoring service from our EMS vendor, which started throttling our servers. You can check the detailed RCA here.

Yesterday’s issue seems to be because of an exponentially larger number of customer password reset requests that caused login issues. On Monday morning, the system that notifies users of logins from new geographical locations based on IP addresses sent out an unexpectedly large number of alerts. We discovered that this was the result of an increase in the geo-location accuracy of the IP/geo-location database that we use. A routine update of this database happened over the weekend. We believe this led to a large influx of password reset requests from confused users, putting a strain on our login systems and resulting in login failures. We will share a detailed RCA on the disclosure page soon.
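
To illustrate the mechanism described above, here is a minimal Python sketch of how a more precise IP/geo-location database can make unchanged logins look like logins from new places. It is not Zerodha's implementation: the function name, the user IDs, the IP addresses (taken from a documentation-only range) and the city mappings are all hypothetical, and the alert rule (notify when the resolved city differs from the last one seen) is an assumption.

```python
# Hypothetical IP -> city lookups before and after a routine database update.
# The old database resolves every address to one coarse location; the new,
# more accurate one resolves the same addresses to different cities.
OLD_GEO_DB = {
    "203.0.113.10": "Bengaluru",
    "203.0.113.20": "Bengaluru",
    "203.0.113.30": "Bengaluru",
}
NEW_GEO_DB = {
    "203.0.113.10": "Mysuru",
    "203.0.113.20": "Bengaluru",
    "203.0.113.30": "Hubballi",
}

# Last city recorded for each (hypothetical) user, computed with the old database.
LAST_SEEN_CITY = {"u1": "Bengaluru", "u2": "Bengaluru", "u3": "Bengaluru"}

# Monday's logins: each user connects from the same IP address as before.
LOGINS = [("u1", "203.0.113.10"), ("u2", "203.0.113.20"), ("u3", "203.0.113.30")]


def new_location_alerts(last_seen_city, logins, geo_db):
    """Return the user IDs whose login resolves to a city other than the last one seen."""
    alerts = []
    for user_id, ip in logins:
        city = geo_db.get(ip, "unknown")
        if last_seen_city.get(user_id) != city:
            alerts.append(user_id)
    return alerts


alerts_before = new_location_alerts(LAST_SEEN_CITY, LOGINS, OLD_GEO_DB)
alerts_after = new_location_alerts(LAST_SEEN_CITY, LOGINS, NEW_GEO_DB)
print(len(alerts_before), len(alerts_after))
```

In this toy setup nobody has moved, yet two of the three users would be alerted after the update, because the stored location came from the coarser database. An alert like that can prompt a user to reset a password as a precaution, which is the kind of influx the note describes.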

We now have put in place fixes to ensure these types of cases don’t affect our platform in the future. While we continuously put a lot of effort into ensuring all types of scenarios are factored in proactively, it is impossible for any technology platform to cover all edge cases. A large number of users attempting to reset their passwords due to a notification triggered by an increase in IP geolocation accuracy was one of those scenarios. Please be assured that, as a team, we are doing whatever we can to ensure the platform’s stability.

Social media has been a double-edged sword for us. We have significantly benefited from it since it made interacting with our customers easier and helped spread the word about our business. But the flip side is that we get a disproportionate number of tweets on a day like yesterday compared to the issue. On a normal day, we receive about 3500 to 4000 customer tickets, and about 10 lakh users trade daily. Yesterday, we had about 7700 tickets, an increase of about 4000. We believe that ~10% of active users faced disruptions intermittently for about 30 minutes. I am in no way saying this to minimise the issues traders faced, but just to highlight that we have a younger and more vocal online audience. Given that the markets moved significantly, there was a lot of interest from non-transacting users logging in to check their portfolios, making the disruption seem much larger than otherwise.

Once again, I want to assure you that the stability and reliability of the platform are our top priorities. We are working hard towards that in whatever way possible. We are extremely sorry for the inconvenience to those who were affected. If there have been any losses due to the incidents, create a ticket, and our team will try to get back to you as soon as possible with the best way to resolve them.

Sorry again,

