Flipkart: Using Machine Learning to solve unique problems in Indian E-commerce

Home addresses in India pose a uniquely Indian problem: a lack of standardization. This is a challenge for e-commerce players, whose success relies on efficient last-mile logistics. This post looks at how Flipkart, an Indian e-commerce major, is using machine learning (ML) to make sense of complex Indian addresses and iron out the associated inefficiencies. We also look at other key areas of ML application for e-commerce companies.

Flipkart, launched in 2007, is the largest e-commerce marketplace in India. Earlier this year, Walmart acquired a 77% stake in Flipkart for ~16B USD [1]. India has a population of 1.3B people, and its e-commerce market is expected to reach 200B USD by 2026 [2]. Low internet penetration, low credit card and e-payment penetration, and poor last-mile connectivity have led to innovations such as Cash on Delivery, No Cost EMI and easy returns.

E-commerce is a thin-margin business, and at scale, reliance on manual operations hurts the bottom line. In this context, Flipkart has been using ML to gain insights into consumer shopping habits and to optimize product prices based on customer demand, while also applying those insights in areas such as logistics to better manage its supply chain. [3]

Leveraging ML to address the Indian problem of addresses

In the absence of a formal geo-spatial classification, addresses in India are organized by Postal Index Number (PIN code). However, a single PIN code may cover an area of 50 sq. km. Factors like lower literacy rates lead customers to enter incorrect PIN codes or make spelling errors while filling in addresses, which magnifies the problem. Many Indian place names are transliterated phonetically from local languages into English, resulting in variable spellings. Each city uses different labels, such as “blocks”, “mains”, “sectors” or “phases”, to designate street addresses, with or without house numbers, house names or street names. New suburbs added to the delivery network may not adhere to traditional naming conventions. [4]

Source [4]

These issues hurt a business whose delivery model hinges on the speed and efficiency of last-mile logistics. Delayed delivery erodes customer satisfaction, and improper addresses cause inefficient delivery routing, placing a high load on the logistics network. The result is higher logistics, transport and customer support costs, which hurts profitability. [4]

Currently, personnel at Flipkart’s delivery hubs rely on tacit knowledge, identifying the addresses intended for each delivery hub based on PIN code and location information. Last-mile delivery staff are highly familiar with the addresses on their routes. The data science team at Flipkart had to understand how this knowledge could be converted into an ML model. The team started by approximating the mental model that sorting personnel use to classify addresses correctly, and then built an ML model that would attempt to learn incrementally from these labels. [4]
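
To make the idea of learning incrementally from labels concrete, here is a minimal Python sketch. It is not Flipkart's model: the `HubSorter` class, the hub names and the sample addresses are all hypothetical, and simple token counting stands in for whatever the team actually built. It only shows how each address labeled by sorting staff can update a classifier one example at a time.

```python
from collections import Counter, defaultdict


class HubSorter:
    """Toy incremental classifier: counts how often each token appears per hub.

    Hypothetical illustration only, not Flipkart's implementation.
    """

    def __init__(self):
        # hub name -> Counter of tokens seen in addresses labeled with that hub
        self.token_counts = defaultdict(Counter)

    @staticmethod
    def tokens(address):
        """Lowercase the address and split it on commas and whitespace."""
        return address.lower().replace(",", " ").split()

    def learn(self, address, hub):
        """Update the counts from one address labeled by sorting staff."""
        self.token_counts[hub].update(self.tokens(address))

    def predict(self, address):
        """Return the hub whose past addresses share the most tokens, or None."""
        toks = self.tokens(address)
        scores = {
            hub: sum(counts[t] for t in toks)
            for hub, counts in self.token_counts.items()
        }
        best = max(scores, key=scores.get, default=None)
        if best is None or scores[best] == 0:
            return None  # nothing learned yet, or no token overlap at all
        return best


# Illustrative labels only; the hub names and addresses are made up.
sorter = HubSorter()
sorter.learn("5th Main, HSR Layout, 560102", "hub_south")
sorter.learn("Sector 2, HSR Layout", "hub_south")
sorter.learn("Phase 1, Electronic City, 560100", "hub_east")
print(sorter.predict("HSR Layout 3rd Sector"))
```

Because `learn` only increments counts, every newly labeled address refines the sorter without retraining from scratch, which mirrors the incremental learning described above in a very simplified form.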

The team worked with field executives and supervisors at the hub to collect and validate data sets, and then labeled these addresses. The team then built graphical ML models that process customer address records, using both supervised and unsupervised techniques, to learn the names of cities, localities, sub-localities and buildings that exist in a given region, along with their hierarchical relations and alternate spellings. In this way the company has built a model of the locality features that people commonly write in addresses, and the model gains accuracy with each additional delivery. It will help Flipkart reduce its dependency on manual labor to sort packages in its hubs. [4] [5]
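
As a rough illustration of resolving alternate spellings to a known locality and its parent city, the Python sketch below uses string similarity from the standard library's `difflib`. This is a swapped-in technique rather than the graphical models the sources describe, and the `GAZETTEER` contents, function names and similarity cutoff are assumptions made for the example.

```python
from difflib import get_close_matches

# Hypothetical gazetteer: canonical locality -> parent city (illustrative data).
GAZETTEER = {
    "koramangala": "bengaluru",
    "indiranagar": "bengaluru",
    "whitefield": "bengaluru",
    "andheri": "mumbai",
}


def canonical_locality(token, cutoff=0.8):
    """Return (locality, city) for the closest known locality, or None.

    The cutoff of 0.8 is an assumed threshold, not a tuned value.
    """
    matches = get_close_matches(token.lower(), list(GAZETTEER), n=1, cutoff=cutoff)
    if not matches:
        return None
    return matches[0], GAZETTEER[matches[0]]


def resolve(address):
    """Scan the address tokens and return the first locality match found."""
    for token in address.replace(",", " ").split():
        hit = canonical_locality(token)
        if hit is not None:
            return hit
    return None


# A misspelled locality still resolves to its canonical name and city.
print(resolve("12, 4th Block, Kormangala, Bangalore"))
```

A learned model can go well beyond this, for example by discovering new locality names from the data itself, but the sketch shows why a hierarchy of known names plus tolerance for spelling variants is useful when sorting addresses.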

Leveraging ML in other areas, and the challenges

To use ML effectively in other areas, Flipkart last year unveiled its AI For India program to focus its AI and ML efforts. With huge amounts of data already collected, and more being added daily, Flipkart is experimenting with ML in areas like catalog quality, product size recommendations and preventing fraudulent orders. Its wholly owned subsidiary Myntra, a fashion-only platform, is using ML to automatically select and create new designs based on analysis of sales trends. Flipkart has also partnered with Microsoft to leverage the AI, machine learning and analytics capabilities in Azure, such as Cortana Intelligence Suite and Power BI, to optimize its data for merchandising, advertising, marketing and customer service, a partnership intended to benefit the company in the years to come. [6]

One of the key challenges is the lack of talent in India. Estimates of the number of data scientists and ML professionals in India vary, but the number is below 1,000. In comparison, Amazon has 5,000 people working on the Alexa platform alone [7]. Flipkart should look to partner with leading universities in India to add key ML courses to the curriculum. There is also the option of building data science centers in areas like Silicon Valley, where talent is plentiful, though this may not scale.

With regard to the specific problem of addresses, one key open question is the scalability of these algorithms. Currently, the tier 1 metro cities (such as New Delhi and Mumbai) account for ~60% of e-commerce demand. How should Flipkart think about scaling its address algorithms as many newer cities start to contribute more to sales?

(Word Count 795)


1. Walmart News, “Walmart and Flipkart Announce Completion of Walmart Investment in Flipkart, India’s Leading Marketplace eCommerce Platform”, https://news.walmart.com/2018/08/18/walmart-and-flipkart-announce-completion-of-walmart-investment-in-flipkart-indias-leading-marketplace-ecommerce-platform , accessed Nov 2018
2. LiveMint, “India’s e-commerce market to hit $200 billion by 2026: Morgan Stanley report”, https://www.livemint.com/Industry/9iUxlQZ4iHwPiXRKscx3LK/Indias-ecommerce-market-to-grow-30-to-200-billion-by-202.html , accessed Nov 2018
3. Flipkart Group, “FLIPKART WANTS DATA SCIENTISTS AND ENGINEERS TO BUILD AI FOR INDIA: SACHIN BANSAL”, https://stories.flipkart.com/flipkart-ai-india-sachin-bansal/ , accessed Nov 2018
4. Flipkart Group, “WITH AI & ML, FLIPKART IS ADDRESSING THE UNIQUELY INDIAN PROBLEM OF PROBLEM ADDRESSES”, https://stories.flipkart.com/ai-ml-flipkart-indian-address/ , accessed Nov 2018
5. Medium, “Learning to Decode Unstructured Indian Addresses”, https://medium.com/@kabirrustogi/learning-to-decode-unstructured-indian-addresses-c80ffcda2e84 , accessed Nov 2018
6. TechEmergence, “Artificial Intelligence at India’s Top eCommerce Firms – Use Cases from Flipkart, Myntra, and Amazon India”, https://www.techemergence.com/artificial-intelligence-at-indias-top-ecommerce-firms-use-caes-from-flipkart-myntra-and-amazon-india/ , accessed Nov 2018
7. FactorDaily, “How machine thinking is transforming Flipkart”, https://factordaily.com/flipkart-ai-for-india/ , accessed Nov 2018


Student comments on Flipkart: Using Machine Learning to solve unique problems in Indian E-commerce

  1. Great choice of topic and a compelling illustration of how ML can unlock significant value. If 60% of business is from Tier 1 cities (for which algorithms are already being worked on), it seems the other 40% would represent enough volume (if not too fragmented across geographies) to develop comparable algorithms to solve this issue. I’d also expect there could be some learnings from the Tier 1 markets that are transferable. Overall, I push back on the concern about scalability for the reasons above, particularly paired with the Moore’s Law-esque improvements in ML technology and the growing popularity of ML education we’re seeing.
