Last Updated : 28 Jul, 2025
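Understanding customer preferences and restaurant trends is important for making informed business decisions in the food industry. In this article, we will analyze Zomato’s restaurant dataset using Python to find meaningful insights. We aim to answer questions such as which restaurant types are most common, which types attract the most votes, how many restaurants accept online orders, how ratings are distributed and what couples typically spend.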
The implementation follows the steps below.
Step 1: Importing necessary Python libraries. We will be using the Pandas, NumPy, Matplotlib and Seaborn libraries.
Python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Step 2: Creating the data frame.
You can download the dataset from here.
Python
dataframe = pd.read_csv("/content/Zomato-data-.csv")
print(dataframe.head())
Output:
Dataset
Step 3: Data Cleaning and Preparation
Before moving further, we need to clean and process the data.
1. Convert the rate column to a float by removing denominator characters.
Python
def handleRate(value):
    # Keep the part before the slash, e.g. '4.1/5' -> '4.1'
    value = str(value).split('/')
    value = value[0]
    return float(value)

dataframe['rate'] = dataframe['rate'].apply(handleRate)
print(dataframe.head())
Output:
Converting rate column to float
2. Get a summary of the dataframe using dataframe.info().
Python
dataframe.info()
Output:
Summary of dataset
3. Checking for missing or null values to identify any data gaps.
Python
print(dataframe.isnull().sum())
Output:
null values
There are no NULL values in the dataframe.
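The rate conversion in this step can also be written as a vectorized operation that tolerates entries which are not numbers. The following is a minimal sketch, not the article's method: the helper name `parse_rate` and the sample values (including `'NEW'` and `'-'`) are hypothetical, and whether such values occur in the actual dataset is an assumption.

```python
import pandas as pd

def parse_rate(series: pd.Series) -> pd.Series:
    """Convert strings such as '4.1/5' to floats; unparseable entries become NaN."""
    # Keep only the part before the slash, e.g. '4.1/5' -> '4.1'
    numerator = series.astype(str).str.split('/').str[0].str.strip()
    # errors='coerce' turns anything non-numeric into NaN instead of raising
    return pd.to_numeric(numerator, errors='coerce')

# Hypothetical sample values, not taken from the dataset
sample = pd.Series(['4.1/5', '3.8/5', 'NEW', '-', None])
parsed = parse_rate(sample)
print(parsed)
```

With this approach, `float()` never raises on an unexpected string; the affected rows simply show up as missing values in a later `isnull().sum()` check.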
Step 4: Exploring Restaurant Types
1. Let's look at the listed_in(type) column to identify popular restaurant categories.
Python
sns.countplot(x=dataframe['listed_in(type)'])
plt.xlabel("Type of restaurant")
Output:
Conclusion: The majority of the restaurants fall into the dining category.
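To back the count plot with numbers, the same column can be tabulated with `value_counts`. This is a small sketch on a hypothetical stand-in dataframe; the category names and counts here are made up for illustration and are not results from the Zomato data.

```python
import pandas as pd

# Hypothetical stand-in for the real dataframe
sample = pd.DataFrame({
    'listed_in(type)': ['Dining', 'Cafes', 'Dining', 'Buffet', 'Dining', 'Cafes']
})

# Absolute number of restaurants per category, most frequent first
type_counts = sample['listed_in(type)'].value_counts()
# Same table expressed as a share of all restaurants
type_share = sample['listed_in(type)'].value_counts(normalize=True)

print(type_counts)
print(type_share.round(2))
```

On the real data, replacing `sample` with `dataframe` would show how large the dining majority actually is.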
2. Votes by Restaurant Type
Here we compute the total number of votes for each category.
Python
grouped_data = dataframe.groupby('listed_in(type)')['votes'].sum()
result = pd.DataFrame({'votes': grouped_data})
plt.plot(result, c='green', marker='o')
plt.xlabel('Type of restaurant')
plt.ylabel('Votes')
Output:
Conclusion: Dining restaurants received the most votes, which suggests they are preferred by a larger number of individuals.
Step 5: Identify the Most Voted Restaurant
Find the restaurant with the highest number of votes.
Python
max_votes = dataframe['votes'].max()
restaurant_with_max_votes = dataframe.loc[dataframe['votes'] == max_votes, 'name']
print('Restaurant(s) with the maximum votes:')
print(restaurant_with_max_votes)
Output:
Highest number of votes
Step 6: Online Order Availability
Exploring the online_order column to see how many restaurants accept online orders.
Python
sns.countplot(x=dataframe['online_order'])
Output:
Conclusion: This suggests that a majority of the restaurants do not accept online orders.
Step 7: Analyze Ratings
Checking the distribution of ratings from the rate column.
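Before plotting, the distribution can also be tabulated so that the bin boundaries are explicit. The sketch below uses `pd.cut` with hand-picked edges; both the edges and the sample ratings are assumptions for illustration, and they differ from the five equal-width bins that `plt.hist` derives from the data range.

```python
import pandas as pd

# Hypothetical ratings, not taken from the dataset
sample_rates = pd.Series([2.8, 3.4, 3.6, 3.7, 3.9, 4.0, 4.2, 4.5])

# Assumed bin edges; each interval is open on the left and closed on the right
bins = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
binned = pd.cut(sample_rates, bins=bins)

# Count ratings per interval, ordered from lowest to highest interval
bin_counts = binned.value_counts().sort_index()
print(bin_counts)
```

A table like this makes it easier to state which rating range holds the most restaurants without reading values off the histogram.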
Python
plt.hist(dataframe['rate'],bins=5)
plt.title('Ratings Distribution')
plt.show()
Output:
Conclusion: The majority of restaurants received ratings ranging from 3.5 to 4.
Step 8: Approximate Cost for Couples
Analyze the approx_cost(for two people) column to find the preferred price range.
Python
couple_data=dataframe['approx_cost(for two people)']
sns.countplot(x=couple_data)
Output:
Conclusion: The most common approximate cost for two people is 300 rupees, which suggests that most couples prefer restaurants in this price range.
Step 9: Ratings Comparison - Online vs Offline Orders
Compare ratings between restaurants that accept online orders and those that don't.
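A numeric summary per group can complement the box plot. The following sketch assumes that online_order holds the values 'Yes' and 'No' and uses made-up ratings; neither is confirmed by the article.

```python
import pandas as pd

# Hypothetical stand-in for the real dataframe
sample = pd.DataFrame({
    'online_order': ['Yes', 'No', 'Yes', 'No', 'Yes', 'No'],
    'rate': [4.1, 3.5, 3.9, 3.7, 4.3, 3.3],
})

# Number of restaurants, median rating and mean rating for each order mode
summary = sample.groupby('online_order')['rate'].agg(['count', 'median', 'mean'])
print(summary)
```

The median corresponds to the line drawn inside each box, so the table and the plot can be checked against each other.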
Python
plt.figure(figsize = (6,6))
sns.boxplot(x = 'online_order', y = 'rate', data = dataframe)
Output:
Conclusion: Restaurants that only take offline orders received lower ratings than those that accept online orders, which were rated noticeably higher.
Step 10: Order Mode Preferences by Restaurant Type
Find the relationship between order mode (online_order) and restaurant type (listed_in(type)).
Python
pivot_table = dataframe.pivot_table(index='listed_in(type)', columns='online_order', aggfunc='size', fill_value=0)
sns.heatmap(pivot_table, annot=True, cmap='YlGnBu', fmt='d')
plt.title('Heatmap')
plt.xlabel('Online Order')
plt.ylabel('Listed In (Type)')
plt.show()
Output:
Conclusion: The heatmap shows that dining restaurants primarily accept offline orders, whereas cafes primarily receive online orders. This suggests that customers prefer to place orders in person at restaurants but prefer online ordering at cafes.
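Because the categories differ in size, raw counts can be hard to compare across rows. One way to check the pattern is to express each row as proportions with `pd.crosstab(..., normalize='index')`. This is a sketch on hypothetical data; the 'Yes'/'No' values and the sample rows are assumptions, not figures from the dataset.

```python
import pandas as pd

# Hypothetical stand-in for the real dataframe
sample = pd.DataFrame({
    'listed_in(type)': ['Dining', 'Dining', 'Dining', 'Cafes', 'Cafes', 'Cafes'],
    'online_order': ['No', 'No', 'Yes', 'Yes', 'Yes', 'No'],
})

# normalize='index' divides each row by its total, so every row sums to 1
share = pd.crosstab(sample['listed_in(type)'], sample['online_order'], normalize='index')
print(share.round(2))
```

The resulting table can be passed to `sns.heatmap` in place of the count-based pivot table, using `fmt='.2f'` instead of `fmt='d'` since the cells are no longer integers.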
You can download the source code from here: Zomato Data Analysis