Thursday, 27 March 2025

DATASETS FOR CASE STUDIES - MS Excel, MS Power BI, Python, R, and other analytical visualization tools


📂 DATASETS FOR CASE STUDIES

Here are the datasets for the case studies and hands-on activities. They are provided as plain-text tables for easy use with Power BI, Excel, and Python. You can connect the application to this blog's URL to access all the tables, or copy and paste individual tables as required.





1️⃣ Sales Performance Analysis (Retail Chain)

| Date | Store | Product | Category | Revenue | Profit | Units Sold |
|---|---|---|---|---|---|---|
| 2024-01-05 | Store A | Laptop | Electronics | 85000 | 12000 | 5 |
| 2024-02-10 | Store B | Mobile | Electronics | 55000 | 8000 | 10 |
| 2024-03-15 | Store A | Shoes | Fashion | 12000 | 3000 | 20 |
| 2024-04-20 | Store C | TV | Electronics | 95000 | 15000 | 4 |
| 2024-05-25 | Store B | Sofa | Furniture | 67000 | 10000 | 3 |

📌 Columns: Date, Store, Product, Category, Revenue, Profit, Units Sold
🎯 Use Cases: Bar charts (store-wise sales), Line charts (sales trends), Tree maps (top products)
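
As a starting point for the store-wise bar chart, here is a minimal sketch that totals revenue per store using only Python's standard library. The rows are inlined as CSV text copied from the table above; the names `SALES_CSV` and `revenue_by_store` are illustrative choices, not part of the dataset.

```python
import csv
import io
from collections import defaultdict

# The five sample rows from the table above, inlined as CSV text.
SALES_CSV = """Date,Store,Product,Category,Revenue,Profit,Units Sold
2024-01-05,Store A,Laptop,Electronics,85000,12000,5
2024-02-10,Store B,Mobile,Electronics,55000,8000,10
2024-03-15,Store A,Shoes,Fashion,12000,3000,20
2024-04-20,Store C,TV,Electronics,95000,15000,4
2024-05-25,Store B,Sofa,Furniture,67000,10000,3
"""

sales_rows = list(csv.DictReader(io.StringIO(SALES_CSV)))

# Sum revenue per store; these totals are the bar heights.
revenue_by_store = defaultdict(int)
for row in sales_rows:
    revenue_by_store[row["Store"]] += int(row["Revenue"])

for store, revenue in sorted(revenue_by_store.items()):
    print(f"{store}: {revenue}")
```

The same grouping works for `Category` or `Product` if you change the key used in the loop.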


2️⃣ Employee Attrition (IT Company)

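| Employee ID | Age | Department | Salary | Satisfaction Score | Attrition |
|---|---|---|---|---|---|
| E101 | 25 | IT | 50000 | 3.5 | Yes |
| E102 | 30 | HR | 60000 | 4.2 | No |
| E103 | 27 | Sales | 55000 | 3.0 | Yes |
| E104 | 35 | IT | 70000 | 4.8 | No |
| E105 | 40 | Marketing | 65000 | 4.0 | Yes |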

📌 Columns: Employee ID, Age, Department, Salary, Satisfaction Score, Attrition (Yes/No)
🎯 Use Cases: Pie charts (attrition rate), Scatter plots (salary vs attrition), Machine learning (predict attrition)
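
The attrition rate behind the pie chart can be computed directly from the `Attrition` column. The sketch below also compares the average satisfaction score of employees who left with those who stayed; the variable names are illustrative, and five rows are only enough to demonstrate the calculation, not to draw conclusions.

```python
import csv
import io
from statistics import mean

# The five sample rows from the table above, inlined as CSV text.
ATTRITION_CSV = """Employee ID,Age,Department,Salary,Satisfaction Score,Attrition
E101,25,IT,50000,3.5,Yes
E102,30,HR,60000,4.2,No
E103,27,Sales,55000,3.0,Yes
E104,35,IT,70000,4.8,No
E105,40,Marketing,65000,4.0,Yes
"""

employees = list(csv.DictReader(io.StringIO(ATTRITION_CSV)))
leavers = [e for e in employees if e["Attrition"] == "Yes"]
stayers = [e for e in employees if e["Attrition"] == "No"]

# Share of employees marked "Yes" (the "Yes" slice of the pie chart).
attrition_rate = len(leavers) / len(employees)

# Average satisfaction score within each group.
satisfaction_leavers = mean(float(e["Satisfaction Score"]) for e in leavers)
satisfaction_stayers = mean(float(e["Satisfaction Score"]) for e in stayers)

print(f"Attrition rate: {attrition_rate:.0%}")
print(f"Mean satisfaction (left): {satisfaction_leavers:.2f}")
print(f"Mean satisfaction (stayed): {satisfaction_stayers:.2f}")
```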


3️⃣ Customer Segmentation (Banking)

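| Customer ID | Age | Income | Credit Score | Spending Score | Account Balance |
|---|---|---|---|---|---|
| C201 | 24 | 30000 | 650 | 45 | 20000 |
| C202 | 45 | 70000 | 780 | 80 | 100000 |
| C203 | 33 | 45000 | 720 | 60 | 50000 |
| C204 | 50 | 85000 | 800 | 90 | 150000 |
| C205 | 29 | 40000 | 690 | 55 | 25000 |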

📌 Columns: Customer ID, Age, Income, Credit Score, Spending Score, Account Balance
🎯 Use Cases: Clustering (K-Means in Python), Scatter plots (customer segmentation), Box plots (spending behavior)
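
To show what K-Means does with this table, here is a small pure-Python sketch of Lloyd's algorithm on two columns, Income and Spending Score. Several choices are assumptions made for illustration: two clusters, min-max scaling, and starting centroids placed at the lowest and highest scaled points. In practice you would more likely use a library such as scikit-learn.

```python
import math

# (customer_id, income, spending_score) taken from the table above.
customers = [
    ("C201", 30000, 45),
    ("C202", 70000, 80),
    ("C203", 45000, 60),
    ("C204", 85000, 90),
    ("C205", 40000, 55),
]


def min_max_scale(values):
    """Rescale values to [0, 1] so income does not dominate the distance."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


points = list(zip(
    min_max_scale([c[1] for c in customers]),
    min_max_scale([c[2] for c in customers]),
))


def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Assumed initialisation: the smallest and largest scaled points.
labels, centroids = k_means(points, [min(points), max(points)])

for (customer_id, _, _), label in zip(customers, labels):
    print(customer_id, "-> cluster", label)
```

With this starting point the customers split into a lower income, lower spending group (C201, C203, C205) and a higher income, higher spending group (C202, C204).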


4️⃣ Credit Card Fraud Detection

| Transaction ID | Amount | Location | Customer ID | Fraudulent |
|---|---|---|---|---|
| T5001 | 5000 | New York | C101 | No |
| T5002 | 12000 | London | C102 | Yes |
| T5003 | 4500 | Mumbai | C103 | No |
| T5004 | 15000 | Paris | C104 | Yes |
| T5005 | 7000 | Tokyo | C105 | No |

📌 Columns: Transaction ID, Amount, Location, Customer ID, Fraudulent (Yes/No)
🎯 Use Cases: Box plots (outlier detection), Decision Trees (fraud classification), Heat maps (fraud hotspots)
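
A decision tree is built from simple threshold rules, so a one-level tree (a "decision stump") is a useful first look at the idea. The sketch below searches for the single `Amount` threshold that best separates fraudulent from legitimate transactions in the table above. The function name `fit_stump` is hypothetical, and five rows are far too few to train a real classifier; this only illustrates the mechanics.

```python
# (transaction_id, amount, is_fraud) taken from the table above.
transactions = [
    ("T5001", 5000, False),
    ("T5002", 12000, True),
    ("T5003", 4500, False),
    ("T5004", 15000, True),
    ("T5005", 7000, False),
]


def fit_stump(samples):
    """Return (threshold, accuracy) for the rule 'amount > threshold means fraud'."""
    amounts = sorted({amount for _, amount, _ in samples})
    # Candidate thresholds are the midpoints between neighbouring amounts.
    candidates = [(a + b) / 2 for a, b in zip(amounts, amounts[1:])]
    best = None
    for threshold in candidates:
        correct = sum(
            (amount > threshold) == is_fraud for _, amount, is_fraud in samples
        )
        accuracy = correct / len(samples)
        if best is None or accuracy > best[1]:
            best = (threshold, accuracy)
    return best


threshold, accuracy = fit_stump(transactions)
print(f"Flag as fraud when Amount > {threshold} (training accuracy {accuracy:.0%})")
```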


5️⃣ Student Performance Prediction

| Student ID | Attendance (%) | Study Hours | Previous Grades | Final Score |
|---|---|---|---|---|
| S101 | 90 | 6 | 85 | 88 |
| S102 | 75 | 4 | 70 | 72 |
| S103 | 80 | 5 | 78 | 79 |
| S104 | 95 | 7 | 90 | 92 |
| S105 | 60 | 3 | 65 | 68 |

📌 Columns: Student ID, Attendance (%), Study Hours, Previous Grades, Final Score
🎯 Use Cases: Correlation analysis (attendance vs performance), Regression (predict final score)
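
Both use cases can be tried by hand on the attendance and final-score columns. The sketch below computes the Pearson correlation coefficient and fits a least-squares line; the helper name `pearson_and_line` is illustrative, and a single-variable line is only one of several reasonable regression setups for this table.

```python
import math

# Columns taken from the table above (same student order).
attendance = [90, 75, 80, 95, 60]
final_score = [88, 72, 79, 92, 68]


def pearson_and_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


r, slope, intercept = pearson_and_line(attendance, final_score)
print(f"Pearson r = {r:.3f}")
print(f"Final Score ~ {intercept:.1f} + {slope:.2f} * Attendance")

# Predicted final score for a hypothetical student with 85% attendance.
predicted = intercept + slope * 85
print(f"Prediction at 85% attendance: {predicted:.1f}")
```

On these five rows the fitted line has a slope of 0.72 and an intercept of 22.2, so each extra percentage point of attendance goes with roughly 0.72 more points on the final score.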


6️⃣ Air Pollution Trends

| Date | City | PM2.5 Level | Temperature | Health Cases Reported |
|---|---|---|---|---|
| 2024-01-01 | Delhi | 180 | 15 | 120 |
| 2024-02-01 | Beijing | 220 | 10 | 200 |
| 2024-03-01 | New York | 90 | 18 | 50 |
| 2024-04-01 | London | 70 | 12 | 30 |
| 2024-05-01 | Tokyo | 110 | 20 | 75 |

📌 Columns: Date, City, PM2.5 Level, Temperature, Health Cases Reported
🎯 Use Cases: Time series forecasting (pollution trends), Heat maps (pollution hotspots)
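
In this sample each date belongs to a different city, so the rows are better suited to comparing cities than to forecasting a single series. As a first step toward a hotspot view, the sketch below ranks the cities by PM2.5 level; the variable names are illustrative.

```python
import csv
import io

# The five sample rows from the table above, inlined as CSV text.
POLLUTION_CSV = """Date,City,PM2.5 Level,Temperature,Health Cases Reported
2024-01-01,Delhi,180,15,120
2024-02-01,Beijing,220,10,200
2024-03-01,New York,90,18,50
2024-04-01,London,70,12,30
2024-05-01,Tokyo,110,20,75
"""

readings = list(csv.DictReader(io.StringIO(POLLUTION_CSV)))

# Highest PM2.5 first: the top rows are the candidate hotspots.
ranked = sorted(readings, key=lambda r: int(r["PM2.5 Level"]), reverse=True)

for r in ranked:
    print(f'{r["City"]}: PM2.5 {r["PM2.5 Level"]}, health cases {r["Health Cases Reported"]}')
```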


7️⃣ Waterfall Chart Dataset – This dataset represents a company's monthly profit & loss statement, showing how revenue and expenses build up to the net profit/loss.

| Category | Amount |
|---|---|
| Revenue | 50000 |
| COGS | -20000 |
| Gross Profit | 30000 |
| Operating Expenses | -10000 |
| Net Profit | 20000 |
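
A waterfall chart draws each change as a floating bar that starts where the previous one ended, while totals are drawn from zero. The sketch below derives those bar positions from the table. Treating Gross Profit and Net Profit as total rows is an assumption made here, since the dataset does not mark them; the code checks that each one equals the running sum.

```python
# (category, amount) taken from the table above.
pnl_rows = [
    ("Revenue", 50000),
    ("COGS", -20000),
    ("Gross Profit", 30000),
    ("Operating Expenses", -10000),
    ("Net Profit", 20000),
]

# Assumption: these rows are subtotals/totals, not changes.
TOTAL_ROWS = {"Gross Profit", "Net Profit"}

running = 0
bars = []  # (category, bar_start, bar_end)
for category, amount in pnl_rows:
    if category in TOTAL_ROWS:
        # A total bar runs from zero and should match the running sum so far.
        if amount != running:
            raise ValueError(f"{category} does not match running total {running}")
        bars.append((category, 0, amount))
    else:
        # A change bar floats from the previous running total to the new one.
        bars.append((category, running, running + amount))
        running += amount

for category, start, end in bars:
    print(f"{category}: {start} -> {end}")
```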


8️⃣ Standard Deviation Dataset – This dataset includes students' test scores across multiple subjects, allowing you to calculate the mean, variance, and standard deviation.

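| Student | Math | Science | English |
|---|---|---|---|
| Student 1 | 78 | 82 | 74 |
| Student 2 | 85 | 79 | 80 |
| Student 3 | 92 | 91 | 85 |
| Student 4 | 88 | 87 | 78 |
| Student 5 | 76 | 85 | 82 |
| Student 6 | 81 | 90 | 88 |
| Student 7 | 95 | 94 | 91 |
| Student 8 | 89 | 83 | 76 |
| Student 9 | 84 | 88 | 84 |
| Student 10 | 91 | 86 | 79 |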
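
The mean, variance, and standard deviation for each subject can be computed with Python's built-in `statistics` module. This sketch uses the sample formulas (dividing by n - 1), which is an assumption about how the ten students are meant to be treated; if they are the whole population, `pvariance` and `pstdev` are the matching functions.

```python
from statistics import mean, stdev, variance

# Scores per subject, in Student 1 to Student 10 order, from the table above.
scores = {
    "Math": [78, 85, 92, 88, 76, 81, 95, 89, 84, 91],
    "Science": [82, 79, 91, 87, 85, 90, 94, 83, 88, 86],
    "English": [74, 80, 85, 78, 82, 88, 91, 76, 84, 79],
}

# Sample statistics: variance divides by (n - 1); stdev is its square root.
summary = {
    subject: {
        "mean": mean(values),
        "variance": variance(values),
        "stdev": stdev(values),
    }
    for subject, values in scores.items()
}

for subject, stats in summary.items():
    print(
        f"{subject}: mean={stats['mean']:.1f}, "
        f"variance={stats['variance']:.2f}, stdev={stats['stdev']:.2f}"
    )
```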

📥 DOWNLOAD DATASETS 😊

Please share your requirements for datasets or chart types, and we'll help you with the best possible solution.
