AQI Prediction & Forecasting

Machine Learning 2024
XGBoost, LSTM, ARIMA, Streamlit, Pandas

About the Project

This project supports environmental monitoring by predicting and forecasting the Air Quality Index (AQI) for 203 cities.

It combines traditional machine learning models with time-series forecasting techniques to provide accurate, real-time pollution insights via an interactive dashboard.

Project Goals

  • To predict AQI with high accuracy using pollution and weather data.
  • To forecast future air quality trends using time-series analysis.
  • To visualize real-time data on an interactive map and dashboard.
  • To implement MLOps practices for model monitoring and retraining.

Technologies Used

  • XGBoost & Random Forest: For regression tasks and AQI prediction.
  • LSTM, GRU & ARIMA: For time-series forecasting of pollution levels.
  • Pandas & NumPy: For preprocessing 575k+ pollution and weather records.
  • Streamlit: For deploying the interactive user dashboard.
  • MLOps: Pipelines for model drift detection and automated retraining.
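
One common way to detect the kind of drift mentioned above is the Population Stability Index (PSI), which compares the distribution of a feature at training time with its distribution in recent data. The sketch below is an illustration only: the function names, the synthetic data, and the 0.2 threshold (a widely used rule of thumb) are assumptions, not details taken from the project's pipeline.

```python
import numpy as np

def population_stability_index(reference, current, bins=10):
    """PSI between a reference sample and a current sample of one feature."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Bin edges come from reference quantiles, so each bin holds roughly
    # the same share of the reference data.
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    # Use interior edges only: values outside the reference range land
    # in the first or last bin instead of being dropped.
    ref_idx = np.searchsorted(edges[1:-1], reference, side="right")
    cur_idx = np.searchsorted(edges[1:-1], current, side="right")
    ref_frac = np.bincount(ref_idx, minlength=bins) / len(reference)
    cur_frac = np.bincount(cur_idx, minlength=bins) / len(current)
    # Clip to avoid log(0) when a bin is empty.
    eps = 1e-6
    ref_frac = np.clip(ref_frac, eps, None)
    cur_frac = np.clip(cur_frac, eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

def needs_retraining(reference, current, threshold=0.2):
    """Flag drift when PSI exceeds the (assumed) threshold."""
    return population_stability_index(reference, current) > threshold

# Synthetic data for illustration; not project measurements.
rng = np.random.default_rng(42)
baseline = rng.normal(100, 20, 5000)   # feature values at training time
same = rng.normal(100, 20, 5000)       # recent data, same distribution
shifted = rng.normal(140, 20, 5000)    # recent data after a shift

print(needs_retraining(baseline, same))
print(needs_retraining(baseline, shifted))
```

In a retraining pipeline, a check like `needs_retraining` would typically run on a schedule for each input feature and trigger a training job when it returns `True`.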

Approach

Data Engineering

  • Preprocessed over 575,000 (5.75 lakh) pollution and weather records.
  • Engineered fusion features combining pollution and weather data.
  • Performed statistical analysis to identify key drivers of AQI.
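
The fusion-feature step above could look roughly like the following pandas sketch. The column names (`city`, `timestamp`, `pm25`, `humidity`, `wind_speed`) and the specific derived features are hypothetical; the project's actual schema and feature set are not described here.

```python
import pandas as pd

def build_fusion_features(pollution: pd.DataFrame, weather: pd.DataFrame) -> pd.DataFrame:
    """Join pollution and weather readings and derive combined features."""
    # Align the two sources on city and timestamp; keep only matched rows.
    df = pollution.merge(weather, on=["city", "timestamp"], how="inner")
    df = df.sort_values(["city", "timestamp"]).reset_index(drop=True)

    # Interaction feature: particulate level scaled by relative humidity.
    df["pm25_x_humidity"] = df["pm25"] * df["humidity"] / 100.0
    # Dispersion proxy: pollutant level per unit of wind speed
    # (the +1 avoids division by zero on calm days).
    df["pm25_per_wind"] = df["pm25"] / (df["wind_speed"] + 1.0)
    # Per-city lag so a model can see the previous reading.
    df["pm25_lag1"] = df.groupby("city")["pm25"].shift(1)
    return df

# Tiny made-up sample; values are illustrative only.
pollution = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02"] * 2),
    "pm25": [180.0, 150.0, 60.0, 75.0],
})
weather = pd.DataFrame({
    "city": ["A", "A", "B", "B"],
    "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02"] * 2),
    "humidity": [50.0, 80.0, 40.0, 60.0],
    "wind_speed": [1.0, 4.0, 2.0, 0.0],
})

features = build_fusion_features(pollution, weather)
print(features)
```

Grouping by `city` before shifting keeps the lag from leaking one city's reading into another city's first row, which is left as missing instead.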

Modeling

  • Built Random Forest and XGBoost models for current AQI prediction.
  • Implemented ARIMA, GRU, and LSTM networks for forecasting future trends.
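
As a minimal sketch of the regression side of this step, the example below trains a scikit-learn `RandomForestRegressor` on synthetic data with a chronological train/test split. The features, target formula, and hyperparameters are assumptions made for illustration and do not reproduce the project's models or results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score

rng = np.random.default_rng(0)
n = 500
# Synthetic stand-ins for pollutant and weather features (not project data).
X = rng.uniform(0, 1, size=(n, 3))
# Synthetic target: a made-up function of the features plus noise.
y = 200 * X[:, 0] + 50 * X[:, 1] - 30 * X[:, 2] + rng.normal(0, 5, size=n)

# Chronological split: train on the first 80% of rows, evaluate on the rest,
# so the model is never scored on data that precedes its training window.
split = int(n * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

mae = mean_absolute_error(y_test, pred)
r2 = r2_score(y_test, pred)
print(f"MAE: {mae:.2f}, R^2: {r2:.3f}")
```

XGBoost's `XGBRegressor` follows the same `fit`/`predict` interface, so it can be swapped in for comparison. Regression quality is usually reported with error metrics such as MAE or R^2.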

Deployment

  • Developed an interactive web dashboard using Streamlit.
  • Deployed the application with real-time data integration.
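
A dashboard page of this kind can be sketched in a few lines of Streamlit. The data layout (`city`, `timestamp`, `aqi` columns), the helper names, and the sample values below are hypothetical; the real application's data source and layout are not shown here.

```python
import pandas as pd

def city_series(df: pd.DataFrame, city: str) -> pd.Series:
    """Return the AQI time series for one city, indexed by timestamp."""
    subset = df[df["city"] == city].sort_values("timestamp")
    return subset.set_index("timestamp")["aqi"]

def render(df: pd.DataFrame) -> None:
    """Draw the page; intended to be called from a script run with `streamlit run`."""
    # Imported here so city_series can be used and tested without Streamlit.
    import streamlit as st

    st.title("AQI Dashboard")
    city = st.selectbox("City", sorted(df["city"].unique()))
    series = city_series(df, city)
    st.metric("Latest AQI", int(series.iloc[-1]))
    st.line_chart(series)

# Hypothetical sample data standing in for the real-time feed.
sample = pd.DataFrame({
    "city": ["A", "B", "A", "B"],
    "timestamp": pd.to_datetime(
        ["2024-01-02", "2024-01-01", "2024-01-01", "2024-01-02"]
    ),
    "aqi": [150, 80, 120, 95],
})
```

To try it, add a call to `render(sample)` at the bottom of the script and start it with `streamlit run`, replacing `sample` with the live data frame.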

Results & Insights

  • Achieved 87% accuracy in AQI prediction.
  • Scaled the pipeline to cover data for 203 cities.
  • Real-time dashboard provides actionable insights for environmental monitoring.