This project uses the Algerian Forest Fires dataset from the UCI Machine Learning Repository to predict fire risk, expressed as the Fire Weather Index (FWI), using linear regression. The dataset includes temperature, humidity, wind, rain, and FWI system components recorded across two Algerian regions. The project covers EDA, feature engineering, model training, and evaluation using MAE, RMSE, and R².
The goal is to develop and evaluate regression models that can estimate the Fire Weather Index (FWI) using climate-related and calculated features. This includes:
- Data preprocessing and cleaning
- Feature distribution analysis and outlier handling
- Correlation and multicollinearity checks
- Application of Linear Regression and its regularized variants (Ridge, Lasso)
- Cross-validation and hyperparameter tuning
- Model evaluation using multiple metrics
- Saving the best model using pickle for future inference
The dataset includes daily weather observations and FWI components from two regions in Algeria (Bejaia and Sidi Bel-abbes), collected between June and September 2012. It contains 244 instances and 12 attributes, including temperature, wind speed, humidity, and indices from the Canadian FWI system.
- Python 3.8+
- pandas
- numpy
- scikit-learn
- matplotlib
- seaborn
- statsmodels
To install dependencies:
    pip install -r requirements.txt

- Applied log transformation to skewed features
- Analyzed and preserved outliers for interpretability
- Scaled all numerical features using standardization
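
The preprocessing steps above could look roughly like the following. This is a minimal sketch on synthetic data, not the project's actual code: the column names, the choice of which columns count as skewed, and the use of `np.log1p` (log of 1 + x, which stays finite at zero) are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; column names and distributions are assumptions.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Temperature": rng.normal(32, 4, 200),
    "Rain": rng.exponential(1.0, 200),   # right-skewed
    "DMC": rng.exponential(12.0, 200),   # right-skewed
})

# Assumed list of skewed columns; log1p keeps zero values finite.
skewed_cols = ["Rain", "DMC"]
df[skewed_cols] = np.log1p(df[skewed_cols])

# Standardize every numerical column to zero mean and unit variance.
scaler = StandardScaler()
scaled = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)
```

When a train/test split is involved, the scaler is usually fitted on the training portion only and then applied to the test portion, so that test statistics do not leak into training.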
- Plotted histograms and boxplots for numerical columns
- Used correlation matrix to detect feature interactions
- Removed highly correlated features (threshold > 0.90) for Linear Regression
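
One common way to apply a 0.90 correlation threshold is sketched below. The helper name `drop_correlated`, the use of absolute Pearson correlation, and the toy columns are assumptions for illustration; the project may have selected features differently.

```python
import numpy as np
import pandas as pd

def drop_correlated(df, threshold=0.90):
    """Drop the later column of each pair whose absolute correlation exceeds threshold."""
    corr = df.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop), to_drop

# Toy data: B is a noisy copy of A, C is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=300)
features = pd.DataFrame({
    "A": a,
    "B": a + rng.normal(scale=0.05, size=300),
    "C": rng.normal(size=300),
})

reduced, dropped = drop_correlated(features)
```

Here `B` is removed because it is almost a duplicate of `A`, while `C` is kept.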
- Trained baseline Linear Regression
- Applied Ridge and Lasso with:
  - `RidgeCV` and `LassoCV` for quick alpha selection
  - `GridSearchCV` for advanced hyperparameter tuning
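
The two tuning routes can be sketched as follows. The data comes from `make_regression`, and the alpha grid, fold count, and scoring choice are assumptions rather than the settings used in this project.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, Ridge, RidgeCV
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic regression problem standing in for the prepared features.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Assumed search grid for the regularization strength.
alphas = np.logspace(-3, 2, 20)

# Quick selection: both estimators pick alpha by internal cross-validation.
ridge_cv = RidgeCV(alphas=alphas).fit(X_train, y_train)
lasso_cv = LassoCV(alphas=alphas, cv=5, max_iter=10000).fit(X_train, y_train)

# Explicit grid search: same grid, but with a chosen scorer and fold count.
grid = GridSearchCV(
    Ridge(),
    param_grid={"alpha": alphas},
    scoring="neg_mean_absolute_error",
    cv=5,
)
grid.fit(X_train, y_train)
best_alpha = grid.best_params_["alpha"]
test_r2 = grid.best_estimator_.score(X_test, y_test)
```

`GridSearchCV` is more verbose than `RidgeCV`, but it makes the scoring metric explicit and extends naturally to tuning more than one parameter.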
- Evaluation metrics used: `MAE`, `RMSE`, `R²`
- Ridge (tuned via `GridSearchCV`) gave the best overall performance
- Included `MAE` as a key metric due to the presence of outliers
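
To illustrate why MAE is useful alongside RMSE when outliers are present, here is a small sketch with made-up numbers (not results from this project). The helper name `regression_report` is hypothetical.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def regression_report(y_true, y_pred):
    """Return MAE, RMSE and R² for one set of predictions."""
    mae = mean_absolute_error(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    r2 = r2_score(y_true, y_pred)
    return {"MAE": mae, "RMSE": rmse, "R2": r2}

# Made-up values: the last point is an outlier with a large error of 10.
y_true = np.array([1.0, 2.0, 3.0, 4.0, 20.0])
y_pred = np.array([1.5, 2.0, 2.5, 4.0, 10.0])

report = regression_report(y_true, y_pred)
# MAE = 2.2, RMSE is about 4.48, R² is about 0.598
```

The single large error roughly doubles RMSE relative to MAE, because RMSE squares each error before averaging.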
- Best model (`Tuned Ridge`) saved as a `.pkl` file using `pickle`
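
Saving and reloading a fitted model with `pickle` could look like the sketch below. The file name `ridge_model.pkl`, the temporary directory, and the toy training data are assumptions; the actual file name in this project is not stated here.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import Ridge

# Toy model standing in for the tuned Ridge estimator.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
model = Ridge(alpha=1.0).fit(X, y)

# Hypothetical output path; written to the temp directory for this example.
model_path = Path(tempfile.gettempdir()) / "ridge_model.pkl"

# Serialize the fitted estimator to disk.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load it back later for inference.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

predictions = restored.predict(X[:5])
```

If the features were standardized before training, the fitted scaler needs to be saved as well so that new inputs are transformed the same way. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.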
Abid, . (2019). Algerian Forest Fires [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5KW4N.