Sales forecasting is one of the most requested data science use cases in business. This tutorial walks through a forecasting project end to end in Python, from data exploration to production deployment.

The complete pipeline

A time series forecasting project always follows the same pipeline: exploration, preprocessing, model selection and training, evaluation, deployment and monitoring.

[Figure: the complete forecasting pipeline, from raw data to production forecast]

80% of forecasting projects fail due to poor data preparation.
Prophet still offers the best efficiency/complexity ratio in 2026.
MAE, RMSE, MAPE: the 3 essential metrics to track.

Step 1: exploration and diagnostics

Visualise your data first. Identify trend, seasonality (annual, weekly, daily), outliers and structural breaks, then test stationarity with the Augmented Dickey-Fuller (ADF) test.
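
A quick numeric check can complement the plots: if sales follow a weekly cycle, the autocorrelation at lag 7 should be strongly positive. The sketch below is a minimal pure-Python illustration, not a full diagnostic; the `autocorrelation` helper and the synthetic `daily_sales` series are assumptions made for the example. For the ADF test itself, statsmodels provides `adfuller` in `statsmodels.tsa.stattools`.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    covariance = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return covariance / variance

# Synthetic daily sales (hypothetical data): weekly cycle plus a mild upward trend.
daily_sales = [
    100 + 0.1 * day + 20 * math.sin(2 * math.pi * day / 7) for day in range(140)
]

acf_weekly = autocorrelation(daily_sales, lag=7)  # same weekday, one week apart
acf_offset = autocorrelation(daily_sales, lag=3)  # roughly half a cycle apart
print(f"lag 7: {acf_weekly:.2f}, lag 3: {acf_offset:.2f}")
```

A high value at lag 7 next to a low or negative value at lag 3 points to weekly seasonality; the same check at lag 365 on a multi-year daily series would probe annual seasonality.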

Step 2: choose the right model

Prophet (Meta): our default recommendation

Prophet is robust, easy to configure and natively handles missing data, holidays and trend changes. It is our starting point for roughly 80% of sales forecasting cases.

LightGBM with temporal features

For series with many explanatory variables (weather, price, promotions), gradient boosting with temporal features (lag variables, rolling averages) often outperforms classical models.
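
The temporal features mentioned above can be sketched as follows. This is a dependency-free illustration under assumed names (`build_temporal_features`, the `sales` list and the chosen lags are all hypothetical); in a real project the same columns are usually built with pandas `shift` and `rolling` before being passed to a gradient boosting model.

```python
def build_temporal_features(sales, lags=(1, 7), window=7):
    """Return one feature row per day that has a full history available.

    Every feature for day t uses only values strictly before t, so no
    information from the target day (or later) leaks into the inputs.
    """
    start = max(max(lags), window)
    rows = []
    for t in range(start, len(sales)):
        row = {"t": t, "target": sales[t]}
        for lag in lags:
            row[f"lag_{lag}"] = sales[t - lag]
        past_window = sales[t - window:t]  # the slice excludes sales[t]
        row[f"rolling_mean_{window}"] = sum(past_window) / window
        rows.append(row)
    return rows

# Hypothetical two weeks of daily unit sales.
sales = [10, 12, 11, 13, 15, 20, 22, 11, 13, 12, 14, 16, 21, 23]
features = build_temporal_features(sales)
first = features[0]
print(first)
```

The first seven days are dropped because they lack a complete history. Explanatory variables such as price or promotions would be added to each row in the same way, provided their values are known at forecast time.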

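Anti-pattern to avoid: never train on future data. A train/test split on a time series must respect chronological order, so that every training observation precedes every test observation; a random shuffle leaks future information into training. Use TimeSeriesSplit from scikit-learn.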
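
As a small illustration of a chronological split, the sketch below runs scikit-learn's `TimeSeriesSplit` on twelve observations. The `monthly_sales` values are invented for the example, and the series is assumed to be sorted by date already.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Twelve hypothetical monthly observations, already sorted by date.
monthly_sales = np.array(
    [120, 135, 128, 150, 160, 155, 170, 182, 175, 190, 205, 198]
)

splitter = TimeSeriesSplit(n_splits=3)
folds = []
for train_idx, test_idx in splitter.split(monthly_sales):
    # Each fold trains on an expanding window and tests on the block after it.
    folds.append((train_idx.tolist(), test_idx.tolist()))
    print(
        f"train: positions {train_idx[0]}-{train_idx[-1]}, "
        f"test: positions {test_idx[0]}-{test_idx[-1]}"
    )
```

In every fold the test positions come strictly after the training positions, which is exactly the property a shuffled split would break.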

Step 3: correct evaluation

Use MAPE to compare products with different sales volumes, MAE for the average absolute error in sales units, and RMSE to penalise large errors more heavily.
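
The three metrics can be written in a few lines each. The sketch below uses only the standard library; the `actual` and `forecast` values are made up for illustration. Note that MAPE divides by the actual value, so it is undefined when an actual is zero, which is common for slow-moving products.

```python
import math

def _check(actual, forecast):
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")

def mae(actual, forecast):
    """Mean absolute error, in the same unit as the series."""
    _check(actual, forecast)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean squared error; squaring gives large misses more weight."""
    _check(actual, forecast)
    squared = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    return math.sqrt(squared / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    _check(actual, forecast)
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100 * ratios / len(actual)

# Hypothetical actuals and forecasts for four periods.
actual = [100, 200, 400, 50]
forecast = [110, 190, 360, 55]

mae_value = mae(actual, forecast)
rmse_value = rmse(actual, forecast)
mape_value = mape(actual, forecast)
print(f"MAE={mae_value:.2f}  RMSE={rmse_value:.2f}  MAPE={mape_value:.2f}%")
```

Here RMSE comes out higher than MAE because a single 40-unit miss dominates the squared errors, while MAPE treats the 5-unit miss on a 50-unit product as seriously as the 40-unit miss on a 400-unit product.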

Production with confidence intervals

Always communicate confidence intervals in production, not just a point forecast. '1,200 units, with a 90% probability of falling between 950 and 1,450' is far more useful than simply '1,200'.
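
One simple way to attach a range to a point forecast is to reuse the errors observed on past forecasts. The sketch below is an empirical approach under a stated assumption (future errors resemble past errors), not the interval method of Prophet or any other library; `empirical_interval` and the residual values are hypothetical.

```python
import statistics

def empirical_interval(point_forecast, residuals, coverage=0.90):
    """Interval around a point forecast from past residuals (actual - forecast).

    Assumes future errors are distributed like past errors. The tail
    probability is rounded to a whole percentile.
    """
    if not 0 < coverage < 1:
        raise ValueError("coverage must be strictly between 0 and 1")
    tail_pct = round((1 - coverage) / 2 * 100)
    if not 1 <= tail_pct <= 49:
        raise ValueError("coverage is too extreme for percentile cut points")
    # 99 cut points: index 0 is the 1st percentile, index 98 the 99th.
    cuts = statistics.quantiles(residuals, n=100, method="inclusive")
    lower = point_forecast + cuts[tail_pct - 1]
    upper = point_forecast + cuts[100 - tail_pct - 1]
    return lower, upper

# Hypothetical residuals (actual minus forecast) from 20 past periods.
past_residuals = [
    -250, -180, -120, -90, -60, -40, -20, -10, 0, 10,
    20, 30, 50, 70, 90, 120, 150, 190, 230, 250,
]

lower, upper = empirical_interval(1200, past_residuals, coverage=0.90)
print(f"1,200 units, 90% interval [{lower:.0f}, {upper:.0f}]")
```

With Prophet, no such helper is needed: the forecast table already carries `yhat_lower` and `yhat_upper` columns, and the interval width is set through the `interval_width` parameter of the model.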

With care,

Sylvie Wendkuni NITIEMA
Founder & Data Scientist · DataSAI