Batch Prediction allows you to efficiently generate predictions for an entire dataset using one of your trained models. Instead of inputting data row by row, you select an existing dataset, and the system processes all records in the background, making it ideal for large-scale scoring tasks.
Use Batch Prediction when you need to score a customer base, evaluate the potential outcome for a list of leads, apply a model to historical data for analysis, or handle any other scenario that requires predictions on many records at once. The results are stored and can be reviewed later.
Automatic summary generation for prediction batches
Similar to Real-Time Predictions, you access the Batch Prediction feature from a specific model's detail page:
This will reveal the components needed for selecting a dataset and managing batch prediction jobs.
When Batch Prediction mode is active, the interface typically includes these components:
This area allows you to choose the input data and start the process:
`dataset.preprocessed`
This section (Ref: `HistoricalBatchPredictions.vue`) tracks the status of your batch jobs:
Opened by clicking a card in the historical lists, this modal (Ref: `BatchPredictionDetails.vue`) provides comprehensive information about a specific batch run.
It includes summary statistics, input data overview, and paginated individual prediction results. (See Viewing Batch Results section below for more details).
Initiating a batch prediction job involves these steps:
Verify you are in the batch prediction view. If necessary, click the Batch Prediction button.
Use the dataset selection component (`DatasetUploadComponent` provided by the parent `PredictionInterface.vue`) to choose the dataset containing the records you want to predict on. The selected dataset's name will appear.
Click the Predict button within the "Batch Prediction" section (Ref: `BatchPrediction.vue`). This sends a request to the backend (e.g., `POST /batch_predict/{dataset.id}`) with the `model_id` to start processing the selected dataset.
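The request described above can be sketched in Python as follows. This is a minimal illustration, not the application's actual client code: the base URL is a placeholder, and sending `model_id` in a JSON body is an assumption, since the text only says the request is sent "with the `model_id`".

```python
import json
import urllib.request


def build_batch_predict_request(base_url: str, dataset_id: int, model_id: int) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts a batch prediction."""
    url = f"{base_url.rstrip('/')}/batch_predict/{dataset_id}"
    # Assumption: model_id travels in a JSON body; the real backend may expect
    # a query parameter or a form field instead.
    body = json.dumps({"model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and IDs, for illustration only.
request = build_batch_predict_request("http://localhost:8000", dataset_id=42, model_id=7)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would actually send it. Building the request separately keeps the sketch runnable without a live backend.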
The batch job runs in the background. You can monitor its progress in the Ongoing Batch Predictions list. Once finished, it will move to the Completed Batch Predictions list.
The `HistoricalBatchPredictions` component provides visibility into your running and completed jobs:
Ongoing Batch Predictions: Shows jobs currently in progress. Their status will typically be "ongoing".
Completed Batch Predictions: Shows finished jobs. Their status will be "completed" or "failed".
The component listens for WebSocket events (`batch_prediction_update` on the `/batch_predictions` namespace) to automatically update the status, average output, and potentially move jobs between lists without requiring a page refresh.
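The list-updating behavior can be illustrated with a small, framework-free sketch. The payload field names (`id`, `status`, `average_output`) are hypothetical, since the actual event schema is not documented here, and the real component is written in Vue rather than Python.

```python
from typing import Any, Dict

Job = Dict[str, Any]


def apply_batch_update(ongoing: Dict[str, Job], completed: Dict[str, Job], event: Job) -> None:
    """Merge one batch_prediction_update payload into the two job lists."""
    batch_id = event["id"]  # hypothetical field name
    # Start from whatever is already known about this job, then overlay the event.
    job = ongoing.pop(batch_id, None) or completed.pop(batch_id, None) or {}
    job.update(event)
    # Finished jobs ("completed" or "failed") belong in the completed list.
    target = completed if job.get("status") in ("completed", "failed") else ongoing
    target[batch_id] = job


ongoing = {"b1": {"id": "b1", "status": "ongoing", "average_output": None}}
completed: Dict[str, Job] = {}

# Simulate the event that arrives when job "b1" finishes.
apply_batch_update(ongoing, completed, {"id": "b1", "status": "completed", "average_output": 0.62})
```

After the call, `ongoing` is empty and `completed` holds job `b1` with its new status and average output, which mirrors how a card moves between the two lists without a page refresh.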
Clicking anywhere on a batch card (ongoing or completed) will trigger the opening of the details modal for that specific `batchId`.
Clicking on a batch card in the historical lists opens a detailed modal view (`BatchPredictionDetails.vue`) fetched via `/batch-predictions/details/{batchPredictionId}`:
Provides a summary of the batch run:
Displays summary statistics (like mean, min, max, count) for the input features used in the batch prediction, provided by the `InputDataSummary` component.
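For readers who want to reproduce these statistics outside the application, the calculation can be sketched as below. This is an illustrative stand-in, not the `InputDataSummary` implementation; the function name and sample columns are made up for the example.

```python
from statistics import mean
from typing import Dict, List


def summarize_features(rows: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Compute count, mean, min, and max for every numeric column in rows."""
    columns: Dict[str, List[float]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return {
        name: {
            "count": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        }
        for name, values in columns.items()
    }


# Two made-up input records.
summary = summarize_features([
    {"age": 30, "income": 40000},
    {"age": 50, "income": 60000},
])
```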
Since batch jobs can process thousands or millions of rows, the individual results are paginated:
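Pagination of this kind can be sketched generically as follows. The page size and the 1-indexed page convention are assumptions made for illustration; the application's actual paging parameters are not specified here.

```python
from math import ceil
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def get_page(items: Sequence[T], page: int, page_size: int) -> List[T]:
    """Return the 1-indexed page of items; pages past the end yield an empty list."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    return list(items[start:start + page_size])


def total_pages(item_count: int, page_size: int) -> int:
    """Number of pages needed to show item_count rows."""
    return ceil(item_count / page_size)


predictions = [f"row-{i}" for i in range(1, 26)]  # 25 placeholder results
last_page = get_page(predictions, page=3, page_size=10)
```

With 25 results and a page size of 10, `total_pages` returns 3 and the third page contains only the final five rows.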
May include a section (`batchPrediction.details`) with supplementary information or logs related to the batch run, displayed in a preformatted block.
Here's a quick reference for some specific fields you'll encounter:
Field | Description |
---|---|
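Status | The current state of the batch job: "ongoing" while it is processing, then "completed" or "failed" once it finishes. |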
Average Output | The arithmetic mean of all numerical prediction outputs in the batch. Primarily useful for regression models. May be 'N/A' if not applicable or during processing. |
Mean Probability | For classification models, this shows the average probability calculated for each class label across all predictions in the batch. Useful for understanding overall model confidence on the dataset. Format: `'ClassName1': 0.75, 'ClassName2': 0.25`. |
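
The two aggregate fields above can be reproduced with a short sketch. It assumes that each classification result is available as a mapping from class label to probability, and it returns `None` for an empty batch as a stand-in for the 'N/A' display; both are assumptions, not documented behavior.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def average_output(outputs: List[float]) -> Optional[float]:
    """Arithmetic mean of numeric predictions, or None when there are none."""
    if not outputs:
        return None  # assumption: mirrors the 'N/A' case
    return sum(outputs) / len(outputs)


def mean_probability(probabilities: List[Dict[str, float]]) -> Dict[str, float]:
    """Average probability per class label across all predictions."""
    totals: Dict[str, float] = defaultdict(float)
    for row in probabilities:
        for label, p in row.items():
            totals[label] += p
    return {label: total / len(probabilities) for label, total in totals.items()}


regression_avg = average_output([10.0, 20.0, 30.0])
class_means = mean_probability([
    {"ClassName1": 1.0, "ClassName2": 0.0},
    {"ClassName1": 0.5, "ClassName2": 0.5},
])
```

For these two sample rows, `class_means` works out to 0.75 for `ClassName1` and 0.25 for `ClassName2`, matching the format shown in the table.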
After running batch predictions, you might explore: