Prediction API

The ML Clever Prediction API provides a powerful way to integrate predictions from your trained models directly into your own applications, services, or automated workflows. Instead of using the platform's UI, you can programmatically send input data and receive prediction results via a simple HTTP POST request.

This enables seamless integration for real-time scoring within web applications, automating decisions in backend processes, enriching data pipelines, or building custom tools that leverage your machine learning models. The API is designed for speed and scalability, utilizing caching mechanisms for optimal performance.

Getting Started: API Keys

To use the Prediction API, you first need an API Key. Each API key is linked to a specific **Model Deployment**. Deploying a model makes it accessible via the API and generates a unique key for authentication.

Obtain Your API Key

API Keys are generated when you create a Model Deployment within the ML Clever platform. You will need to navigate to the model you wish to use and create an active deployment for it.

Learn how to Create Model Deployments

Treat your API keys like passwords – keep them secure and do not expose them in client-side code or public repositories.
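
For example, one way to keep a key out of source code is to read it from an environment variable. A minimal Python sketch (the variable name ML_CLEVER_API_KEY is illustrative, not prescribed by the platform):

import os

# Read the key from the environment instead of hard-coding it.
# "ML_CLEVER_API_KEY" is an illustrative name; use whatever fits
# your secrets-management setup.
api_key = os.environ.get("ML_CLEVER_API_KEY")
if api_key is None:
    raise RuntimeError("ML_CLEVER_API_KEY is not set")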

Making Predictions

Once you have an active deployment and its API key, you can make predictions by sending an HTTP POST request to the API endpoint.

Endpoint Details

Method: POST

URL: https://app.mlclever.com/api/predict (the API key is passed in the request body, as shown below, not in the URL)

Content-Type: application/json

Request Body Structure

The body of your POST request must be a JSON object containing the following fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| api_key | String | Yes | Your unique API key, obtained from an active Model Deployment. |
| input_data | Object | Yes | A JSON object whose keys are the exact feature names the model expects and whose values are the corresponding inputs. Data types must match what the model expects (e.g., strings for text/categorical features, numbers for numerical features). |

Ensure all required features for the deployed model are present in the input_data object. Missing features will result in an error (see Error Responses).

Example Request

Using cURL


curl -X POST https://app.mlclever.com/api/predict \
     -H "Content-Type: application/json" \
     -d '{
           "api_key": "YOUR_API_KEY",
           "input_data": {
             "feature1_name": "value1",
             "feature2_name": 123.45,
             "categorical_feature": "category_A"
           }
         }'

Using Python (requests library)


import requests

api_url = "https://app.mlclever.com/api/predict"
api_key = "YOUR_API_KEY"  # Keep your API key secure!

input_payload = {
    "api_key": api_key,
    "input_data": {
        "feature1_name": "value1",
        "feature2_name": 123.45,
        "categorical_feature": "category_A"
        # Add all required features for your model
    }
}

response = None
try:
    # json= serializes the payload and sets the Content-Type header for us
    response = requests.post(api_url, json=input_payload)
    response.raise_for_status()  # Raises HTTPError for bad responses (4xx or 5xx)

    prediction_data = response.json()
    print("Prediction successful:")
    print(prediction_data)

except requests.exceptions.RequestException as e:
    print(f"API request failed: {e}")
    if response is not None:
        try:
            print(f"Error details: {response.json()}")
        except ValueError:  # body was not valid JSON
            print(f"Error details: {response.text}")

except Exception as e:
    print(f"An unexpected error occurred: {e}")

Understanding the Response

The API will respond with a JSON object indicating success or failure.

Success Response (HTTP Status 200 OK)

If the request is successful, the response body will contain:

| Field | Type | Description |
| --- | --- | --- |
| predictions | List | A list containing the prediction result(s). For classification, this is typically the predicted class label; for regression, the predicted numerical value (rounded to 2 decimal places if numeric). Multi-output models may return multiple items. |
| confidence_intervals | List (optional) | For classification models, a list of probabilities corresponding to each class label, indicating the model's confidence. May be absent for other model types. |

{
  "predictions": [
    "PredictedClassA",
    0.85
  ],
  "confidence_intervals": [
    0.95,
    0.85,
    0.10
  ]
}
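
A minimal sketch of reading these fields in Python, reusing the response object from the example above (since confidence_intervals may be absent for non-classification models, it is read with .get()):

result = response.json()

predictions = result["predictions"]               # always present on success
confidences = result.get("confidence_intervals")  # may be absent

print(f"Predictions: {predictions}")
if confidences is not None:
    print(f"Highest class probability: {max(confidences)}")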

Error Responses (HTTP Status 4xx or 5xx)

If an error occurs, the response body will contain an error field describing the issue. Sometimes additional fields like missing_columns are included.


{
  "error": "Prediction failed due to missing input data",
  "missing_columns": [
    "feature3_name",
    "feature4_name"
  ]
}

{
  "error": "Invalid API key"
}

Common HTTP Status Codes:

  • 200 OK: Success.
  • 400 Bad Request: Invalid request format, missing required fields (like input_data), invalid input data types, or missing feature columns in input_data. The `error` message provides specifics.
  • 401 Unauthorized: Invalid or missing `api_key`.
  • 403 Forbidden: The API key is valid, but the associated Model Deployment is not `active`.
  • 404 Not Found: The model associated with the API key could not be found (rare, indicates an issue with the deployment).
  • 429 Too Many Requests: You have exceeded the rate limits.
  • 500 Internal Server Error: An unexpected error occurred on the server side. Check logs or contact support if persistent.
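
A sketch of dispatching on these status codes, reusing api_url and input_payload from the Python example above. The handling shown for each branch is illustrative, not prescriptive:

import requests

response = requests.post(api_url, json=input_payload)

if response.status_code == 200:
    print(response.json()["predictions"])
elif response.status_code == 400:
    details = response.json()
    print("Bad request:", details.get("error"))
    print("Missing columns:", details.get("missing_columns", []))
elif response.status_code in (401, 403):
    print("Key or deployment problem:", response.json().get("error"))
elif response.status_code == 429:
    print("Rate limited; retry after a delay")
else:  # 404 or 5xx
    print(f"Server-side issue (HTTP {response.status_code}); retry or contact support")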

Rate Limiting

To ensure fair usage and stability, the Prediction API enforces rate limits based on the source IP address. Exceeding these limits will result in an HTTP 429 Too Many Requests error.

  • Per Minute: 1,000,000 requests
  • Per Day: 10,000,000 requests

If you anticipate needing higher limits, please contact support.
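
If you do hit 429 responses, a common client-side remedy is to retry with exponential backoff. A minimal sketch (the retry count and delays are illustrative):

import time
import requests

def predict_with_backoff(api_url, payload, max_retries=5):
    """Retry on HTTP 429 with exponential backoff; any other response is returned as-is."""
    delay = 1.0
    for _ in range(max_retries):
        response = requests.post(api_url, json=payload)
        if response.status_code != 429:
            return response
        time.sleep(delay)  # wait before retrying
        delay *= 2         # double the delay each attempt
    raise RuntimeError("Still rate-limited after retries")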

Performance & Caching

The API backend utilizes caching (via Redis) to optimize performance for frequently accessed models and their associated metadata (like required columns).

When a prediction request comes in:

  • The system first checks the cache for the required model object.
  • If found, the cached version is used directly, significantly reducing load times.
  • If not found, the model is loaded from the primary database and then **asynchronously** stored in the cache for future requests.
  • Similarly, information about required input columns is also cached.

This caching is handled automatically on the server-side to ensure low latency for your prediction requests.
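
For context only, the flow described above resembles the standard cache-aside pattern. A hypothetical sketch of that pattern (this is not ML Clever's actual server code; cache and db stand in for a Redis client and the primary database):

import threading

def load_model(model_id, cache, db):
    """Illustrative cache-aside lookup."""
    model = cache.get(model_id)
    if model is not None:
        return model                 # cache hit: use it directly
    model = db.load_model(model_id)  # cache miss: load from the primary database
    # Store asynchronously so the response is not delayed by the cache write.
    threading.Thread(target=cache.set, args=(model_id, model)).start()
    return model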

Use Cases

The Prediction API unlocks numerous possibilities for leveraging your models:

Real-time Scoring in Applications

Integrate predictions directly into user-facing web or mobile apps (e.g., predict loan eligibility, recommend products, estimate delivery times based on user input).

Backend Automation

Automate decisions within backend systems (e.g., flag fraudulent transactions, route customer support tickets, prioritize leads based on predicted value).

Data Enrichment Pipelines

Add model predictions as new features to datasets in your data warehouse or ETL processes.
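
For example, a minimal sketch of enriching a list of records with predictions (API_URL follows the endpoint details above; each record's keys must match the model's feature names):

import requests

API_URL = "https://app.mlclever.com/api/predict"
API_KEY = "YOUR_API_KEY"

def enrich(records):
    """Append the model's prediction to each record. Illustrative sketch."""
    for record in records:
        response = requests.post(
            API_URL,
            json={"api_key": API_KEY, "input_data": record},
        )
        response.raise_for_status()
        # Take the first item of the predictions list (single-output model assumed).
        record["prediction"] = response.json()["predictions"][0]
    return records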

Programmatic A/B Testing

Direct traffic to different deployed model versions via their respective API keys to compare performance in live environments.

Building Custom Tools

Create internal dashboards or tools that allow non-technical users to get predictions without directly accessing the ML platform.

Best Practices & Considerations

Secure Your API Keys

Treat API keys as sensitive credentials. Store them securely (e.g., environment variables, secrets management systems). Do not embed them directly in source code, especially client-side applications.

Implement Robust Error Handling

Your application should gracefully handle potential API errors (4xx, 5xx status codes, network issues). Check the response status code and the `error` message in the JSON body. Implement retry logic with backoff for transient errors (like 500 or network timeouts).

Validate Input Data

Ensure the input_data object sent to the API contains all required features with the correct names and data types expected by the model. Mismatches will lead to 400 errors.
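
A simple client-side pre-flight check can catch missing features before the request is sent. A sketch, where REQUIRED_FEATURES is whatever your deployed model expects (the names below are illustrative):

REQUIRED_FEATURES = {"feature1_name", "feature2_name", "categorical_feature"}

def validate_input(input_data):
    """Raise early if required features are missing, instead of waiting for a 400."""
    missing = REQUIRED_FEATURES - set(input_data)
    if missing:
        raise ValueError(f"Missing required features: {sorted(missing)}")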

Be Mindful of Rate Limits

Design your application to stay within the specified rate limits. If processing large volumes, consider batching requests on your end or using the platform's Batch Prediction feature if applicable.

Related Concepts

To effectively use the API, understand these related areas:

Model Deployment

Learn how to deploy your trained models to make them accessible via the API and generate API keys.

Manage Deployments

Real-Time Predictions

Explore the UI-based method for making single predictions interactively.

Use the Real-Time UI

Batch Predictions

Learn how to process entire datasets for predictions using the platform's UI.

Explore Batch Processing
