Model Inferences

Inferences are how you request model predictions of annotations on your source files.

All model inference requests are asynchronous, meaning you must make the request and then poll for the status.
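
The request-then-poll pattern can be sketched as a small helper. This is a minimal illustration, not part of the API: `poll_until_done`, its parameters, and the 5-second default interval are hypothetical, and the terminal statuses `Finished` and `Errored` are taken from the Python example on this page. The `fetch_status` argument is any callable that returns the decoded status response as a dictionary.

```python
import time
from typing import Any, Callable, Dict

# Statuses treated as "done"; taken from the example on this page.
TERMINAL_STATUSES = {"Finished", "Errored"}


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    timeout_seconds: float = 1800.0,
    interval_seconds: float = 5.0,  # assumed interval, not an API requirement
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status repeatedly until the job reaches a terminal status.

    Raises TimeoutError if no terminal status is seen before the deadline.
    """
    deadline = clock() + timeout_seconds
    while True:
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        if clock() >= deadline:
            raise TimeoutError("inference job did not finish in time")
        # Pause between polls so the status endpoint is not hammered.
        sleep(interval_seconds)
```

In practice `fetch_status` would wrap a GET request to the status endpoint; passing it in as a callable keeps the loop easy to test without network access.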

Request Inference


Creates a model inference job to be run on one or more source files.
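
As a rough sketch, the body for this request could be assembled with a small helper. The field names (`groupName`, `projectIdentifier`, `sourceIds`, `modelIdentifier`, `outputLayerIdentifier`) are taken from the Python example at the end of this page. The helper name `build_inference_body` and its client-side checks are hypothetical; the integer check in particular is an assumption based on the example IDs.

```python
from typing import Any, Dict, Iterable


def build_inference_body(
    group_name: str,
    project_identifier: str,
    source_ids: Iterable[int],
    model_identifier: str,
    output_layer_identifier: str,
) -> Dict[str, Any]:
    """Return a dictionary suitable for sending as the JSON request body."""
    ids = list(source_ids)
    # A job runs on one or more source files, so reject an empty list early.
    if not ids:
        raise ValueError("at least one source ID is required")
    # Assumption: source IDs are integers, as in the examples on this page.
    if not all(isinstance(i, int) and not isinstance(i, bool) for i in ids):
        raise TypeError("source IDs must be integers")
    return {
        "groupName": group_name,
        "projectIdentifier": project_identifier,
        "sourceIds": ids,
        "modelIdentifier": model_identifier,
        "outputLayerIdentifier": output_layer_identifier,
    }
```

Validating before sending avoids a round trip for requests that cannot succeed.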


Request Body

    {
      "inferenceJobId": 12,
      "status": "Queued",
      "projectName": "Sample Project",
      "projectId": 1,
      "outputLayerName": "Gold Set",
      "outputLayerId": 12,
      "sourceIds": [3240, 4414]
    }

Request Inference Status


Returns the status of the inference job.
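
A status response like the one shown in this section can be parsed into a small typed record before acting on it. This is only a sketch: `InferenceStatus` and `parse_status` are hypothetical names, the fields mirror the sample response, and `Queued`, `Finished`, and `Errored` are the only status values that appear on this page, so other values may exist.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class InferenceStatus:
    inference_job_id: int
    status: str
    project_id: int
    source_ids: List[int]

    @property
    def is_terminal(self) -> bool:
        # Statuses that end polling in the example on this page.
        return self.status in {"Finished", "Errored"}

    @property
    def succeeded(self) -> bool:
        return self.status == "Finished"


def parse_status(payload: Dict[str, Any]) -> InferenceStatus:
    """Convert a decoded status response into an InferenceStatus record."""
    return InferenceStatus(
        inference_job_id=int(payload["inferenceJobId"]),
        status=str(payload["status"]),
        project_id=int(payload["projectId"]),
        source_ids=list(payload["sourceIds"]),
    )
```

Separating `is_terminal` from `succeeded` lets calling code stop polling on `Errored` without reporting the job as successful.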

Path Parameters


    {
      "inferenceJobId": 12,
      "status": "Queued",
      "projectId": 1,
      "sourceIds": [3240, 4414]
    }

This code shows how to run a specific model on two sources and poll the job status until the inference is complete, after which the results can be retrieved.

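import requests
import time

# Fields for the inference request; replace the values with your own.
inferenceBody = {
  'groupName': 'Company Name',
  'projectIdentifier': 'My Project',
  'sourceIds': [4024, 5853],
  'modelIdentifier': 'Staple + Classify Documents',
  'outputLayerIdentifier': 'Gold Set'
}

# ANNO_LAB_API_KEY must already be defined and hold your API key.
headers = {
  'Authorization': 'Api-Key '+ANNO_LAB_API_KEY,
}

# Set this to the Request Inference endpoint URL.
url = ''

response = requests.post(url, headers=headers, json=inferenceBody)

# Put the status endpoint's base URL in the empty string. The job ID is an
# integer, so it is converted to a string before being appended.
get_url = ''+str(response.json()['inferenceJobId'])
maximum_timeout_seconds = 1800
time_taken = 0
inference_is_finished = False

start_time = time.time()
while not inference_is_finished and time_taken < maximum_timeout_seconds:
  status_response = requests.get(get_url, headers=headers).json()
  if status_response['status'] in ['Finished', 'Errored']:
    # Report the actual terminal status rather than always "Finished".
    print("Inference " + status_response['status'])
    inference_is_finished = True
  else:
    # Wait between polls instead of busy-looping (5 s is an arbitrary choice).
    time.sleep(5)
  time_taken = time.time() - start_time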
