# Source Files

## Retrieve source information

<mark style="color:blue;">`GET`</mark> `https://api.annolab.ai/v1/source/{source_id}`

Returns basic source information, including a signed URL to download the original file and its tags.

#### Path Parameters

| Name                                         | Type | Description      |
| -------------------------------------------- | ---- | ---------------- |
| source\_id<mark style="color:red;">\*</mark> | Int  | ID of the source |

#### Headers

| Name                                            | Type   | Description                                                                              |
| ----------------------------------------------- | ------ | ---------------------------------------------------------------------------------------- |
| Authorization<mark style="color:red;">\*</mark> | String | <p>Your API key<br><code>{"Authorization": "Api-Key XXXXXXX-XXXXXXX-XXXXXXX"}</code></p> |

{% tabs %}
{% tab title="200: OK" %}

Example response:

```json
{
    "id": 2,
    "projectName": "My Project",
    "projectId": 1,
    "directoryName": "Uploads",
    "directoryId": 1,
    "name": "REGISTRATION.pdf",
    "sourceName": "REGISTRATION.pdf",
    "type": "pdf",
    "text": "Example PDF Text",
    "url": "https://download-example-url.pdf",
    "createdAt": "2023-05-22T20:32:02.633Z",
    "tags": [
        {
            "domainEntityId": 1,
            "typeName": "Airframe Inventory",
            "attributes": [
                {
                    "name": "Make",
                    "value": "CESSNA"
                },
                {
                    "name": "Model",
                    "value": "421C"
                },
                {
                    "name": "Serial Number",
                    "value": "421C-5837"
                }
            ],
            "createdBy": {
                "id": 58473,
                "email": "testuser@gmail.com",
                "username": "testuser"
            }
        }
    ]
}
```

{% endtab %}
{% endtabs %}
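
As an illustration, the sketch below shows one way to call this endpoint and work with the response. It uses Python's standard-library `urllib` rather than `requests` so that it has no dependencies. The function names (`build_source_request`, `get_source`, `tag_attributes`, `download_original`) are hypothetical helpers, not part of any AnnoLab client. It assumes the response has the shape of the example response, and that the signed `url` can be fetched without an `Authorization` header.

```python
import json
import shutil
import urllib.request

API_BASE = 'https://api.annolab.ai'


def build_source_request(source_id, api_key):
    """Build the GET request for /v1/source/{source_id} without sending it."""
    url = f'{API_BASE}/v1/source/{int(source_id)}'
    return urllib.request.Request(url, headers={'Authorization': f'Api-Key {api_key}'})


def get_source(source_id, api_key, timeout=30):
    """Send the request and decode the JSON body (urlopen raises HTTPError on 4xx/5xx)."""
    request = build_source_request(source_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def tag_attributes(source):
    """Return one (typeName, {attribute name: value}) pair per tag on the source."""
    return [
        (tag['typeName'], {a['name']: a['value'] for a in tag.get('attributes', [])})
        for tag in source.get('tags', [])
    ]


def download_original(source, dest_path, timeout=60):
    """Stream the file behind the signed `url` field to dest_path.

    Assumption: the signed URL needs no extra headers.
    """
    with urllib.request.urlopen(source['url'], timeout=timeout) as resp, \
            open(dest_path, 'wb') as out:
        shutil.copyfileobj(resp, out)
```

A typical call would be `source = get_source(2, ANNO_LAB_API_KEY)`, followed by `tag_attributes(source)` to inspect the tags or `download_original(source, 'REGISTRATION.pdf')` to save the original file.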

## Upload a PDF

<mark style="color:green;">`POST`</mark> `https://api.annolab.ai/v1/source/upload-pdf`

Uploads a PDF and applies the specified OCR method. Optionally invokes a workflow of AI models after the upload.

#### Headers

| Name                                            | Type   | Description                                                                                                                                        |
| ----------------------------------------------- | ------ | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| Authorization<mark style="color:red;">\*</mark> | String | <p>Your API key. Uploading a PDF requires a key with "Write" permissions.<br><code>{"Authorization": "Api-Key XXXXXXX-XXXXXXX-XXXXXXX"}</code></p> |

#### Request Body

| Name                                                | Type            | Description                                                                                                                                        |
| --------------------------------------------------- | --------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| projectIdentifier<mark style="color:red;">\*</mark> | string\|number  | ID or name of the project where the file will reside                                                                                               |
| directoryIdentifier                                 | string          | Name of the directory where the file will reside                                                                                                   |
| sourceIdentifier<mark style="color:red;">\*</mark>  | string          | Name of the source that will be created                                                                                                            |
| ocrProvider                                         | string          | Only used if processMode is "OCR". Valid values are "textract", "textract\_plus", and "gcv". "textract\_plus" is recommended for highest quality   |
| preprocessor                                        | string          | Valid options are "faa" and None.                                                                                                                  |
| groupName<mark style="color:red;">\*</mark>         | string          | Name of the group that owns the project                                                                                                            |
| tags                                                | CanonicalTag\[] | Array of [CanonicalTag](/annotations-and-relations/canonical-tags.md#canonicaltag-object) objects                                                  |
| workflow                                            | string          | Workflow (a package of AI models) invoked immediately after upload. Recommended: "FAA\_CD\_WITH\_VLM" or "FAA\_CD\_WITH\_TAGGING"                  |
| processMode                                         | string          | Use "OCR" if the PDF has no embedded text. Use "EXTRACT" if the PDF already has embedded text                                                      |

{% tabs %}
{% tab title="Python" %}

```python
import requests

ANNO_LAB_API_KEY = 'XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX'

url_base = 'https://api.annolab.ai'
url = url_base+'/v1/source/upload-pdf'

# Local path of the PDF to upload
input_pdf = '/path/to/TEST-REGISTRATION.PDF'

headers = {
  'Authorization': 'Api-Key '+ANNO_LAB_API_KEY,
}

# Form fields sent alongside the file
requestBody = {
  'groupName': 'AnnoLab',
  'projectIdentifier': 'title-demo',
  'directoryIdentifier': 'testing',
  'sourceIdentifier': 'TEST-REGISTRATION.PDF',
  'preprocessor': 'faa',
  'processMode': 'OCR',
  'ocrProvider': 'textract_plus',
  'workflow': 'FAA_CD'
}

# Open the file in a context manager so the handle is closed after the request
with open(input_pdf, 'rb') as pdf_file:
  fileToUpload = {
    'file': ('TEST-REGISTRATION.PDF', pdf_file, 'application/pdf')
  }
  response = requests.post(url, headers=headers, data=requestBody, files=fileToUpload)

print(response.json())
```

{% endtab %}
{% endtabs %}
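
The rules in the request body table can also be checked on the client before the upload is sent. The following is a minimal sketch under stated assumptions: `build_upload_body` is a hypothetical helper, and the server's own validation may differ. In particular, dropping `ocrProvider` when `processMode` is not "OCR" is a choice made here because the table says the field is only used in OCR mode. The accepted values for `workflow` are not checked.

```python
VALID_OCR_PROVIDERS = {'textract', 'textract_plus', 'gcv'}
VALID_PROCESS_MODES = {'OCR', 'EXTRACT'}
REQUIRED_FIELDS = ('projectIdentifier', 'sourceIdentifier', 'groupName')


def build_upload_body(**fields):
    """Validate form fields for /v1/source/upload-pdf and return a clean dict."""
    # Leave out fields passed as None so they are not sent at all
    body = {name: value for name, value in fields.items() if value is not None}

    missing = [name for name in REQUIRED_FIELDS if name not in body]
    if missing:
        raise ValueError(f'missing required field(s): {", ".join(missing)}')

    mode = body.get('processMode')
    if mode is not None and mode not in VALID_PROCESS_MODES:
        raise ValueError(f'processMode must be one of {sorted(VALID_PROCESS_MODES)}')

    if 'preprocessor' in body and body['preprocessor'] != 'faa':
        raise ValueError('preprocessor must be "faa" or omitted')

    provider = body.get('ocrProvider')
    if provider is not None:
        if provider not in VALID_OCR_PROVIDERS:
            raise ValueError(f'ocrProvider must be one of {sorted(VALID_OCR_PROVIDERS)}')
        if mode != 'OCR':
            # Assumption: ocrProvider is ignored outside OCR mode, so do not send it
            del body['ocrProvider']

    return body
```

The returned dictionary can be passed as the `data` argument of `requests.post`, in place of a hand-written `requestBody`.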

## Create source text

<mark style="color:green;">`POST`</mark> `https://api.annolab.ai/v1/source/create-text`

Create a new text file source within a directory.

#### Headers

| Name                                            | Type   | Description                                                                                                                                          |
| ----------------------------------------------- | ------ | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| Authorization<mark style="color:red;">\*</mark> | string | <p>Your API key. Creating a source requires a key with "Write" permissions.<br><code>{"Authorization": "Api-Key XXXXXXX-XXXXXXX-XXXXXXX"}</code></p> |

#### Request Body

| Name                | Type   | Description                                                                                      |
| ------------------- | ------ | ------------------------------------------------------------------------------------------------ |
| projectIdentifier   | string | Identifier for the project that will contain the source file. Either the id or the unique name   |
| directoryIdentifier | string | Identifier for the directory that will contain the source file. Either the id or the unique name |
| sourceName          | string | Name of the file you wish to create                                                              |
| text                | string | Text that exists within the file                                                                 |

{% tabs %}
{% tab title="201 Source text was successfully created" %}

```json
{
    "sourceName": "athens.txt",
    "directoryName": "Wikipedia Subset",
    "directoryId": 12,
    "projectName": "New NER Project",
    "projectId": 22,
    "id": 145
}
```

{% endtab %}
{% endtabs %}

The following code creates a new text file source:

{% tabs %}
{% tab title="Python" %}

```python
import requests

ANNO_LAB_API_KEY = 'XXXXXXX-XXXXXXX-XXXXXXX-XXXXXXX'

source = {
  'projectIdentifier': 'New NER Project',
  'directoryIdentifier': 'Wikipedia Subset',
  'sourceName': 'athens.txt',
  'text': 'Athens (Greek: Αθήνα, Athína), is the capital city of Greece with a metropolitan population of 3.7 million inhabitants.'
}

headers = {
  'Authorization': 'Api-Key '+ANNO_LAB_API_KEY,
}

url = 'https://api.annolab.ai/v1/source/create-text'

response = requests.post(url, headers=headers, json=source)

print(response.json())
```

{% endtab %}
{% endtabs %}
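
The `201` response can be turned into a typed object before it is used further. The sketch below is illustrative only: `CreatedSource` and `parse_created_source` are hypothetical names, the field list is taken from the example `201` response, and treating every other status code as a failure is an assumption, since the error responses are not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreatedSource:
    """Fields of the example 201 response from /v1/source/create-text."""
    id: int
    source_name: str
    directory_id: int
    directory_name: str
    project_id: int
    project_name: str

    @property
    def detail_url(self):
        # URL of the "Retrieve source information" endpoint for this source
        return f'https://api.annolab.ai/v1/source/{self.id}'


def parse_created_source(status_code, payload):
    """Return a CreatedSource for a 201 response, otherwise raise RuntimeError."""
    if status_code != 201:
        # Assumption: any status other than 201 means the source was not created
        raise RuntimeError(f'create-text failed with HTTP {status_code}: {payload!r}')
    return CreatedSource(
        id=payload['id'],
        source_name=payload['sourceName'],
        directory_id=payload['directoryId'],
        directory_name=payload['directoryName'],
        project_id=payload['projectId'],
        project_name=payload['projectName'],
    )
```

With the `requests` example, this would be called as `created = parse_created_source(response.status_code, response.json())`, and `created.detail_url` then points at the retrieval endpoint for the new source.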


