Discover the OVH Prescience APIs

Date: 26.09.2018

Objective

Prescience is a machine learning tool. It exposes several APIs that let you automate a wide range of actions.

This guide is a detailed introduction to those APIs and will show you how to manage your own OVH Prescience platform.

| API | URL | Description |
| --- | --- | --- |
| Prescience API | https://prescience-api.ai.ovh.net | Manages Prescience “sources”, “datasets” and “models”. |
| Prescience Serving | https://prescience-serving.ai.ovh.net | Evaluates a model that was generated by Prescience. |

Authentication

Using Prescience requires an authentication token.

Here is an example of an API call:

curl -X GET "https://prescience-api.ai.ovh.net/project" -H "Authorization: Bearer ${TOKEN}"
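
The same call can be sketched in Python with the standard library only. The helper name `build_request` and the `PRESCIENCE_TOKEN` environment variable are assumptions made for illustration; the request is built here but not sent.

```python
import os
import urllib.request

API_URL = "https://prescience-api.ai.ovh.net"

def build_request(path, token, method="GET"):
    """Build (but do not send) an authenticated request to the Prescience API."""
    return urllib.request.Request(
        API_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )

# PRESCIENCE_TOKEN is a hypothetical environment variable name.
request = build_request("/project", os.environ.get("PRESCIENCE_TOKEN", "my-token"))
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```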

OVH Prescience API

Sources

A “source” object is the result of a parsing task. The object returned by the API contains the following fields:

| Field | Description | Type | Sortable | Filterable |
| --- | --- | --- | --- | --- |
| source_id | Source identifier | String | Yes | No |
| input_url | Internal URL of the file before parsing | String | No | No |
| source_url | Internal URL of the file resulting from parsing | String | No | No |
| input_type | Type of the source file | String | Yes | No |
| headers | Whether the file before parsing contains a header row | Boolean | Yes | No |
| separator | Separator used in the file before parsing, if it is a CSV file | String | No | No |
| diagram | JSON string representing the diagram | String | No | No |
| status | Source status | Status | Yes | No |
| last_update | Date of the last update | Timestamp | Yes | No |
| created_at | Creation date | Timestamp | Yes | No |
| total_step | Total number of steps in the parsing process | Integer | No | No |
| current_step | Current step in the parsing process | Integer | No | No |
| current_step_description | Description of the current step in the parsing process | String | No | No |

List of sources:

GET https://prescience-api.ai.ovh.net/source

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| Page | Integer | Query | No | 1 | Page number | 2 |
| Size | Integer | Query | No | 100 | Number of items per page | 50 |
| Sort_column | String | Query | No | created_at | Field by which results are sorted | source_id |
| Sort_direction | String | Query | No | | Direction in which results are sorted | |
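
As a minimal sketch, a listing URL with pagination parameters could be assembled as follows. The helper name `list_url` is hypothetical, and the parameter names are copied as they are written in this guide; the live API may expect a different casing.

```python
from urllib.parse import urlencode

API_URL = "https://prescience-api.ai.ovh.net"

def list_url(resource, **params):
    """Return the listing URL for a resource, skipping parameters left as None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{API_URL}/{resource}"
    return f"{url}?{query}" if query else url

# Parameter names are taken from the table as written (casing is an assumption).
print(list_url("source", Page=2, Size=50, Sort_column="source_id"))
```

Leaving a parameter out (or passing `None`) lets the API apply its documented default.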

Source retrieval:

GET https://prescience-api.ai.ovh.net/source/{id_source}

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| id_source | String | Path | Yes | | Source identifier | my_source |

Source deletion:

DELETE https://prescience-api.ai.ovh.net/source/{id_source}

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| id_source | String | Path | Yes | | Source identifier | my_source |

Datasets

A “dataset” object is the result of a “preprocessing” task. The object returned by the API contains the following fields:

| Field | Description | Type | Sortable | Filterable |
| --- | --- | --- | --- | --- |
| dataset_id | Dataset identifier | String | Yes | Yes |
| source | “Source” object that generated the dataset | Source | No | Yes |
| dataset_url | Internal URL of the file resulting from the preprocess | String | No | No |
| transformation_url | Internal URL of the transformation PMML file | String | No | No |
| label_id | Identifier of the “label” column | String | Yes | No |
| problem_type | Type of machine learning problem (“Classification” / “Regression”) | String | Yes | No |
| nb_fold | Number of folds created during the preprocess | Integer | Yes | No |
| selected_columns | List of columns chosen in the source | String[] | No | No |
| diagram | JSON string representing the diagram | String | No | No |
| status | Dataset status | Status | Yes | No |
| last_update | Date of the last update | Timestamp | Yes | No |
| created_at | Creation date | Timestamp | Yes | No |
| total_step | Total number of steps in the preprocess | Integer | No | No |
| current_step | Current step of the preprocess operation | Integer | No | No |
| current_step_description | Description of the current step in the preprocess operation | String | No | No |

List of datasets:

GET https://prescience-api.ai.ovh.net/dataset/

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| Page | Integer | Query | No | 1 | Page number | 2 |
| Size | Integer | Query | No | 100 | Number of items per page | 50 |
| Sort_column | String | Query | No | created_at | Field by which results are sorted | dataset_id |
| Sort_direction | String | Query | No | | Direction in which results are sorted | |
| Dataset_id | String | Query | No | | Filter on the dataset name (LIKE search) | dataset |
| Source_id | String | Query | No | | Filter on the name of the dataset's source (LIKE search) | source |

Dataset retrieval:

GET https://prescience-api.ai.ovh.net/dataset/{id_dataset}

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| id_dataset | String | Path | Yes | | Dataset identifier | my_dataset |

Deleting a dataset:

DELETE https://prescience-api.ai.ovh.net/dataset/{id_dataset}

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| id_dataset | String | Path | Yes | | Dataset identifier | my_dataset |

Models

A “model” object is the result of a “train” task. The object returned by the API contains the following fields:

| Field | Description | Type | Sortable | Filterable |
| --- | --- | --- | --- | --- |
| model_id | Model identifier | String | Yes | No |
| dataset | “Dataset” object that generated the model | Dataset | No | Yes |
| label_id | Identifier of the “label” column | String | Yes | No |
| config | “Config” object that generated the model | Config | No | No |
| status | Model status | Status | Yes | No |
| last_update | Date of the last update | Timestamp | Yes | No |
| created_at | Creation date | Timestamp | Yes | No |
| total_step | Total number of steps in the “train” process | Integer | No | No |
| current_step | Current step of the “train” process | Integer | No | No |
| current_step_description | Description of the current step of the “train” process | String | No | No |

The “config” object describes the configuration used to generate the machine learning model.

| Field | Description | Type |
| --- | --- | --- |
| name | Name of the algorithm used | String |
| class_identifier | Internal identifier | String |
| kwargs | Model hyperparameters | Dictionary |

Model list:

GET https://prescience-api.ai.ovh.net/model

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| Page | Integer | Query | No | 1 | Page number | 2 |
| Size | Integer | Query | No | 100 | Number of items per page | 50 |
| Sort_column | String | Query | No | created_at | Field by which results are sorted | model_id |
| Sort_direction | String | Query | No | | Direction in which results are sorted | |
| Dataset_id | String | Query | No | | Filter on the dataset name (LIKE search) | dataset |

Model retrieval:

GET https://prescience-api.ai.ovh.net/model/{id_model}

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| id_model | String | Path | Yes | | Model identifier | my_model |

Deleting a model:

DELETE https://prescience-api.ai.ovh.net/model/{id_model}

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| id_model | String | Path | Yes | | Model identifier | my_model |

Parsing

To create a “source”, you need to launch a parsing task.

POST https://prescience-api.ai.ovh.net/ml/upload/source

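| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| parse.source_id | String | Multipart part "parse" (JSON) | Yes | | Source name | my-source |
| parse.input_type | String | Multipart part "parse" (JSON) | Yes | | File format (CSV or Parquet only) | CSV |
| parse.separator | String | Multipart part "parse" (JSON) | No | , | Separator, in the case of a CSV file | ; |
| files | Files | Multipart parts named input-file-{index} | No | | Files to upload (several may be sent) | input-file-0 |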

For example:

Assuming that the “data-1.csv” and “data-2.csv” CSV files are in the same directory:

parse.json file

{
    "source_id": "my-source",
    "input_type": "csv",
    "separator": ","
}

Query

curl -H "Authorization: Bearer ${TOKEN}" -v \
    -F parse='@parse.json;type=application/json' \
    -F input-file-1=@data-1.csv \
    -F input-file-2=@data-2.csv \
    https://prescience-api.ai.ovh.net/ml/upload/source

The source returned in the response is incomplete: parsing is asynchronous, so the object is filled in as the task progresses.
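
Since the returned object carries `total_step` and `current_step`, progress can be estimated from them. The sketch below assumes both fields are integers once the task has started; the helper name `progress` and the sample values are hypothetical.

```python
def progress(task):
    """Return task progress as a fraction between 0.0 and 1.0."""
    total = task.get("total_step") or 0
    current = task.get("current_step") or 0
    if total <= 0:
        # Steps are not known yet (task not started or field missing).
        return 0.0
    return min(current / total, 1.0)

# Hypothetical response excerpt; only the field names come from this guide.
source = {"source_id": "my-source", "total_step": 4, "current_step": 1}
print(f"{source['source_id']}: {progress(source):.0%}")
```

The same helper applies to datasets and models, which expose the same two fields.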

Preprocess

To create a "dataset", you must first have generated a "source"; you can then launch a preprocess task on that source.

POST https://prescience-api.ai.ovh.net/ml/preprocess/{source_id}

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| source_id | String | Path | Yes | | Name of the source to preprocess | my-source |
| dataset_id | String | Body JSON | Yes | | Name of the future dataset | my-big-dataset |
| label_id | String | Body JSON | Yes | | Identifier of the dataset column to use as the label | my-label |
| nb_fold | Integer | Body JSON | No | 10 | Number of folds to create during the preprocess | 6 |
| problem_type | String | Body JSON | Yes | | Type of machine learning problem (classification / regression) | regression |
| selected_columns | String[] | Body JSON | No | [] | Columns to select for the dataset. By default, all columns are selected | ["column_1", "column_2"] |

For example:

preprocess.json file

{
    "dataset_id": "my-dataset",
    "label_id": "my-label",
    "problem_type": "classification"
}

Query

curl -H "Authorization: Bearer ${TOKEN}" \
     -H "Content-Type:application/json" \
     -X POST https://prescience-api.ai.ovh.net/ml/preprocess/my-source \
     --data-binary "@preprocess.json"

The dataset returned in the response is incomplete: preprocessing is asynchronous, so the object is filled in as the task progresses.
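
The preprocess body can also be generated rather than written by hand. This is a sketch under the assumption that optional fields may simply be omitted so that the documented defaults apply; the helper name `preprocess_body` is hypothetical.

```python
import json

PROBLEM_TYPES = {"classification", "regression"}

def preprocess_body(dataset_id, label_id, problem_type,
                    nb_fold=None, selected_columns=None):
    """Return the JSON body of a preprocess task, omitting unset optional fields."""
    if problem_type.lower() not in PROBLEM_TYPES:
        raise ValueError(f"unknown problem_type: {problem_type!r}")
    body = {
        "dataset_id": dataset_id,
        "label_id": label_id,
        "problem_type": problem_type,
    }
    if nb_fold is not None:
        body["nb_fold"] = nb_fold
    if selected_columns:
        body["selected_columns"] = list(selected_columns)
    return json.dumps(body, indent=4)

print(preprocess_body("my-dataset", "my-label", "classification"))
```

Checking `problem_type` locally catches a typo before a task is submitted.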

Optimisation

Once the dataset has been created, you can launch an optimisation task on it.

POST https://prescience-api.ai.ovh.net/ml/optimize/{dataset_id}

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| dataset_id | String | Path | Yes | | Name of the dataset to optimise | my-big-dataset |
| scoring_metric | String | Body JSON | Yes | | Optimisation metric (regression: mae, mse, R2; classification: accuracy, f1, roc_auc) | roc_auc |
| budget | Integer | Body JSON | No | 6 | Budget allocated to the optimisation | 10 |

For example:

optimize.json file

{
    "scoring_metric": "roc_auc",
    "budget": 6
}

Query

curl -H "Authorization: Bearer ${TOKEN}" \
     -H "Content-Type:application/json" \
     -X POST https://prescience-api.ai.ovh.net/ml/optimize/my-big-dataset \
     --data-binary "@optimize.json"

The optimisation task returns an "Optimization" object. Once the optimisation is complete, you can query the "Evaluation-Result" objects to find the best configuration.
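
Because the valid metrics depend on the problem type, the body can be checked before it is sent. The sketch below compares metric names case-insensitively, which is an assumption, and the helper name `optimize_body` is hypothetical.

```python
# Metrics listed in this guide for each problem type (lowercased for comparison).
METRICS = {
    "regression": {"mae", "mse", "r2"},
    "classification": {"accuracy", "f1", "roc_auc"},
}

def optimize_body(problem_type, scoring_metric, budget=6):
    """Return the body of an optimisation task after checking the metric."""
    allowed = METRICS[problem_type.lower()]
    if scoring_metric.lower() not in allowed:
        raise ValueError(
            f"{scoring_metric!r} is not a {problem_type} metric; "
            f"expected one of {sorted(allowed)}"
        )
    return {"scoring_metric": scoring_metric, "budget": budget}

print(optimize_body("classification", "roc_auc"))
```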

Evaluation Result

An "Evaluation-Result" object is the result of an optimisation task. The object returned by the API contains the following fields:

| Field | Description | Type |
| --- | --- | --- |
| uuid | UUID of the evaluation | String |
| spent_time | Time spent evaluating the configuration | Integer |
| costs | Dictionary of the metrics associated with the configuration | Dict{} |
| config | Tested configuration | Config |
| status | Evaluation status | Status |
| last_update | Date of the last update | Timestamp |
| created_at | Creation date | Timestamp |
| total_step | Total number of steps in the optimisation process | Integer |
| current_step | Current step of the optimisation process | Integer |
| current_step_description | Description of the current step of the optimisation process | String |

Evaluation list:

GET https://prescience-api.ai.ovh.net/evaluation-result

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| Dataset_id | String | Query | Yes | | Dataset whose evaluations are listed | my-big-dataset |
| Page | Integer | Query | No | 1 | Page number | 2 |
| Size | Integer | Query | No | 100 | Number of items per page | 50 |
| Sort_column | String | Query | No | created_at | Field by which results are sorted | source_id |
| Sort_direction | String | Query | No | | Direction in which results are sorted | |
| Status | String | Query | No | | Filter on the status | BUILT |
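
Once the evaluations are retrieved, picking the best one amounts to comparing a metric in each `costs` dictionary. This guide does not say whether a given metric should be maximised or minimised there, so the sketch leaves the direction as a parameter; the helper name and the sample values are hypothetical.

```python
def best_evaluation(results, metric, higher_is_better=True):
    """Return the evaluation with the best costs[metric], or None if none has it."""
    scored = [r for r in results if metric in r.get("costs", {})]
    if not scored:
        return None
    pick = max if higher_is_better else min
    return pick(scored, key=lambda r: r["costs"][metric])

# Hypothetical list of Evaluation-Result objects, reduced to two fields.
results = [
    {"uuid": "a", "costs": {"roc_auc": 0.81}},
    {"uuid": "b", "costs": {"roc_auc": 0.93}},
    {"uuid": "c", "costs": {}},
]
best = best_evaluation(results, "roc_auc")
print(best["uuid"])
```

The `uuid` of the selected evaluation is what the training step expects as `evaluation_uuid`.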

Training

After choosing the best configuration from the list of "Evaluation-Results" we can train a model:

POST https://prescience-api.ai.ovh.net/ml/train

| Parameter | Type | In | Required | Default | Meaning | Example |
| --- | --- | --- | --- | --- | --- | --- |
| model_id | String | Query | Yes | | Name of the future model | my-model |
| evaluation_uuid | String | Query | Yes | | Evaluation-Result identifier | bcaef619-4bf3-4c15-b49f-bc325f98d891 |
| dataset_id | String | Query | No | dataset_id linked to the Evaluation-Result | Only needed when training on a dataset other than the one used by the Evaluation-Result | my-alternative-dataset |

For example:

curl -H "Authorization: Bearer ${TOKEN}" \
     -H "Content-Type:application/json" \
     -X POST "https://prescience-api.ai.ovh.net/ml/train/?model_id=my-model&evaluation_uuid=bcaef619-4bf3-4c15-b49f-bc325f98d891"

The training task returns an incomplete model object: training is asynchronous, so the object is filled in as the task progresses.
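
When the training URL is built in code, letting a library encode the query string avoids quoting mistakes with `&`. This is a minimal sketch: the helper name `train_url` is hypothetical, and the path follows the endpoint as listed (`/ml/train`, without a trailing slash).

```python
from urllib.parse import urlencode

API_URL = "https://prescience-api.ai.ovh.net"

def train_url(model_id, evaluation_uuid, dataset_id=None):
    """Return the training URL; dataset_id is only added when provided."""
    params = {"model_id": model_id, "evaluation_uuid": evaluation_uuid}
    if dataset_id is not None:
        params["dataset_id"] = dataset_id
    return f"{API_URL}/ml/train?{urlencode(params)}"

print(train_url("my-model", "bcaef619-4bf3-4c15-b49f-bc325f98d891"))
```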

OVH Prescience Serving API

Once a model is trained, it can be used to make inferences.

Both APIs have a "model" object. These do not have the same structure. Only the model_id identifier is common to both.

Model description:

POST https://prescience-serving.ai.ovh.net/model/{model_id}

The returned object describes the "model" object according to Prescience Serving.

Example of result:

{
    "id": "model",
    "properties": {
        "created.timestamp": 1537170170985,
        "accessed.timestamp": null,
        "file.size": 3737,
        "file.md5sum": "a13e6e482bb2e62d1376b502f8cbc8a2"
    },
    "schema": {
        "argumentsFields": [{
            "id": "hours-per-week",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "capital-gain",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "education-num",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "age",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "fnlwgt",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "capital-loss",
            "dataType": "integer",
            "opType": "ordinal"
        }],
        "transformFields": [{
            "id": "imputed_hours-per-week",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "imputed_capital-gain",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "imputed_education-num",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "imputed_age",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "imputed_fnlwgt",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "imputed_capital-loss",
            "dataType": "integer",
            "opType": "ordinal"
        }, {
            "id": "scaled_imputed_hours-per-week",
            "dataType": "double",
            "opType": "continuous"
        }, {
            "id": "scaled_imputed_capital-gain",
            "dataType": "double",
            "opType": "continuous"
        }, {
            "id": "scaled_imputed_education-num",
            "dataType": "double",
            "opType": "continuous"
        }, {
            "id": "scaled_imputed_age",
            "dataType": "double",
            "opType": "continuous"
        }, {
            "id": "scaled_imputed_fnlwgt",
            "dataType": "double",
            "opType": "continuous"
        }, {
            "id": "scaled_imputed_capital-loss",
            "dataType": "double",
            "opType": "continuous"
        }]
    }
}
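
The description can be used to check a query before sending it. The sketch below assumes that `argumentsFields` lists the inputs expected in a query's `arguments`, as the inference example in this guide suggests; both helper names are hypothetical.

```python
def argument_ids(description):
    """List the ids of the fields declared under schema.argumentsFields."""
    fields = description.get("schema", {}).get("argumentsFields", [])
    return [field["id"] for field in fields]

def missing_arguments(description, arguments):
    """Return the declared field ids that are absent from an arguments dict."""
    return [name for name in argument_ids(description) if name not in arguments]

# Shortened version of the response shown above.
description = {
    "id": "model",
    "schema": {
        "argumentsFields": [
            {"id": "hours-per-week", "dataType": "integer", "opType": "ordinal"},
            {"id": "age", "dataType": "integer", "opType": "ordinal"},
        ]
    },
}
print(missing_arguments(description, {"age": 1}))
```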

Model evaluation

During the preprocessing stage, a data transformation is performed. Since the model is based on the output of this transformation, it is imperative that the data is transformed before using the model. Prescience Serving provides methods of performing both this transformation and the inference.

The Serving platform allows you to perform the following:

  • Transformation and evaluation
  • Evaluation only
  • Transformation only

| Method | URL | Description |
| --- | --- | --- |
| POST | https://prescience-serving.ai.ovh.net/eval/{model_id}/model | Unit inference |
| POST | https://prescience-serving.ai.ovh.net/eval/{model_id}/model/batch/csv | Batch inference from a CSV file |
| POST | https://prescience-serving.ai.ovh.net/eval/{model_id}/model/batch/json | Batch inference from a JSON array |
| POST | https://prescience-serving.ai.ovh.net/eval/{transform_id}/transform | Unit transformation |
| POST | https://prescience-serving.ai.ovh.net/eval/{transform_id}/transform/batch/csv | Batch transformation from a CSV file |
| POST | https://prescience-serving.ai.ovh.net/eval/{transform_id}/transform/batch/json | Batch transformation from a JSON array |
| POST | https://prescience-serving.ai.ovh.net/eval/{transform_model_id}/transform-model | Transformation associated with the model, followed by unit inference |
| POST | https://prescience-serving.ai.ovh.net/eval/{transform_model_id}/transform-model/batch/csv | Batch transformation associated with the model, followed by inference, from a CSV file |
| POST | https://prescience-serving.ai.ovh.net/eval/{transform_model_id}/transform-model/batch/json | Batch transformation associated with the model, followed by inference, from a JSON array |

| Parameter | Type | In | Required | Default | Meaning |
| --- | --- | --- | --- | --- | --- |
| id | String | JSON | No | | Query identifier |
| arguments | Dict | JSON | Yes | | Query arguments |
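
The nine endpoints follow one pattern: an identifier, a kind of operation, and an optional batch format. As a sketch, the URL could be derived like this; the helper name `eval_url` and its argument names are hypothetical.

```python
SERVING_URL = "https://prescience-serving.ai.ovh.net"
KINDS = {"model", "transform", "transform-model"}
BATCH_FORMATS = {None, "csv", "json"}

def eval_url(identifier, kind="transform-model", batch=None):
    """Return the Serving URL for a unit call, or a batch call if batch is set."""
    if kind not in KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    if batch not in BATCH_FORMATS:
        raise ValueError(f"unknown batch format: {batch!r}")
    url = f"{SERVING_URL}/eval/{identifier}/{kind}"
    return url if batch is None else f"{url}/batch/{batch}"

print(eval_url("my-model"))
print(eval_url("my-model", kind="model", batch="csv"))
```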

Example of unit inference:

example.json file

{
    "arguments": {
        "hours-per-week": 1,
        "capital-gain": 1,
        "education-num": 1,
        "age": 1,
        "fnlwgt": 1,
        "capital-loss": 1
    }
}

Query

curl -H "Authorization: Bearer ${TOKEN}" \
     -H "Content-Type:application/json" \
     -X POST https://prescience-serving.ai.ovh.net/eval/my-model/transform-model \
     --data-binary "@example.json"

Example of batch inference from a JSON array:

example.json file

[
    {
        "id": "eval-1",
        "arguments": {
            "hours-per-week": 1,
            "capital-gain": 1,
            "education-num": 1,
            "age": 1,
            "fnlwgt": 1,
            "capital-loss": 1
        }
    },
    {
        "id": "eval-2",
        "arguments": {
            "hours-per-week": 1,
            "capital-gain": 1,
            "education-num": 1,
            "age": 1,
            "fnlwgt": 1,
            "capital-loss": 1
        }
    }
]

Query

curl -H "Authorization: Bearer ${TOKEN}" \
     -H "Content-Type:application/json" \
     -X POST https://prescience-serving.ai.ovh.net/eval/my-model/transform-model/batch/json \
     --data-binary "@example.json"
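
For larger batches, the JSON array can be generated from rows of arguments. This sketch numbers the queries `eval-1`, `eval-2`, and so on, mirroring the example above; the helper name `batch_payload` and the sample rows are hypothetical.

```python
import json

def batch_payload(rows, prefix="eval"):
    """Wrap each dict of arguments in a query object with a sequential id."""
    return [
        {"id": f"{prefix}-{index}", "arguments": dict(row)}
        for index, row in enumerate(rows, start=1)
    ]

# Hypothetical input rows using two of the fields from the example above.
rows = [
    {"hours-per-week": 1, "age": 1},
    {"hours-per-week": 40, "age": 35},
]
print(json.dumps(batch_payload(rows), indent=4))
```

Giving each query its own `id` makes it possible to match results back to the input rows.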

Go further

Join our community of users on https://community.ovh.com/en/.
