
This short tutorial shows how to store data into the prediction service with python and requests.

For IDL you can take a similar route by adapting these instructions to Access to REST-Services in IDL.

Routes

First of all we have to know which routes we need to use. For this it is recommended to take a look at the Swagger UI (http://localhost:8004/ui/). Under the Edit tab you will find all routes that can be used to insert or update data in the prediction service.

For this tutorial we only use two of them: one to add new datasets and one to add new configuration sets to the prediction service.

  • /dataset/bulk
  • /configset/{dataset}
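The {dataset} part of the second route is a placeholder that is replaced by the dataset name when the URL is built. A minimal sketch, assuming the service runs on localhost:8004 as in the Swagger UI above:

```python
# build the two route URLs used in this tutorial
base_url = "http://localhost:8004"   # adjust to your deployment
dataset = "ml-algorithms"            # the dataset used in this tutorial

bulk_url = "%s/dataset/bulk" % base_url
configset_url = "%s/configset/%s" % (base_url, dataset)

print(configset_url)  # http://localhost:8004/configset/ml-algorithms
```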

Requests

Both of these routes are POST routes, so they have to be called with a POST request. To create simple POST requests we recommend the python package requests.

A POST request needs two parameters: the address itself, which is given by the route, and the data that will be attached to the request.

import requests

r = requests.post("http://httpbin.org/post", json={"key": "value"})
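Before sending anything to the prediction service you can inspect what such a request will put on the wire. requests lets you build and prepare a request without sending it; a small sketch:

```python
import requests

# build the POST request without sending it
req = requests.Request("POST", "http://httpbin.org/post", json={"key": "value"})
prepared = req.prepare()

# the json argument serializes the body and sets the Content-Type header for us
print(prepared.method)                   # POST
print(prepared.headers["Content-Type"])  # application/json
```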

Implementation

Prepare

First of all you have to define the data you would like to store into the prediction service. In our example we are going to store some predictions from a machine learning algorithm into the dataset ml-algorithms.

# define data to store
ml_datasets = [
    {
        "name": "ml-algorithms",
        "responsible": "John Doe",
        "description": "no comment!"
    }
]

ml_configurations = {
    'model': { '...': '...' },
    'weight-matrix': { '...': '...' },
    'biases': { '...': '...' }
}

ml_result = {
    'name': 'ml-multilayer-perceptron',
    'timestamp': '2016-01-22T17:27:59.001Z',
    'configurations': ml_configurations,
    'predictions': {
        'time-left': '32h12m04s',
        'position': {
            'lat_hg': 14.0,
            'long_hg': -3.4
        },
        'probability': '88%',
        'class': 'M'
    }
}

The problem now is that ml_result is not structured in a way the prediction service understands. We therefore have to restructure it to satisfy the following definition:

  • Every configuration set needs at least a name attribute.
  • Every specific configuration has to be added to the config_data attribute and has to be a key-value pair.
  • Every specific prediction has to be added to the prediction_data attribute and has to be a key-value pair.

This would then look like this:

# define data to store
post_data = {
    'name': 'ml-multilayer-perceptron',
    'timestamp': '2016-01-21T17:27:59.001Z',
    'config_data': ml_configurations,
    'prediction_data': {
        'time-left': '32h12m04s',
        'position': {
            'lat_hg': 14.0,
            'long_hg': -3.4
        },
        'probability': '88%',
        'class': 'M'
    }
}
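The renaming of configurations and predictions to config_data and prediction_data can also be done in code. A minimal helper (the function name to_configset is our own choice, not part of the service):

```python
# example result in the original structure (from the Prepare step)
ml_result = {
    'name': 'ml-multilayer-perceptron',
    'timestamp': '2016-01-22T17:27:59.001Z',
    'configurations': {'model': {'...': '...'}},
    'predictions': {'probability': '88%', 'class': 'M'},
}

def to_configset(result):
    """Map a result dict onto the attribute names the prediction
    service expects: name, config_data and prediction_data."""
    return {
        'name': result['name'],
        'timestamp': result['timestamp'],
        'config_data': result['configurations'],
        'prediction_data': result['predictions'],
    }

post_data = to_configset(ml_result)
```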

Ingest

Ingesting this post_data is now very simple. We first add the new dataset to the prediction service and then add a new configuration set to it.

# add dataset
print('creating dataset...')
requests.post("http://localhost:8004/dataset/bulk", json=ml_datasets)

# add configuration set
print('storing data...')
requests.post("http://localhost:8004/configset/%s" % ml_datasets[0]['name'], json=post_data)

The addresses of the routes are the same we looked up before.
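Note that requests does not raise an error for HTTP error codes on its own, so it is worth checking the result of each POST. A sketch with error handling, assuming the same route as above:

```python
import requests

ml_datasets = [{
    "name": "ml-algorithms",
    "responsible": "John Doe",
    "description": "no comment!"
}]

try:
    r = requests.post("http://localhost:8004/dataset/bulk", json=ml_datasets)
    r.raise_for_status()  # raises requests.HTTPError for 4xx/5xx answers
    print("dataset created, status", r.status_code)
except requests.exceptions.RequestException as exc:
    # covers connection problems as well as HTTP error codes
    print("ingest failed:", exc)
```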

Retrieve

Now, to check that everything worked, we should retrieve all configuration sets stored under this dataset. This can be done with a GET request.

# retrieving data
print('downloading all data...')
ml_config_sets = requests.get("http://localhost:8004/configset/%s/list" % ml_datasets[0]['name']).json()

print(ml_config_sets)

Source Code

Here you can download the full Python source code.

prediction_request_ingest.py
