This short tutorial shows how to store data in the property service with Python and the requests package (see also Access to REST-Services in Python).

For IDL you can take a similar route by adapting these instructions to Access to REST-Services in IDL.

Routes

...

The Property Service

The property service is a web interface for requesting, inserting and modifying property data in the database. Operations are performed by sending HTTP requests to URLs, and each operation is defined as a so-called route. The property service comes with a graphical user interface (http://localhost:8002/ui/) which provides visual access to all available routes. All routes that insert or modify data are listed under the Edit section.

Image: Swagger API

For this tutorial we use only two routes: one to add new datasets and one to add a new region with properties.

  • /dataset/bulk
  • /region/{dataset}

Requests

Both routes are POST routes, so they have to be called with a POST request. To create simple POST requests we recommend the Python package requests.

A POST request needs two parameters: the address, which is given by the route, and the data that is attached to the request.

Code Block
languagepy
import requests
r = requests.post("http://httpbin.org/post", json={"key": "value"})

Each route can hold up to three different parameter types, which we discuss in detail using the route /region/{dataset} as an example.

First, the structure of a generic URL and its components:

Panel
titleGeneric URL structure
https://localhost:8002/region/dataset1?algorithm_run_id=1
\___/   \_______/ \__/\______________/ \________________/
  |         |      |         |                 |
Scheme    Host    Port      Path             Query
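
The same decomposition can be reproduced with Python's standard library. This is a minimal sketch that only splits the example URL from the panel above; it does not contact the property service.

```python
from urllib.parse import urlsplit, parse_qs

# example URL taken from the panel above
url = "https://localhost:8002/region/dataset1?algorithm_run_id=1"
parts = urlsplit(url)

print(parts.scheme)            # https
print(parts.hostname)          # localhost
print(parts.port)              # 8002
print(parts.path)              # /region/dataset1
print(parse_qs(parts.query))   # {'algorithm_run_id': ['1']}
```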

The definition of the route /region/{dataset} contains three parameters, one of each type (path, body, query). The types are defined as follows:

Parameter Type | Description
Path | Parameter is part of the URL’s path. Here, {dataset} is a path parameter.
Query | Parameter is part of the URL’s query. Here, the {algorithm_run_id} is a query parameter.
Body | Parameter is part of the payload that is appended to the HTTP request. Here, the {property_group_data} is a body parameter.

For a more technical view, click the “Try it out!” button. This generates the corresponding curl command, e.g.:

Panel
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{ \
   "data": {}, \
   "lat_hg": 0, \
   "long_hg": 0, \
   "time_start": "2016-08-26T06:29:01.424Z" \
 }' 'http://localhost:8002/region/dataset1?algorithm_run_id=1'
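
To make the three parameter types concrete, the following sketch assembles the same URL and payload as the curl command above using only the standard library. The helper build_region_request is hypothetical and exists only for illustration; the host, port and example values are taken from the curl command.

```python
import json
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8002"  # host and port as used in this tutorial


def build_region_request(dataset, algorithm_run_id, property_group_data):
    """Return (url, body) for a POST to /region/{dataset}.

    Hypothetical helper for illustration; not part of the property service.
    """
    # path parameter: substituted into the route template
    path = "/region/" + quote(dataset, safe="")
    # query parameter: appended after the '?'
    query = urlencode({"algorithm_run_id": algorithm_run_id})
    # body parameter: serialized to JSON and sent as the request payload
    body = json.dumps(property_group_data)
    return BASE_URL + path + "?" + query, body


url, body = build_region_request(
    "dataset1",
    1,
    {"data": {}, "lat_hg": 0, "long_hg": 0,
     "time_start": "2016-08-26T06:29:01.424Z"},
)
print(url)
```

With requests, the encoding does not have to be done by hand: requests.post accepts the query parameters as params= and the body as json=.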

Implementation

Prepare

In our example we are going to store some machine learning results in the dataset ml-algorithms.

Code Block
languagepy
# define data to store
ml_datasets = [
    {
        "name": "ml-algorithms",
        "responsible": "John Doe",
        "type": "algorithm",
        "description": "no comment!"
    }
]
ml_result = {
    'time_start': '2016-01-22T17:27:59.001Z',
    'lat_hg': 12,
    'long_hg': 4.2,
    'algorithm_name': 'LassoCV',
    'algorithm_parameters': {
        'store': 'what you want',
        'also here': [1, 2, 3, 4]
    }
}

Unfortunately, the structure of ml_result given above does not yet fit the model of the property_group_data definition given by the route /region/{dataset}. It has to be restructured to meet the following requirements:

  • Every region needs at least a time_start, lat_hg and long_hg attribute.
  • Every specific data property has to be added to the data attribute and has to be a key-value pair.

...

Code Block
languagepy
# define data to store
post_data = {
    'time_start': '2016-01-21T17:27:59.001Z',
    'lat_hg': 12,
    'long_hg': 4.2,
    'data': {
        'LassoCV': {
            'store': 'what you want',
            'also here': [1, 2, 3, 4]
        }
    }
}
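
The manual restructuring above could also be done by a small helper. The following is only a sketch: to_property_group_data is a hypothetical function, and it assumes the input carries the keys used in this tutorial (time_start, lat_hg, long_hg, algorithm_name, algorithm_parameters).

```python
def to_property_group_data(result):
    """Restructure a flat result dict into the region layout described above.

    Hypothetical helper for illustration; not part of the property service.
    """
    required = ("time_start", "lat_hg", "long_hg")
    missing = [key for key in required if key not in result]
    if missing:
        raise ValueError("missing required attributes: %s" % ", ".join(missing))

    region = {key: result[key] for key in required}
    # everything algorithm-specific goes into 'data' as a key-value pair
    region["data"] = {result["algorithm_name"]: result["algorithm_parameters"]}
    return region


ml_result = {
    "time_start": "2016-01-22T17:27:59.001Z",
    "lat_hg": 12,
    "long_hg": 4.2,
    "algorithm_name": "LassoCV",
    "algorithm_parameters": {"store": "what you want", "also here": [1, 2, 3, 4]},
}
post_data = to_property_group_data(ml_result)
print(post_data["data"])
```

Checking the required attributes before sending gives a clear local error instead of relying on the service's response.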

Ingest

Finally, ingesting post_data is simple. We first add the new dataset to the property service and then add a new region, using the routes listed above.

Code Block
languagepy
# add dataset
print('creating dataset...')
requests.post("http://localhost:8002/dataset/bulk", json=ml_datasets)

# add region
print('storing data...')
requests.post("http://localhost:8002/region/%s" % ml_datasets[0]['name'], json=post_data)
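
The calls above ignore the responses. A slightly more defensive variant might check each response before continuing. The sketch below assumes the property service signals failures through HTTP error status codes, which this tutorial does not confirm. The function ingest and the dry-run stand-ins are hypothetical; post is expected to behave like requests.post.

```python
def ingest(post, base_url, datasets, region):
    """Create the datasets, then store one region in the first dataset.

    `post` is any callable with the calling convention of requests.post.
    Hypothetical helper for illustration.
    """
    response = post(base_url + "/dataset/bulk", json=datasets)
    response.raise_for_status()  # requests raises here on 4xx/5xx responses

    response = post(base_url + "/region/" + datasets[0]["name"], json=region)
    response.raise_for_status()
    return response


# Dry run without a running service: a stand-in that only records the calls.
class DryRunResponse:
    def raise_for_status(self):
        pass


sent = []


def dry_run_post(url, json=None):
    sent.append((url, json))
    return DryRunResponse()


ingest(dry_run_post, "http://localhost:8002",
       [{"name": "ml-algorithms"}], {"lat_hg": 12})
print([url for url, _ in sent])
```

Against a running service, the call would be ingest(requests.post, "http://localhost:8002", ml_datasets, post_data).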

The route addresses are the same ones we looked up before.

Retrieve

To check whether everything worked, we retrieve all regions stored under this dataset. This can be done with a GET request.

Code Block
languagepy
# retrieving data
print('downloading all properties...')
ml_regions = requests.get("http://localhost:8002/region/%s/list" % ml_datasets[0]['name']).json()

print(ml_regions)

Source Code

Here you can download the full Python source code.

ingest_property_data.py