An Embulk plugin to ingest json records from REST API with transformation by jq.
- Plugin type: input
- Resume supported: yes
- Cleanup supported: yes
- Guess supported: no
- scheme: URI Scheme for the endpoint (string, default:
"https", allows:"https","http") - host: Hostname or IP address of the endpoint (string, required)
- port: Port number of the endpoint (integer, optional, allows:
0-65535) - path: Path of the endpoint (string, optional)
- headers: HTTP Headers (array of map, optional, allows: 1 element can contains 1 key-value.)
- method: HTTP Method (string, default:
"GET", allows:"GET","POST","PUT","PATCH","DELETE","GET","HEAD","OPTIONS","TRACE","CONNECT") - params: HTTP Request params. This is merged with params for pagenation when the
pageroption is specified. (array of map, optional, allows: 1 element can contains 1 key-value.) - body: HTTP Request body. (json, optional)
- success_condition: jq filter to check whether the response is succeeded or not. You can use
jqto query for the status code and the response body. (string,".status_code_class == 200") - transformer: jq filter to transform the api response json. (string,
"[.response_body]") - extract_transformed_json_array: If true, the plugin extracts the transformed json array, and ingest them as records. (boolean, default:
true) - pager: (the following options are acceptable, default:
{})- initial_params: Additional HTTP Request params that is used the first request. (array of map, optional, allows: 1 element can contains 1 key-value.)
- next_params: Additional HTTP Request params that is used the subsequent requests. The value is treated as a
jqfilter to transform the prior response. (array of map, optional, allows: 1 element can contains 1 key-value.) - next_body_transformer: jq filter to transform the prior response to the next request body. (string, default:
".request_body") - while: jq filter to check whether the pagination is required or not. You can use
jqto query for the status code and the response body. (string,"false") - interval_millis: Interval in milliseconds between requests. (integer, default:
100)
- retry: (the following options are acceptable, default:
{})- condition: jq filter to check whether the response is retryable or not. This condition will be used when it is determined that the response is not succeeded by
success_condition_jq. You can usejqto query for the status code and the response body. (string,"true") - max_retries: Maximum retries. (integer, default:
7) - initial_interval_millis: Initial retry interval in milliseconds. (integer, default:
1000) - max_interval_millis: Maximum retries interval in milliseconds. (integer, default:
60000)
- condition: jq filter to check whether the response is retryable or not. This condition will be used when it is determined that the response is not succeeded by
- show_request_body_on_error: Show request body on error. (boolean, default:
true) - default_timezone: Default timezone. (string, default:
"UTC") - default_timestamp_format: Default timestamp format. (string, default:
"%Y-%m-%d %H:%M:%S %z") - default_date: Default date. (string, default:
"1970-01-01")
About the jq filter
The following options accept the jq filter to transform the api response json.
- success_condition
- transformer
- pager/next_params
- pager/next_body_transformer
- retry/condition
All of the jq filters transform json that has the same format as the following.
{
"request_params": [
{"name": "foo", "value": "bar"}
],
"request_body": {
"foo": "bar"
},
"status_code": 201,
"status_code_class": 200,
"response_body": {
"foo": "bar",
"results": [
{"id": 1, "name": "foo"},
{"id": 2, "name": "bar"}
]
}
}The response of api is stored as the "response_body" field, so please note that the jq filter definition must start with .response_body in order to perform jq transformations on the API response results.
in:
type: http_json
scheme: http
host: localhost
port: 8080
path: /example
method: GET
transformer: '.response_body.integerValues'
success_condition: '.status_code_class == 200'
out:
type: stdoutFirstly, you need to start the mock server.
$ ./example/run-mock-server.shthen, you run the example.
$ ./gradlew gem
$ embulk run -Ibuild/gemContents/lib -X min_output_tasks=1 example/config.ymlThe requested records are shown on the mock server console.
$ ./gradlew test$ ./gradlew gem # -t to watch change of files and rebuild continuously
$ ./gradlew dependencies --write-locks## Just check the format violations
$ ./gradlew spotlessCheck
## Fix the all format violations
$ ./gradlew spotlessApplyA new tag is pushed, then a new gem will be released. See the Github Action CI Setting.
See. Github Releases