EGE Conversion and Validation RESTful Web Service interface

1. Introduction

EGE RESTful Web Service is a simple web application which exposes the EGE functionality as a set of RESTful online services.

Assuming that the EGE RESTful Web Service has been installed under the name 'ege-webapp' in the application/servlet container, the following URL syntax should be used to perform operations:

  • http://[server_address]:[port]/ege-webapp/[resources]

The EGE RESTful Web Service supports the following operations (request parameters are specified in the [resources] part):

  • Conversion operations
    1. List input data types: http://[server_address]:[port]/ege-webapp/Conversions/ - lists all the supported input data types.
    2. List conversion paths: http://[server_address]:[port]/ege-webapp/Conversions/[input data type] - lists all the possible conversion paths for the given [input data type].
    3. Convert: http://[server_address]:[port]/ege-webapp/Conversions/[conversion path] - performs conversion of the given input data according to the given [conversion path].
  • Validation operations
    1. List input data types: http://[server_address]:[port]/ege-webapp/Validation/ - lists all the supported input data types.
    2. Validate: http://[server_address]:[port]/ege-webapp/Validation/[input data type] - performs validation of the given input data.

The following sections describe in detail the above operations, in particular the syntax of the [input data type] and the [conversion path] parts.

2. Conversion

2.1. 'List input data types' operation

This operation lists all data types (called input data types) which can be converted by this particular instance of the EGE RESTful Web Service. The operation should be invoked using the GET method of the HTTP protocol. In order to invoke this operation, use the following URL (we assume that the EGE RESTful Web Service is installed under 'ege-webapp' path of the application/servlet container):

  • http://[server_address]:[port]/ege-webapp/Conversions/

where [server_address] is the address of a server on which the EGE RESTful application is running, and the [port] is the port number of the server.

The following responses are possible:

  • status code 200 - when the operation has been performed successfully and conversions are available (at least one input data type exists);
  • status code 204 - when no input data types are found, i.e. there are no conversions available;
  • status code 405 - 'wrong method' error, when the requested URL suggests a conversion operation.

For example, for the following request:

  • http://localhost:8080/ege-webapp/Conversions/

the following XML response is provided:

<?xml version="1.0" encoding="UTF-8"?>
<input-data-types xmlns:xlink="http://www.w3.org/1999/xlink">
	<input-data-type id="TEI,text/xml" xlink:href="http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/" />
	<input-data-type id="MASTER,text/xml" xlink:href="http://localhost:8080/ege-webapp/Conversions/MASTER%3Atext%3Axml/" />
	<input-data-type id="EAD,text/xml" xlink:href="http://localhost:8080/ege-webapp/Conversions/EAD%3Atext%3Axml/" />
</input-data-types>
		

Response data includes a list of nodes describing each supported input data type. The following attributes are available for each input data type:

  • id - the name of the input data type which should be used in further requests
  • xlink:href - reference to the EGE RESTful operation - this operation will list all possible conversion paths for this input data type (identified by the 'id' value).
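
The response described above can be read with any XML parser. The following is a minimal sketch in Python (standard library only) that maps each 'id' to its 'xlink:href'; the function name parse_input_data_types and the SAMPLE_RESPONSE constant are illustrative and not part of the EGE service:

```python
import xml.etree.ElementTree as ET

# Attribute name of xlink:href in ElementTree's {namespace}name notation.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Sample copied from the response shown above.
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<input-data-types xmlns:xlink="http://www.w3.org/1999/xlink">
  <input-data-type id="TEI,text/xml" xlink:href="http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/" />
  <input-data-type id="MASTER,text/xml" xlink:href="http://localhost:8080/ege-webapp/Conversions/MASTER%3Atext%3Axml/" />
  <input-data-type id="EAD,text/xml" xlink:href="http://localhost:8080/ege-webapp/Conversions/EAD%3Atext%3Axml/" />
</input-data-types>"""


def parse_input_data_types(xml_text):
    """Return a dict mapping each input data type id to its xlink:href URL."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        node.get("id"): node.get(XLINK_HREF)
        for node in root.iter("input-data-type")
    }


if __name__ == "__main__":
    for type_id, href in parse_input_data_types(SAMPLE_RESPONSE).items():
        print(type_id, "->", href)
```

Because the function only looks for 'input-data-type' nodes, the same sketch should also work for a list whose root element has a different name.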

2.2. 'List conversion paths' operation

This operation provides a list of all conversion paths for the specified input data type. Like the previous operation, it is invoked using the GET method of the HTTP protocol. To obtain a list of conversion paths, use the following URL:

  • http://[server_address]:[port]/ege-webapp/Conversions/[input_data_type]

where [input_data_type] is the input data type identifier enclosed in the response of the "list input data types" operation.

Note: the [input_data_type] section of the URL contains the code '%3A', which is the percent-encoded ':' (colon) character. The colon separates the key parts of the data type from each other.
For example:

  • In the URL http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/, [input_data_type] denotes the TEI(text/xml) data type, encoded to fit the URI syntax requirements. The first part is the format name, in our case 'TEI', followed by the first section of the MIME type ('text') and then the second section ('xml').
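
The encoding described in the note can be sketched in Python as follows. The rule used here (replace ',' and '/' in the 'id' value with ':' and percent-encode the result) is inferred from the examples in this document, not taken from an EGE specification, and the function names are illustrative. In practice it is usually simpler to follow the 'xlink:href' values returned by the service.

```python
from urllib.parse import quote, unquote


def encode_data_type(data_type_id):
    """Turn an id such as 'TEI,text/xml' into the URL segment 'TEI%3Atext%3Axml'."""
    format_name, mime_type = data_type_id.split(",", 1)
    mime_main, mime_sub = mime_type.split("/", 1)
    # safe="" forces ':' to be percent-encoded as %3A.
    return quote(":".join([format_name, mime_main, mime_sub]), safe="")


def decode_data_type(segment):
    """Turn 'TEI%3Atext%3Axml' back into ('TEI', 'text/xml')."""
    format_name, mime_main, mime_sub = unquote(segment).split(":")
    return format_name, mime_main + "/" + mime_sub


if __name__ == "__main__":
    print(encode_data_type("TEI,text/xml"))      # TEI%3Atext%3Axml
    print(decode_data_type("EAD%3Atext%3Axml"))  # ('EAD', 'text/xml')
```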

The following responses are possible:

  • status code 200 - when the operation has been performed successfully and conversion paths exist for the given input data type;
  • status code 204 - when no conversion paths are found for the given input data type;
  • status code 405 - 'wrong method' error, when the requested URL suggests a "conversion" operation.

For example, for the following request:

  • http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/

the following XML could be a response :

<?xml version="1.0" encoding="UTF-8"?>
<conversions-paths xmlns:xlink="http://www.w3.org/1999/xlink">
	<conversions-path xlink:href="http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/TEI%3Aapplication%3Ax-latex/" >
		<path-name><![CDATA[ 
			I:TEI,text/xml/O:TEI,application/x-latex(TEI Converter) 
		]]></path-name>
		<conversion index="0" >
			<property id="pl.psnc.dl.ege.tei.profileNames">
				<definition><![CDATA[default,enrich,iso]]></definition>
				<type>array</type>
				<property-name>Profile</property-name>
			</property>
		</conversion>
	</conversions-path>
</conversions-paths>

The response contains a list of conversion paths available for the given input data type. Each 'conversions-path' node has the following attribute:

  • xlink:href - reference to EGE RESTful operation of conversion,

and additional sub-nodes :

  • path-name - which is a name of a conversion path;
  • conversion - which represents one conversion action (being a part of the conversion path).

Each 'conversion' node (representing a conversion action) has an 'index' attribute, which gives the position of the conversion action in the sequence of conversions.

Additionally, every 'conversion' node can have one or more sub-nodes named 'property' which describe the possible parameters for a conversion action. Each 'property' node contains an attribute 'id' which identifies the property, and the following sub-nodes:

  • definition - possible parameters, e.g. accepted property values;
  • type - data type of the property;
  • property-name - label of the property.
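
As an illustration of the structure described above, the following Python sketch (standard library only) collects each conversion path together with its conversion actions and their properties. The function name and the shape of the returned dictionaries are assumptions made for this example; the sample XML follows the element names listed above.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

SAMPLE_PATHS = """<?xml version="1.0" encoding="UTF-8"?>
<conversions-paths xmlns:xlink="http://www.w3.org/1999/xlink">
  <conversions-path xlink:href="http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/TEI%3Aapplication%3Ax-latex/">
    <path-name><![CDATA[
      I:TEI,text/xml/O:TEI,application/x-latex(TEI Converter)
    ]]></path-name>
    <conversion index="0">
      <property id="pl.psnc.dl.ege.tei.profileNames">
        <definition><![CDATA[default,enrich,iso]]></definition>
        <type>array</type>
        <property-name>Profile</property-name>
      </property>
    </conversion>
  </conversions-path>
</conversions-paths>"""


def _text(node, child_name):
    """Return the stripped text of a child element, or '' if it is missing."""
    return (node.findtext(child_name) or "").strip()


def parse_conversion_paths(xml_text):
    """Return a list of dicts, one per 'conversions-path' node."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    paths = []
    for path_node in root.iter("conversions-path"):
        conversions = []
        for conv in path_node.findall("conversion"):
            properties = {
                prop.get("id"): {
                    "definition": _text(prop, "definition"),
                    "type": _text(prop, "type"),
                    "label": _text(prop, "property-name"),
                }
                for prop in conv.findall("property")
            }
            conversions.append(
                {"index": int(conv.get("index")), "properties": properties}
            )
        paths.append(
            {
                "href": path_node.get(XLINK_HREF),
                "name": _text(path_node, "path-name"),
                "conversions": conversions,
            }
        )
    return paths


if __name__ == "__main__":
    for path in parse_conversion_paths(SAMPLE_PATHS):
        print(path["name"], "->", path["href"])
```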

2.3. 'Convert' operation

This operation performs the conversion of a given file using the selected conversion path. This operation can be executed only through the POST method of the HTTP protocol. To perform the conversion of a file the client application has to send the POST request using the URL:

  • http://[server_address]:[port]/ege-webapp/Conversions/[conversion path]

where [conversion path] is a sequence of data types (the first one is the input data type), so

  • [conversion path]=[input data type]/[data_type_1]/[data_type_2]/.../[data_type_N]/ .

The request has to enclose the file for conversion and the optional request parameter named 'properties' which specifies the properties for concrete conversion actions in the conversion path. The [conversion path] should be understood as a sequence of conversion actions. For instance, the following conversion path:

  • TEI%3Atext%3Axml/TEI%3Aapplication%3Amsword/

defines one conversion action - conversion of the file in the TEI(text/xml) data type into a file in the TEI(application/msword) data type which is the result of the conversion operation.

Another example :

  • EAD%3Atext%3Axml/TEI%3Aapplication%3Amsword/TEI%3Aapplication%3Apdf/

defines two conversion actions: the first one converts the EAD(text/xml) input file into a TEI(application/msword) file, the second one converts the resulting TEI(application/msword) file into a TEI(application/pdf) file which is the final result of the conversion operation.

Basically, each pair of data type elements in the conversion path is a conversion action to be performed on the given file.
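
The pairing rule can be shown with a short Python sketch that splits an encoded conversion path into its conversion actions. The function name conversion_actions is illustrative only:

```python
from urllib.parse import unquote


def conversion_actions(conversion_path):
    """Split an encoded conversion path into (source, target) pairs.

    Each pair of neighbouring data types is one conversion action.
    """
    segments = [unquote(part) for part in conversion_path.split("/") if part]
    return list(zip(segments, segments[1:]))


if __name__ == "__main__":
    path = "EAD%3Atext%3Axml/TEI%3Aapplication%3Amsword/TEI%3Aapplication%3Apdf/"
    for index, (source, target) in enumerate(conversion_actions(path)):
        print(index, source, "->", target)
```

A path with N+1 data types therefore yields N conversion actions.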

Optional parameter 'properties' should contain XML data similar to the following:

<conversions>
	<conversion index="[position of conversion action]" >
			<property id="[id of property]">
				[assigned value of this property]
			</property>
	</conversion>
</conversions>

where every 'conversion' node has the attribute 'index' - which represents the number of the conversion action within the conversion sequence.

Each 'conversion' node can contain sub-nodes named 'property' with the attribute 'id' and the value assigned to it. All possible properties for particular conversion action can be obtained from the "list conversion paths" operation.
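
A value for the 'properties' parameter can be generated rather than written by hand. The sketch below builds it with Python's standard library; the function name build_properties and the shape of its argument (conversion index mapped to property id and value) are assumptions made for this example:

```python
import xml.etree.ElementTree as ET


def build_properties(settings):
    """Build the 'properties' XML.

    settings maps a conversion index to a dict of {property id: value}.
    """
    root = ET.Element("conversions")
    for index in sorted(settings):
        conversion = ET.SubElement(root, "conversion", {"index": str(index)})
        for prop_id, value in settings[index].items():
            prop = ET.SubElement(conversion, "property", {"id": prop_id})
            prop.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_properties({0: {"pl.psnc.dl.ege.tei.profileNames": "default"}}))
```

The output is not pretty-printed; whether the service is sensitive to whitespace around property values is not stated in this document.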

For the 'convert' operation, the following responses are possible:

  • 200 status code - the conversion was performed without any problems and the converted file was returned;
  • 400 status code - returned when the requested conversion path does not exist;
  • 405 status code - 'wrong method' error, returned when trying to perform the conversion operation using the GET method of the HTTP protocol.

2.4. Example use case

Let us assume that:

  • [server_address] is 'localhost',
  • [port] is 8080,

and we want to perform conversion of a file from TEI(text/xml) to TEI(application/msword) format using the default conversion profile.

To achieve our goal we have to:

  • Step 1: list input data types - perform the "list input data types" operation by sending an HTTP request with the GET method to the URL: http://localhost:8080/ege-webapp/Conversions/ .

    In response we receive the following XML data :

    	<?xml version="1.0" encoding="UTF-8"?>
    	<input-data-types xmlns:xlink="http://www.w3.org/1999/xlink">
    		<input-data-type id="TEI,text/xml" xlink:href="http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/" />
    		<input-data-type id="MASTER,text/xml" xlink:href="http://localhost:8080/ege-webapp/Conversions/MASTER%3Atext%3Axml/" />
    		<input-data-type id="EAD,text/xml" xlink:href="http://localhost:8080/ege-webapp/Conversions/EAD%3Atext%3Axml/" />
    	</input-data-types>
    	

    Note that there is one node named 'input-data-type' which matches the data type of our input file - "TEI(text/xml)".

  • Step 2: list conversion paths for the input data type which matches the data type of our input file:

    From the previous response we have to select the 'xlink:href' attribute (http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/) of the 'TEI,text/xml' input data type and use it to obtain the conversion paths.

    In order to receive the list of conversion paths, we perform the 'list conversion paths' operation of the EGE RESTful Web Service by sending an HTTP request with the GET method to the extracted URL: http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/.

  • Step 3: handling the "list conversion paths" response.

    In the response we will receive the following XML data :

    	
    <?xml version="1.0" encoding="UTF-8"?>
    <conversions-paths xmlns:xlink="http://www.w3.org/1999/xlink">
    	<conversions-path xlink:href="http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/TEI%3Aapplication%3Amsword/" >
    		<path-name><![CDATA[ 
    			I:TEI,text/xml/O:TEI,application/msword(TEI Converter) 
    		]]></path-name>
    		<conversion index="0" >
    			<property id="pl.psnc.dl.ege.tei.profileNames">
    				<definition><![CDATA[default,enrich,iso, ]]></definition>
    				<type>array</type>
    				<property-name>Profile</property-name>
    			</property>
    		</conversion>
    	</conversions-path>
    </conversions-paths>
    	

    The XML contains only one conversion path, represented by the 'conversions-path' node, which allows us to convert from the TEI(text/xml) format to the TEI(application/msword) format.

  • Step 4 : setting conversion properties.

    The XML response returned by the "list conversion paths" operation contains a list of conversions. In the above case there is only one 'conversion' node. Each 'conversion' node defines the properties for the conversion action it represents. In our example we have the 'pl.psnc.dl.ege.tei.profileNames' property with three possible values: default, enrich and iso. We will use the 'default' value for the 'pl.psnc.dl.ege.tei.profileNames' property by setting the 'properties' parameter (for the 'convert' operation) to the following value:

    	<conversions>
    		<conversion index="0" >
    			<property id="pl.psnc.dl.ege.tei.profileNames">
    				default
    			</property>
    		</conversion>
    	</conversions>
    	
  • Step 5 : performing conversion.

    We have to perform a POST request on the following URL: http://localhost:8080/ege-webapp/Conversions/TEI%3Atext%3Axml/TEI%3Aapplication%3Amsword/ , with the 'properties' request parameter filled in and the input file to convert.

    As a result we will receive the converted file (in the TEI(application/msword) format - '.docx' extension).
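
The request from Step 5 could be assembled as in the following Python sketch (standard library only). This document does not state how the file has to be enclosed in the request, so the sketch assumes a multipart/form-data body; the file field name 'file' and the function name build_convert_request are likewise assumptions that may need to be adapted to the actual service. Only the 'properties' parameter name comes from the description above.

```python
import urllib.request
import uuid


def build_convert_request(url, file_name, file_bytes,
                          properties_xml=None, file_field="file"):
    """Build (but do not send) a POST request for the 'convert' operation.

    Assumption: the service accepts multipart/form-data and reads the
    uploaded file from a field called file_field (hypothetical name).
    """
    boundary = uuid.uuid4().hex
    parts = []
    if properties_xml is not None:
        # Optional 'properties' parameter as a plain form field.
        parts.append(
            ("--%s\r\n"
             'Content-Disposition: form-data; name="properties"\r\n\r\n'
             "%s\r\n" % (boundary, properties_xml)).encode("utf-8")
        )
    # The file to convert.
    parts.append(
        ("--%s\r\n"
         'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n"
         % (boundary, file_field, file_name)).encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(("--%s--\r\n" % boundary).encode("utf-8"))
    headers = {"Content-Type": "multipart/form-data; boundary=%s" % boundary}
    return urllib.request.Request(
        url, data=b"".join(parts), headers=headers, method="POST"
    )


if __name__ == "__main__":
    request = build_convert_request(
        "http://localhost:8080/ege-webapp/Conversions/"
        "TEI%3Atext%3Axml/TEI%3Aapplication%3Amsword/",
        "input.xml",
        b"<TEI/>",
        properties_xml='<conversions><conversion index="0">'
                       '<property id="pl.psnc.dl.ege.tei.profileNames">default'
                       "</property></conversion></conversions>",
    )
    print(request.get_method(), request.full_url)
    # Sending requires a running service, for example:
    # with urllib.request.urlopen(request) as response:
    #     converted = response.read()
```

Keeping the construction of the request separate from sending it makes it possible to inspect the body before contacting the server.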

3. Validation

3.1. 'List input data types' operation

This operation lists all the data types which can be validated by EGE RESTful Web Service. The operation should be invoked using the GET method of the HTTP protocol. In order to invoke this operation, use the following URL (we assume that the EGE RESTful Web Service is installed under 'ege-webapp' path of the application/servlet container):

  • http://[server_address]:[port]/ege-webapp/Validation/

where [server_address] is the address of a server on which the EGE RESTful application is running, and the [port] is the port number of the server.

The following responses are possible:

  • status code 200 - when the operation has been performed successfully and data types are available for validation;
  • status code 204 - when there are no supported data types;
  • status code 405 - 'wrong method' error, when the requested URL suggests a validation operation.

For example, for the following request:

  • http://localhost:8080/ege-webapp/Validation/

the following XML is received as a response:

<?xml version="1.0" encoding="UTF-8"?>
<validations xmlns:xlink="http://www.w3.org/1999/xlink">
	<input-data-type id="TEI,text/xml" xlink:href="http://localhost:8080/ege-webapp/Validation/TEI%3Atext%3Axml/" />
	<input-data-type id="MASTER,text/xml" xlink:href="http://localhost:8080/ege-webapp/Validation/MASTER%3Atext%3Axml/" />
	<input-data-type id="EAD,text/xml" xlink:href="http://localhost:8080/ege-webapp/Validation/EAD%3Atext%3Axml/" />
</validations>
		

The response data includes a list of nodes describing each supported data type. The following attributes are available for each data type:

  • id - the name of the data type which should be used in further requests;
  • xlink:href - reference to the operation of validation.

3.2. 'Validation' operation

This operation performs the validation of a given file against the selected input data type. This operation can only be executed through the POST method of the HTTP protocol. To perform the validation of a file, the client application has to send a POST request to the URL:

  • http://[server_address]:[port]/ege-webapp/Validation/[input_data_type]

where [input_data_type] is the input data type identifier enclosed in the response of the "list input data types" operation.

The request has to enclose the file for validation.

For the 'validation' operation, the following responses are possible:

  • 200 status code - validation was performed without any problems and the response XML was returned;
  • 400 status code - returned when either the request was formed improperly or the provided data type is not supported by any of the EGE validators.

With the 200 status code, the client application receives the validation result encoded in XML format.

For example, for the following request:

  • http://localhost:8080/ege-webapp/Validation/EAD%3Atext%3Axml

the following XML response is provided:

<?xml version="1.0" encoding="UTF-8"?>
<validation-result>
	<status>ERROR</status>
	<messages>
		<message>
			<![CDATA[ 
				1) Error in line (3), column (8) : Document root element "TEI.2", must match DOCTYPE root "ead".
			]]>
		</message>
	</messages>
</validation-result>

The response data contains one "validation-result" section with:

  • status - which may take one of three values: ERROR - when the validator found errors, SUCCESS - when no errors were found (warnings are possible), FATAL - when fatal errors occurred;
  • messages - which contains multiple "message" sections, each holding the text of an error or warning message.
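
A client might read the validation result as in the following Python sketch (standard library only). The function name parse_validation_result and the returned tuple are illustrative choices, not part of the service:

```python
import xml.etree.ElementTree as ET

SAMPLE_RESULT = """<?xml version="1.0" encoding="UTF-8"?>
<validation-result>
  <status>ERROR</status>
  <messages>
    <message>
      <![CDATA[
        1) Error in line (3), column (8) : Document root element "TEI.2", must match DOCTYPE root "ead".
      ]]>
    </message>
  </messages>
</validation-result>"""


def parse_validation_result(xml_text):
    """Return (status, messages) from a 'validation-result' document."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    status = (root.findtext("status") or "").strip()
    messages = [(node.text or "").strip() for node in root.iter("message")]
    return status, messages


if __name__ == "__main__":
    status, messages = parse_validation_result(SAMPLE_RESULT)
    print("Status:", status)
    for message in messages:
        print(" -", message)
```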

4. Error response

During any operation an unexpected error may occur, which results in a 500 status code in the HTTP response. In that case the application should also return XML data:

<?xml version="1.0" encoding="UTF-8"?>
<error msg="sample error message" exclass="com.my.SampleException">part of stack trace</error>

The response contains only one 'error' element with part of a stack trace as content and with the following attributes :

  • msg - which contains the exception message;
  • exclass - which contains the exception class.
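
The error document can be turned into a simple structure as sketched below in Python (standard library only). The function name parse_error and the keys of the returned dictionary are assumptions made for this example:

```python
import xml.etree.ElementTree as ET

SAMPLE_ERROR = """<?xml version="1.0" encoding="UTF-8"?>
<error msg="sample error message" exclass="com.my.SampleException">part of stack trace</error>"""


def parse_error(xml_text):
    """Extract the exception message, class and stack trace fragment."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        "message": root.get("msg"),
        "exception_class": root.get("exclass"),
        "stack_trace": (root.text or "").strip(),
    }


if __name__ == "__main__":
    print(parse_error(SAMPLE_ERROR))
```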