Obtain a sense-annotated document.
Resource URL
GET/POST http(s)://api.idilia.com/1/text/disambiguate.{format}
Description
Send a request to disambiguate a document. Can be used in synchronous or batch mode. In synchronous mode, the annotated document is returned in the response, either embedded in the response object or as a distinct part in a multipart response. In batch mode, the request returns immediately and the annotated document is deposited at the URI specified once generated.
The text to disambiguate can either be provided as parameter text or as a distinct part in a multipart request. HTTP method POST is used when issuing a multipart request. Either GET or POST may be used for single part requests.
After responding to an HTTP/1.1 request, the server keeps the connection open for 5 seconds. Another request can be issued on the same connection within this timeout.
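As an illustration of reusing one connection for several requests, here is a minimal Python sketch built on the standard library's `http.client`. The helper names `open_connection` and `fetch_all` and the 30-second socket timeout are assumptions made for this example. The only fact taken from this page is that the connection stays open for 5 seconds after a response.

```python
import http.client


def open_connection():
    # Creating the object does not connect yet; the socket opens on first request.
    return http.client.HTTPSConnection("api.idilia.com", timeout=30)


def fetch_all(conn, paths):
    """Issue GET requests one after another on a single connection.

    Returns a list of (status, body) tuples in request order.
    """
    results = []
    for path in paths:
        conn.request("GET", path)
        resp = conn.getresponse()
        # http.client requires the body to be read fully before the
        # same connection can carry the next request.
        results.append((resp.status, resp.read()))
    return results
```

A caller would pass the connection from `open_connection()` together with a list of request paths, and close it afterwards. If more than 5 seconds pass between requests, the server may already have closed the connection, so a new one should be opened.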
Request
The request is an HTTP GET or POST with the following query parameters:
| Parameter | Description |
|---|---|
| disambiguationRecipe | Recipe identifier for the linguistic processing of the request. Please refer to Sense Analysis Recipes for available values. Not all values may be supported by a specific configuration. Optional. Default: defaultPrecision. Example value: highestPrecision |
| key | When using simple authentication, this parameter contains the concatenation of your access key and private key. See Authentication for more details. Optional. Default: none. Example value: Idi123rlzKT90yoUavrzoRbkdHgsoZTiYR2qaYA8SxA |
| maxTokens | Maximum number of tokens to process in the source text. The text is tokenized up to the next paragraph boundary after this number of tokens is reached. Only tokenized text is annotated and returned. Optional. Default: 1000. Example value: 2500 |
| notificationURI | URI in the form "sqs://queue" in Amazon Simple Queue Service (SQS) where a notification with the result of processing the request is delivered. This notification does not include the annotated document. The queue must be writable from Idilia's account (AWS Account: 933890641136). Optional. Only used in batch mode. Example value: sqs://queue.amazonaws.com/2345433344/idilia-notif |
| requestId | Unique identifier supplied by the application to assist in correlating the server's responses with the application's requests. Echoed transparently by the server. Optional. Default: none. Example value: some-app-req-1234 |
| resultMime | MIME type of the annotated text. For a list of the valid MIME types, refer to Annotated Document MIME Types. Optional. Default: application/x-semdoc+xml. Example value: application/x-semdoc+xml |
| resultURI | URI for a location where the response can be deposited when available. It can be a URI in the form "ftp://site/dropbox/file" where an FTP server will accept anonymous uploads. Any valid CURL (http://curl.haxx.se/) URI can be specified. The URI can also be in the form "s3://bucket/file" where this corresponds to an Amazon Web Service S3 bucket with a write policy that allows writes from Idilia's account (AWS Account: 933890641136). The presence of this parameter triggers batch mode. Example value: ftp://somedomain.com/dropbox/request-10.json |
| text | Text to process. Must agree with the supplied textMime. Alternatively, the text to process can be supplied as a separate MIME-attached part. Mandatory if no MIME-attached part. Example value: Benjamin Franklin lived in Philadelphia, Pa. Tip: Better results will be obtained by not sending whole HTML pages containing unrelated text (e.g., sidebars, navigational menus, etc.). |
| textMime | MIME type of the supplied text when not supplying the text as a separate MIME-attached part. For a list of the recognized MIME types, refer to Source Document Text Mimes. Mandatory when using parameter text. Example value: text/plain; charset=utf8 |
| timeout | Timeout for returning a result, in units of tenths of a second (e.g., a value of 10 returns after 1 second). If the scheduling delay plus computation time exceeds this limit, the request aborts immediately and returns HTTP 504. Useful for real-time applications. Optional. Default: one hour. Example value: 10 |
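To make the parameter encoding concrete, the following Python sketch assembles a single-part GET URL from the parameters described above. The function name `build_disambiguate_url` and its argument names are hypothetical. It also converts a timeout given in seconds into the tenths-of-a-second unit the `timeout` parameter expects.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.idilia.com/1/text/disambiguate"


def build_disambiguate_url(text, text_mime="text/plain; charset=utf8",
                           fmt="json", timeout_seconds=None, **optional):
    """Return a request URL for a single-part, synchronous request.

    `optional` carries any other documented parameter by its API name,
    e.g. key=..., requestId=..., maxTokens=...
    """
    params = {"text": text, "textMime": text_mime}
    if timeout_seconds is not None:
        # The API counts in tenths of a second: 1 second -> 10.
        params["timeout"] = int(round(timeout_seconds * 10))
    # Skip parameters the caller left unset.
    params.update({k: v for k, v in optional.items() if v is not None})
    # urlencode percent-encodes values and turns spaces into "+".
    return f"{BASE_URL}.{fmt}?{urlencode(params)}"
```

For long texts a POST body or a multipart request is likely a better fit than a query string. This sketch covers only the short, single-part case.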
Response
The format of the response is determined by the extension provided on the request URL. The following are supported:
| Extension | Format |
|---|---|
| .json | A single part document containing a JSON object. Object key results is a JSON array containing one object with keys "data" and "mime". The value of "data" is a string that contains the document in the format specified by request parameter resultMime. |
| .xml | A single part document containing an XML object. Element result contains the document in the format specified by request parameter resultMime. |
| .mpjson | A multipart document where the first part is a JSON object formatted as described above but without a result element. The second part is a document with the format specified by resultMime. |
| .mpxml | A multipart document where the first part is an XML object formatted as described above but without a result element. The second part is a document with the format specified by resultMime. |
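As a sketch of how a client might unpack a `.json` response, the following Python function reads the `results` array described in the table. The function name `extract_document` is hypothetical, and the sample payload is hand-written from the description on this page, not captured from the service.

```python
import json


def extract_document(response_body):
    """Return (mime, data) for the annotated document in a .json response."""
    envelope = json.loads(response_body)
    # "results" is described as an array holding one object
    # with the keys "data" and "mime".
    first = envelope["results"][0]
    return first["mime"], first["data"]


# Hand-written sample; the exact field layout is an assumption.
sample = json.dumps({
    "status": 200,
    "requestId": "ac123",
    "results": [{"mime": "application/x-semdoc+xml", "data": "<docs/>"}],
})
```

The `data` value is itself a serialized document, so it still has to be parsed according to the returned `mime` type.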
The response elements are the following:
| Element | Description |
|---|---|
| errorMsg | This parameter is present when an error was encountered when processing the request (i.e., status != 200). It is a string containing a descriptive cause of the error encountered. |
| requestId | This parameter echoes the string provided as parameter requestId in the request. It can be used by applications to correlate requests and responses. |
| result | This element contains the annotated document and is present only in single part responses. Its format is determined by request parameter resultMime. |
| status | This parameter is a numeric return code for the request. The value corresponds to the HTTP return codes. For example, "200" indicates that the request was successfully processed. "400" is the code for a bad request. |
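One way a client could act on these elements in synchronous mode is sketched below in Python. The exception class `DisambiguateError` and the function `check_envelope` are hypothetical names. The sketch assumes the response has already been decoded into a dictionary keyed by the element names listed above.

```python
class DisambiguateError(Exception):
    """Raised when the response reports a status other than 200."""

    def __init__(self, status, message, request_id=None):
        super().__init__(f"status {status}: {message}")
        self.status = status
        self.request_id = request_id


def check_envelope(envelope):
    """Return the envelope unchanged if status is 200, else raise."""
    status = int(envelope.get("status", 0))
    if status != 200:
        # errorMsg is documented as present whenever status != 200.
        raise DisambiguateError(
            status, envelope.get("errorMsg", ""), envelope.get("requestId")
        )
    return envelope
```

Carrying `requestId` on the exception lets the application match a failure to the request that caused it.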
Request Authentication
All requests must be authenticated as described in Authentication. Two methods are available: simple and signed requests. The simple method requires adding the key parameter described on this page. Signed requests require adding a few HTTP headers (Content-MD5, Date, Authorization).
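For signed requests, two of the three headers can be computed with the Python standard library, as sketched below. This assumes the service expects the conventional forms: a base64-encoded MD5 digest of the body for Content-MD5 (RFC 1864) and an HTTP-date in GMT for Date. This page does not confirm either. The Authorization header is deliberately left out because its format is defined in Authentication, not here. The function names are hypothetical.

```python
import base64
import hashlib
from email.utils import formatdate


def content_md5(body: bytes) -> str:
    """Base64 of the raw 16-byte MD5 digest of the body (RFC 1864 form)."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def http_date(timestamp=None) -> str:
    """HTTP-date in GMT; uses the current time when timestamp is None."""
    return formatdate(timestamp, usegmt=True)


def partial_signed_headers(body: bytes) -> dict:
    # Authorization is intentionally missing: see Authentication for
    # how the signature is computed.
    return {"Content-MD5": content_md5(body), "Date": http_date()}
```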
Batch Mode
- A drop box for the response documents must be set up.
- The request message must contain parameter resultURI with a unique URI for each request.
- The server validates the requests and returns HTTP "accepted" (202) for valid ones.
- The server deposits at resultURI the same result as obtained in non-batch mode, i.e., a "Response" object with all the response elements.
- If the request also contains notificationURI, the server sends a message to that SQS queue. The message contains the same "Response" object but without the sense-annotated document.
- A client that sends too many documents too quickly receives an HTTP "service unavailable" (503) response with header "Retry-After" containing a delay in seconds. The client should wait for this delay and then resume transmission. The value of the delay varies constantly and is based on the current load.
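The throttling behaviour in the last point can be handled with a small retry loop. The Python sketch below is one possible shape, not a prescribed client. `submit_with_backoff` is a hypothetical name, `send` is any caller-supplied function that performs one request and returns `(status, headers, body)`, and the one-second fallback delay and five-attempt limit are assumptions.

```python
import time


def submit_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns something other than HTTP 503.

    `headers` is expected to be a dict keyed by exact header name.
    """
    for _ in range(max_attempts):
        status, headers, body = send()
        if status != 503:
            # 202 means the batch request was accepted.
            return status, body
        # Wait for the server-supplied delay (in seconds) before resuming.
        delay = float(headers.get("Retry-After", 1))
        sleep(delay)
    raise RuntimeError(f"server still busy after {max_attempts} attempts")
```

Because the delay reflects current load, the loop reads `Retry-After` afresh on every 503 and does not reuse an earlier value.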
Example
http(s)://api.idilia.com/1/text/disambiguate.xml?
key="your project value"&
requestId=ac123&
text=paris,+tx&
textMime=text/query;+charset=utf8
<DisambiguateResponse>
  <status>200</status>
  <requestId>ac123</requestId>
  <result mime="application/x-semdoc+xml">
    <docs len="3" num="1">
      <doc len="3">
        <sensesInfo>
          <sense csk="Paris/C91" fsk="Paris/N3" isne="1">
            <desc>
              Paris is a city located 98 miles (158 km) northeast of the Dallas-Fort Worth Metroplex in Lamar County, Texas, in the United States.
            </desc>
            <extRef>
              <dm>wikipedia</dm>
              <ref>24001</ref>
            </extRef>
            <neInfo><neT>location/N1</neT><neST>municipality/N1</neST></neInfo>
          </sense>
          <sense csk="Tx/C3" fsk="Tx/N3" isne="1">
            <desc>
              the second largest state; located in southwestern United States on the Gulf of Mexico.
            </desc>
            <extRef>
              <dm>wikipedia</dm>
              <ref>29810</ref>
            </extRef>
            <neInfo><neT>location/N1</neT><neST>state/N2</neST></neInfo>
          </sense>
        </sensesInfo>
        <queryConf>
          <confCorrectCoarsePresent>0.961</confCorrectCoarsePresent>
          <confCorrectCoarseMostProbable>0.631</confCorrectCoarseMostProbable>
          <confCorrectFinePresent>0.769</confCorrectFinePresent>
          <confCorrectFineMostProbable>0.677</confCorrectFineMostProbable>
        </queryConf>
        <para len="3" so="0">
          <sent len="3" so="0">
            <frag cccfmp="0.882" cccmp="0.887" ccfmp="0.874" len="1" so="0" sol="1">
              <txt>paris</txt>
              <cs pb="1.000" pc="0.875" sk="Paris/C91" so="0">
                <fs pb="1.000" pc="0.875" sk="Paris/N3" so="0"/>
              </cs>
              <dep c="0.791" dest="tx" destLc="noun" role="modifier" src="paris" srcLc="noun"/>
            </frag>
            <frag len="1" so="1">
              <txt>,</txt>
              <lc lc="punct"/>
            </frag>
            <frag cccfmp="0.930" cccmp="0.929" ccfmp="0.981" len="1" so="2">
              <txt>tx</txt>
              <cs pb="1.000" pc="0.883" sk="Tx/C3" so="2">
                <fs pb="1.000" pc="0.883" sk="Tx/N3" so="2"/>
              </cs>
              <dep c="0.791" dest="paris" destLc="noun" role="head" src="tx" srcLc="noun"/>
            </frag>
          </sent>
        </para>
      </doc>
    </docs>
  </result>
</DisambiguateResponse>
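A response like the one above can be walked with Python's standard `xml.etree.ElementTree`. The sketch below lists each text fragment with the sense keys attached to it. Reading `cs` and `fs` as coarse and fine senses is an assumption drawn from the `csk` and `fsk` attribute names in the example. The function name `list_senses` is hypothetical, and `SAMPLE` is a trimmed copy of the response shown above.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example response: only the fragments are kept.
SAMPLE = """<DisambiguateResponse>
<status>200</status>
<requestId>ac123</requestId>
<result mime="application/x-semdoc+xml">
<docs len="3" num="1"><doc len="3"><para len="3" so="0"><sent len="3" so="0">
<frag len="1" so="0"><txt>paris</txt>
<cs pb="1.000" pc="0.875" sk="Paris/C91" so="0">
<fs pb="1.000" pc="0.875" sk="Paris/N3" so="0"/></cs></frag>
<frag len="1" so="1"><txt>,</txt><lc lc="punct"/></frag>
<frag len="1" so="2"><txt>tx</txt>
<cs pb="1.000" pc="0.883" sk="Tx/C3" so="2">
<fs pb="1.000" pc="0.883" sk="Tx/N3" so="2"/></cs></frag>
</sent></para></doc></docs>
</result>
</DisambiguateResponse>"""


def list_senses(xml_text):
    """Return (text, cs sense key, fs sense key) for each annotated fragment."""
    root = ET.fromstring(xml_text)
    rows = []
    for frag in root.iter("frag"):
        cs = frag.find("cs")
        if cs is None:
            continue  # fragments such as punctuation carry no sense
        fs = cs.find("fs")
        fine_key = fs.get("sk") if fs is not None else None
        rows.append((frag.findtext("txt"), cs.get("sk"), fine_key))
    return rows
```

Calling `list_senses(SAMPLE)` yields one tuple for `paris` and one for `tx`. The comma fragment is skipped because it has no `cs` child.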