text/disambiguate

Obtain a sense annotated document.

Resource URL

GET/POST http(s)://api.idilia.com/1/text/disambiguate.{format}

Description

Send a request to disambiguate a document. The request can be used in synchronous or batch mode. In synchronous mode, the annotated document is returned in the response, either embedded in the response object or as a distinct part in a multipart response. In batch mode, the request returns immediately and the annotated document is deposited at the specified URI once it has been generated.

The text to disambiguate can either be provided as parameter text or as a distinct part in a multipart request. HTTP method POST is used when issuing a multipart request. Either GET or POST may be used for single part requests.

After responding to an HTTP 1.1 request, the server keeps the connection open for 5 seconds. Another request can be issued on the same connection within this timeout.

Request

The request is an HTTP GET or POST with the following query parameters:

disambiguationRecipe

Recipe identifier for the linguistic processing of the request. Please refer to Sense Analysis Recipes for available values. Not all values may be supported by a specific configuration.

Optional. Default: defaultPrecision

Example value : highestPrecision

key

When using simple authentication, this parameter contains the concatenation of your access key and private key. See Authentication for more details.

Optional. Default: none

Example value : Idi123rlzKT90yoUavrzoRbkdHgsoZTiYR2qaYA8SxA

maxTokens

Maximum number of tokens to process in the source text. The text is tokenized up to the next paragraph boundary after this number of tokens is reached. Only tokenized text is annotated and returned.

Optional. Default: 1000

Example value: 2500

notificationURI

URI in the form "sqs://queue" in Amazon Simple Queue Service (SQS) where a notification with the result of processing the request is delivered. This notification does not include the annotated document. Queue must be writable from Idilia's account (AWS Account: 933890641136).

Optional. Only used in batch mode.

Example value : sqs://queue.amazonaws.com/2345433344/idilia-notif

requestId

Unique identifier supplied by the application to assist correlating the server's responses with the application's requests. Echoed transparently by the server.

Optional. Default: none

Example value : some-app-req-1234

resultMime

MIME type of the annotated text. For a list of the valid MIME types, refer to Annotated Document MIME Types.

Optional. Default: application/x-semdoc+xml

Example value : application/x-semdoc+xml

resultURI

URI for a location where the response can be deposited when available. It can be a URI in the form "ftp://site/dropbox/file" where an FTP server will accept anonymous uploads. Any valid CURL (http://curl.haxx.se/) URI can be specified. The URI can also be in the form "s3://bucket/file" where this corresponds to an Amazon Web Service S3 bucket with a write policy that allows writes from Idilia's account (AWS Account: 933890641136).

The presence of this parameter triggers batch mode.

Example value : ftp://somedomain.com/dropbox/request-10.json

text

Text to process. Must agree with the supplied textMime. Alternatively the text to process can be supplied as a separate MIME-attached part.

Mandatory if no MIME-attached part.

Example value : Benjamin Franklin lived in Philadelphia, Pa.

Tip: Better results will be obtained by not sending whole HTML pages containing non-related text (e.g., sidebars, navigational menus, etc.).

textMime

MIME type of the supplied text when not supplying the text as a separate MIME-attached part. For a list of the recognized MIME types, refer to Source Document Text Mimes.

Mandatory when using parameter text.

Example value : text/plain; charset=utf8

timeout

Timeout in units of tenths of a second for returning a result (i.e., a value of 10 returns after 1 second). If the scheduling delay + computation time exceeds this limit, the request aborts immediately and returns HTTP 504. Useful for real-time applications.

Optional. Default: one hour.

Example value : 10
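
As an illustration of how the parameters above combine into a single part request, here is a minimal Python sketch that only builds the request URL and does not send it. The helper name build_request_url, the choice of HTTPS, and the placeholder key are assumptions made for the example; the parameter names and the conversion of the timeout to tenths of a second follow the descriptions above.

```python
from urllib.parse import urlencode

# Host and path come from the Resource URL above; HTTPS is chosen here.
BASE_URL = "https://api.idilia.com/1/text/disambiguate"


def build_request_url(text, key, fmt="json",
                      text_mime="text/plain; charset=utf8",
                      timeout_seconds=None, **extra_params):
    """Build a single part request URL (hypothetical helper, not part of the API)."""
    params = {"key": key, "text": text, "textMime": text_mime}
    if timeout_seconds is not None:
        # The timeout parameter is expressed in tenths of a second.
        params["timeout"] = round(timeout_seconds * 10)
    # Any other documented parameter (requestId, maxTokens, ...) passes through.
    params.update(extra_params)
    return f"{BASE_URL}.{fmt}?{urlencode(params)}"


url = build_request_url(
    "Benjamin Franklin lived in Philadelphia, Pa.",
    key="YOUR_KEY",  # placeholder, not a real key
    timeout_seconds=1,
    requestId="some-app-req-1234",
)
print(url)
```

urlencode percent-encodes the values, so the text and the MIME type can be passed as ordinary strings.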

Note: A multipart request must follow the format of RFC2045 and be of MIME multipart/mixed. The first part contains the request parameters in a MIME attachment of type application/x-www-form-urlencoded. The second part contains the document to process and must have one of the MIME types described in Source Document MIME Types. This part may be encoded with "gzip" (RFC1952). When this is the case, header "Content-Type: gzip" must be set on the document part.
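
The multipart layout described in this note could be assembled roughly as follows. This is a sketch under stated assumptions: the helper name build_multipart_request is hypothetical, the boundary is generated locally, the optional gzip encoding is left out, and the result would still need to be POSTed with an HTTP client of your choice.

```python
import uuid
from urllib.parse import urlencode


def build_multipart_request(params, document, document_mime):
    """Return (headers, body) for a two-part multipart/mixed request.

    Hypothetical helper: part 1 carries the form-encoded request parameters,
    part 2 carries the document bytes with the given MIME type.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    body = b"\r\n".join([
        delimiter,
        b"Content-Type: application/x-www-form-urlencoded",
        b"",
        urlencode(params).encode("ascii"),
        delimiter,
        b"Content-Type: " + document_mime.encode("ascii"),
        b"",
        document,
        delimiter + b"--",  # closing delimiter
        b"",
    ])
    headers = {"Content-Type": f'multipart/mixed; boundary="{boundary}"'}
    return headers, body


headers, body = build_multipart_request(
    {"key": "YOUR_KEY", "requestId": "some-app-req-1234"},  # placeholder values
    "Benjamin Franklin lived in Philadelphia, Pa.".encode("utf-8"),
    "text/plain; charset=utf8",
)
print(headers["Content-Type"])
```

Parameter text is omitted from the first part because the document travels in the second part.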

Response

The format of the response is determined by the extension provided on the request URL. The following are supported:

Extension   Format
.json       A single part document containing a JSON object. Object key results is a JSON array containing one object with keys "data" and "mime". The value of "data" is a string that contains the document in the format specified by request parameter resultMime.
.xml        A single part document containing an XML object. Element result contains the document in the format specified by request parameter resultMime.
.mpjson     A multipart document where the first part is a JSON object formatted as described above but without a result element. The second part is a document with the format specified by resultMime.
.mpxml      A multipart document where the first part is an XML object formatted as described above but without a result element. The second part is a document with the format specified by resultMime.

Note: A multipart response follows the format of RFC2045. The first part has MIME type application/json or application/xml and the second part has the MIME type requested using parameter resultMime.

The response elements are the following:

errorMsg    This parameter is present when an error was encountered when processing the request (i.e., status != 200). It is a string containing a descriptive cause of the error encountered.
requestId   This parameter echoes the string provided as parameter requestId in the request. It can be used by applications to correlate requests and responses.
result      This element contains the annotated document and is present only in single part responses. Its format is determined by request parameter resultMime.
status      This parameter is a numeric return code for the request. The value corresponds to the HTTP return codes. For example, "200" indicates that the request was successfully processed. "400" is the code for a bad request.
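
A client might read these elements from a single part .json response along the following lines. The sketch assumes the top-level keys are named status, requestId, errorMsg and results exactly as described on this page. The sample response is hypothetical and written by hand for the example, not captured from the service.

```python
import json


def extract_document(response_text):
    """Return (mime, data) from a .json response, or raise on an error status.

    Assumes the key names described above; extract_document is a
    hypothetical helper, not part of the API.
    """
    response = json.loads(response_text)
    status = int(response.get("status", 0))
    if status != 200:
        raise RuntimeError(
            f"request {response.get('requestId')!r} failed with status "
            f"{status}: {response.get('errorMsg')}"
        )
    results = response.get("results") or []
    if not results:
        raise ValueError("response contains no annotated document")
    return results[0]["mime"], results[0]["data"]


# Hypothetical sample, shaped after the description of the .json format.
sample = json.dumps({
    "status": 200,
    "requestId": "some-app-req-1234",
    "results": [{"mime": "application/x-semdoc+xml", "data": "<docs/>"}],
})
mime, data = extract_document(sample)
print(mime, data)
```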

Request Authentication

All requests must be authenticated as described in Authentication. Two methods are available: simple and signed requests. The simple method requires the addition of the key parameter as described on this page. Signed requests require the addition of a few HTTP headers (Content-MD5, Date, Authorization).

Batch Mode

Batch mode is useful for applications where several documents must be processed. In this mode the client can quickly send all the requests over a single connection and the server uploads the results when available. It works like this:
  1. A drop box for the response documents must be set up.
  2. The request message must contain parameter resultURI with a unique URI for each request.
  3. The server validates the requests and returns HTTP "accepted" (202) for valid ones.
  4. The server deposits at resultURI the same result as obtained in non-batch mode, i.e., a "Response" object with all the response elements.
  5. If the request also contains notificationURI, the server sends a message to that SQS queue. The message contains the same "Response" object but without the sense annotated document.
  6. A client that sends too many documents too quickly receives an HTTP "service unavailable" (503) response with header "Retry-After" containing a delay in seconds. The client should wait for this delay and then resume transmission. The value of the delay varies constantly and is based on the current load.
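
The client side of steps 3 and 6 can be sketched as a loop that treats 202 as accepted and waits for the Retry-After delay on 503. The transport is deliberately left abstract: send is a hypothetical callable supplied by the application that issues one request and returns the HTTP status code and response headers. The retry limit is also an assumption of the example, and the stub at the end only simulates a busy server.

```python
import time


def submit_batch(requests, send, sleep=time.sleep, max_attempts=5):
    """Submit batch requests, backing off when the server answers 503.

    send(request) -> (status_code, headers) is supplied by the caller
    (hypothetical interface; any HTTP client can be wrapped this way).
    """
    accepted = []
    for request in requests:
        for _ in range(max_attempts):
            status, headers = send(request)
            if status == 202:  # accepted; the result is deposited at resultURI later
                accepted.append(request)
                break
            if status == 503:  # too many documents sent too quickly
                sleep(float(headers.get("Retry-After", 1)))
                continue
            raise RuntimeError(f"unexpected status {status} for {request!r}")
        else:
            raise RuntimeError(f"gave up on {request!r} after {max_attempts} attempts")
    return accepted


# Stub transport simulating one 503 before every acceptance.
calls = {}

def fake_send(request):
    calls[request] = calls.get(request, 0) + 1
    if calls[request] == 1:
        return 503, {"Retry-After": "2"}
    return 202, {}

waits = []
done = submit_batch(["req-1", "req-2"], fake_send, sleep=waits.append)
print(done, waits)
```

Passing sleep as a parameter keeps the sketch testable without real delays; a production client would leave the default.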

Example

http(s)://api.idilia.com/1/text/disambiguate.xml?
  key="your project value"&
  requestId=ac123&
  text=paris,+tx&
  textMime=text/query;+charset=utf8

<DisambiguateResponse>
 <status>200</status>
 <requestId>ac123</requestId>
 <result mime="application/x-semdoc+xml">
  <docs len="3" num="1">
   <doc len="3">
    <sensesInfo>
     <sense csk="Paris/C91" fsk="Paris/N3" isne="1">
      <desc>
       Paris is a city located 98 miles (158 km) northeast of the Dallas-Fort Worth Metroplex in Lamar County, Texas, in the United States.
      </desc>
      <extRef>
       <dm>wikipedia</dm>
       <ref>24001</ref>
      </extRef>
      <neInfo><neT>location/N1</neT><neST>municipality/N1</neST></neInfo>
     </sense>
     <sense csk="Tx/C3" fsk="Tx/N3" isne="1">
      <desc>
       the second largest state; located in southwestern United States on the Gulf of Mexico.
      </desc>
      <extRef>
       <dm>wikipedia</dm>
       <ref>29810</ref>
      </extRef>
      <neInfo><neT>location/N1</neT><neST>state/N2</neST></neInfo>
     </sense>
    </sensesInfo>
    <queryConf>
     <confCorrectCoarsePresent>0.961</confCorrectCoarsePresent>
     <confCorrectCoarseMostProbable>0.631</confCorrectCoarseMostProbable>
     <confCorrectFinePresent>0.769</confCorrectFinePresent>
     <confCorrectFineMostProbable>0.677</confCorrectFineMostProbable>
    </queryConf>
    <para len="3" so="0">
     <sent len="3" so="0">
      <frag cccfmp="0.882" cccmp="0.887" ccfmp="0.874" len="1" so="0" sol="1">
       <txt>paris</txt>
       <cs pb="1.000" pc="0.875" sk="Paris/C91" so="0">
        <fs pb="1.000" pc="0.875" sk="Paris/N3" so="0"/>
       </cs>
       <dep c="0.791" dest="tx" destLc="noun" role="modifier" src="paris" srcLc="noun"/>
      </frag>
      <frag len="1" so="1">
       <txt>,</txt>
       <lc lc="punct"/>
      </frag>
      <frag cccfmp="0.930" cccmp="0.929" ccfmp="0.981" len="1" so="2">
       <txt>tx</txt>
       <cs pb="1.000" pc="0.883" sk="Tx/C3" so="2">
        <fs pb="1.000" pc="0.883" sk="Tx/N3" so="2"/>
       </cs>
       <dep c="0.791" dest="paris" destLc="noun" role="head" src="tx" srcLc="noun"/>
      </frag>
     </sent>
    </para>
   </doc>
  </docs>
 </result>
</DisambiguateResponse>
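
To show how an application might consume a response like the one above, the following sketch lists the senses found in the sensesInfo element using Python's standard XML parser. The sample is a shortened copy of the example response with the para element removed. The helper name list_senses and the assumption that errorMsg appears as a child element on failure are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example response above (para element removed).
SAMPLE = """<DisambiguateResponse>
 <status>200</status>
 <requestId>ac123</requestId>
 <result mime="application/x-semdoc+xml">
  <docs len="3" num="1"><doc len="3"><sensesInfo>
   <sense csk="Paris/C91" fsk="Paris/N3" isne="1">
    <desc>Paris is a city located 98 miles (158 km) northeast of the Dallas-Fort Worth Metroplex in Lamar County, Texas, in the United States.</desc>
    <neInfo><neT>location/N1</neT><neST>municipality/N1</neST></neInfo>
   </sense>
   <sense csk="Tx/C3" fsk="Tx/N3" isne="1">
    <desc>the second largest state; located in southwestern United States on the Gulf of Mexico.</desc>
    <neInfo><neT>location/N1</neT><neST>state/N2</neST></neInfo>
   </sense>
  </sensesInfo></doc></docs>
 </result>
</DisambiguateResponse>"""


def list_senses(response_xml):
    """Return one dict per sense element (hypothetical helper)."""
    root = ET.fromstring(response_xml)
    if root.findtext("status") != "200":
        raise RuntimeError(root.findtext("errorMsg") or "request failed")
    return [
        {
            "fsk": sense.get("fsk"),                  # fine sense key
            "csk": sense.get("csk"),                  # coarse sense key
            "type": sense.findtext("neInfo/neT"),     # named-entity type
            "subtype": sense.findtext("neInfo/neST"),
            "description": (sense.findtext("desc") or "").strip(),
        }
        for sense in root.iter("sense")
    ]


for sense in list_senses(SAMPLE):
    print(sense["fsk"], sense["type"], sense["subtype"])
```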

Copyright 2013 Idilia Inc.