Feature Description
General OCR (Optical Character Recognition) uses deep learning to recognize text in images and convert it into editable text. Typical scenarios include snapshot scanning, paper document digitization, and e-commerce ad moderation, where it can improve information processing efficiency.
Note:
This API uses a synchronous GET request and requires a signature. For signature details, see Request Signature.
Authorization Description
When the API is called by a sub-account, the ci:CreateOCRJob permission is required. For details, see Cloud Infinite actions.
Activating a Service
To use this feature, you must first enable Cloud Infinite and bind a bucket. For details, see Bind Bucket.
Use Limits
Before using this API, review the applicable restrictions. For details, see Usage Limits.
Fee Description
This API is a paid service, and fees are charged by Cloud Infinite. For billing details, see Content Recognition.
Request
Request sample
Original image stored in COS:
GET /<ObjectKey>?ci-process=OCR&type=general&language-type=zh&ispdf=true&pdf-pagenumber=1&isword=false&enable-word-polygon=false HTTP/1.1
Host: <BucketName-APPID>.cos.<Region>.myqcloud.com
Date: <GMT Date>
Authorization: <Auth String>
Original image from another link:
GET /?ci-process=OCR&detect-url=<detect-url>&type=general&language-type=zh&ispdf=true&pdf-pagenumber=1&isword=false&enable-word-polygon=false HTTP/1.1
Host: <BucketName-APPID>.cos.<Region>.myqcloud.com
Date: <GMT Date>
Authorization: <Auth String>
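The two request forms above differ only in the path and in whether detect-url is present. The following is a minimal Python sketch of assembling the request target with the standard library. The helper name build_ocr_target and its arguments are hypothetical, and computing the Authorization signature is out of scope here.

```python
from urllib.parse import quote, urlencode


def build_ocr_target(object_key=None, detect_url=None, options=None):
    """Return the request target (path plus query string) for an OCR request.

    Hypothetical helper: pass object_key for an image stored in COS,
    or detect_url for a publicly accessible image link.
    """
    if not object_key and not detect_url:
        raise ValueError("either object_key or detect_url is required")
    params = {"ci-process": "OCR"}
    if detect_url:
        # The link is processed instead of an object, so the path is just "/".
        params["detect-url"] = detect_url
        path = "/"
    else:
        # quote() leaves "/" unescaped by default, which keeps folder separators.
        path = "/" + quote(object_key)
    params.update(options or {})
    # quote_via=quote percent-encodes ":" and "/" inside parameter values.
    return path + "?" + urlencode(params, quote_via=quote)


print(build_ocr_target(object_key="folder/document.jpg",
                       options={"type": "general", "language-type": "zh"}))
print(build_ocr_target(detect_url="http://www.example.com/abc.jpg"))
```

The result is only the request line target; the Host, Date, and Authorization headers still have to be supplied as in the samples.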
Request parameters
Parameter | Description | Type | Required |
ObjectKey | Object file name, for example: folder/document.jpg | String | No |
ci-process | Cloud Infinite processing capability. For image OCR, the value is fixed as OCR. | String | Yes |
detect-url | A publicly accessible image link to process. If detect-url is not specified, the backend processes ObjectKey by default. If detect-url is specified, the backend processes that link, and ObjectKey is not needed. The link must be URL-encoded; for example, http://www.example.com/abc.jpg is encoded as http%3A%2F%2Fwww.example.com%2Fabc.jpg | String | No |
type | OCR recognition type. Valid values: general (general printed text recognition), accurate (printed text, high-precision version), efficient (printed text, simplified version), fast (printed text, high-speed version), handwriting (handwritten text recognition). Default value: general. | String | No |
language-type | Valid when type is general. Specifies the language to recognize. Both automatic language detection and explicitly selected languages are supported, and each language type also recognizes text mixed with English. Default value: zh (mixed Chinese and English). Valid values: zh: mixed Chinese and English; zh_rare: English, digits, rare Chinese characters, traditional Chinese characters, and special symbols; auto: automatic detection; mix: mixed languages; jap: Japanese; kor: Korean; spa: Spanish; fre: French; ger: German; por: Portuguese; may: Malay; rus: Russian; ita: Italian; hol: Dutch; swe: Swedish; fin: Finnish; nor: Norwegian; hun: Hungarian; tha: Thai; hi: Hindi | String | No |
ispdf | Valid when type is general or fast. Indicates whether PDF recognition is enabled. Valid values are true and false. Default value is false. Once enabled, it can simultaneously support image and PDF recognition. | Boolean | No |
pdf-pagenumber | Valid when type is general or fast. Indicates the corresponding page number of the PDF page to be recognized. Only supports single page recognition for PDF. Valid when the uploaded file is PDF and the ispdf parameter value is true. Default value is 1. | Integer | No |
isword | Valid when type is general or accurate. Indicates whether to return character information after recognition. Valid values are true and false. Default is false. | Boolean | No |
enable-word-polygon | Valid when type is handwriting. Indicates whether to output four-point positioning coordinates for single characters. Valid values are true and false. Default is false. | Boolean | No |
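Several optional parameters in the table above take effect only for certain values of type. The Python sketch below transcribes those rules into a lookup table and flags parameters that, according to the table, would not apply to the chosen type. The function name ineffective_params is hypothetical, and the page does not say whether the service ignores or rejects such parameters, so this check is purely advisory.

```python
# Recognition types each optional parameter applies to,
# transcribed from the request parameter table.
APPLIES_TO = {
    "language-type": {"general"},
    "ispdf": {"general", "fast"},
    "pdf-pagenumber": {"general", "fast"},
    "isword": {"general", "accurate"},
    "enable-word-polygon": {"handwriting"},
}
VALID_TYPES = {"general", "accurate", "efficient", "fast", "handwriting"}


def ineffective_params(params):
    """Return the sorted names of parameters that do not apply to the chosen type."""
    ocr_type = params.get("type", "general")  # general is the documented default
    if ocr_type not in VALID_TYPES:
        raise ValueError(f"unknown type: {ocr_type!r}")
    return sorted(
        name for name in params
        if name in APPLIES_TO and ocr_type not in APPLIES_TO[name]
    )


print(ineffective_params({"type": "handwriting", "isword": "true", "ispdf": "true"}))
```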
Request header
Common Headers
Non-common Headers
This request has no special request header information.
Request body
This request has no request body.
Response
Response Headers
Common Response Headers
This response contains common response headers. For details, see the Common Response Headers document.
Special Response Headers
This response has no special response headers.
Response Body
The response body is returned as application/xml. An example including the complete node data is shown below:
<Response>
<TextDetections>
<DetectedText></DetectedText>
<Confidence></Confidence>
<Polygon>
<X></X>
<Y></Y>
</Polygon>
<ItemPolygon>
<X></X>
<Y></Y>
<Width></Width>
<Height></Height>
</ItemPolygon>
<Words>
<Confidence></Confidence>
<Character></Character>
<WordCoordPoint>
<WordCoordinate>
<X></X>
<Y></Y>
</WordCoordinate>
</WordCoordPoint>
</Words>
</TextDetections>
<Language></Language>
<Angel></Angel>
<PdfPageSize></PdfPageSize>
<RequestId></RequestId>
</Response>
The nodes are described as follows:
Node Name (Keyword) | Parent Node | Description | Type |
Response | None | Container for the response results | Container |
The content of the Response node
Node Name (Keyword) | Parent Node | Description | Type |
TextDetections | Response | Detected text information, including the text line content, confidence, text line coordinates, and the text line coordinates after rotation correction | Container |
Language | Response | Detected language type. For the supported language types, see the description of the language-type parameter. | String |
Angel | Response | Image rotation angle in degrees, with the horizontal direction of the text as 0°. Clockwise is positive and counterclockwise is negative. | Float |
PdfPageSize | Response | Total number of pages when the input is a PDF. Default value: 0. | Integer |
RequestId | Response | Unique request ID, returned for each request. RequestId is required for locating a problem. | String |
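The Angel description states that clockwise rotation is positive and counterclockwise is negative, while the sample response later on this page reports 359.99. Assuming such a value is equivalent to a small counterclockwise rotation, a caller could normalize it as in the Python sketch below. The function name signed_angle is hypothetical, and this interpretation is an assumption rather than documented behavior.

```python
def signed_angle(angle):
    """Map an angle in degrees to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


# 359.99 degrees is treated as roughly 0.01 degrees counterclockwise.
print(round(signed_angle(359.99), 2))
```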
The content of the TextDetections node
Node Name (Keyword) | Parent Node | Description | Type |
DetectedText | TextDetections | Recognized text line content | String |
Confidence | TextDetections | Confidence, ranging from 0 to 100 | Integer |
Polygon | TextDetections | Text line coordinates, represented by four vertex coordinates. Note: This field may return null, indicating that no valid value was obtained. | Container |
ItemPolygon | TextDetections | Pixel coordinates of the text line in the image after rotation correction, represented as (top-left x, top-left y, width, height) | Container |
Words | TextDetections | Recognized character information, including the character (Character) and its confidence (Confidence). Supported recognition types: general, accurate | Container |
WordPolygon | TextDetections | Array of character coordinates, represented by four vertex coordinates. Note: This field may return null, indicating that no valid value was obtained. Supported recognition types: handwriting | Container |
Content of the Polygon node
Node Name (Keyword) | Parent Node | Description | Type |
X | Polygon | horizontal coordinate | Integer |
Y | Polygon | vertical coordinate | Integer |
Content of the ItemPolygon node
Node Name (Keyword) | Parent Node | Description | Type |
X | ItemPolygon | top-left x | Integer |
Y | ItemPolygon | top-left y | Integer |
Width | ItemPolygon | width | Integer |
Height | ItemPolygon | height | Integer |
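ItemPolygon describes a rectangle by its top-left corner plus width and height, whereas Polygon lists four vertices. When both need to be handled uniformly, the rectangle can be expanded into corner points, as in the Python sketch below. The function name item_polygon_corners and the clockwise ordering are choices made for this illustration. The resulting points are in the rotation-corrected image, not necessarily in the original image.

```python
def item_polygon_corners(x, y, width, height):
    """Expand (top-left x, top-left y, width, height) into four corner points,
    ordered clockwise starting from the top-left corner."""
    return [
        (x, y),                    # top-left
        (x + width, y),            # top-right
        (x + width, y + height),   # bottom-right
        (x, y + height),           # bottom-left
    ]


# Values taken from the first ItemPolygon in the sample response on this page.
print(item_polygon_corners(140, 167, 123, 64))
```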
The content of the Words node
Node Name (Keyword) | Parent Node | Description | Type |
Confidence | Words | Confidence, ranging from 0 to 100 | Integer |
Character | Words | Recognized character | String |
WordCoordPoint | Words | Four-point coordinates of the character in the original image. Supported recognition types: general, accurate | Container |
The content of the WordCoordPoint node
Node Name (Keyword) | Parent Node | Description | Type |
WordCoordinate | WordCoordPoint | The coordinates of the single character in the original image, represented by four vertex coordinates, starting from the top-left corner and returned clockwise | Container |
The content of the WordCoordinate node
Node Name (Keyword) | Parent Node | Description | Type |
X | WordCoordinate | horizontal coordinate | Integer |
Y | WordCoordinate | vertical coordinate | Integer |
The content of the WordPolygon node
Node Name (Keyword) | Parent Node | Description | Type |
LeftTop | WordPolygon | Top-left corner coordinate | Container |
RightTop | WordPolygon | Top-right corner coordinate | Container |
RightBottom | WordPolygon | Bottom-right corner coordinate | Container |
LeftBottom | WordPolygon | Bottom-left corner coordinate | Container |
The content of the LeftTop, RightTop, RightBottom, and LeftBottom nodes
Node Name (Keyword) | Parent Node | Description | Type |
X | LeftTop / RightTop / RightBottom / LeftBottom | Horizontal coordinate | Integer |
Y | LeftTop / RightTop / RightBottom / LeftBottom | Vertical coordinate | Integer |
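The node tables above map naturally onto a small parsing routine. The following Python sketch reads a response body with the standard library xml.etree.ElementTree and collects, for each TextDetections node, the text, confidence, polygon vertices, and recognized characters. The function name parse_text_lines, the dictionary keys, and the short sample body are made up for illustration. Polygon entries with missing coordinates are skipped because the table notes that the field may be null.

```python
import xml.etree.ElementTree as ET


def _point(node):
    """Return (x, y) for a coordinate node, or None if a value is missing."""
    x, y = node.findtext("X"), node.findtext("Y")
    return (int(x), int(y)) if x and y else None


def parse_text_lines(xml_text):
    """Return one dict per TextDetections node in the response body."""
    root = ET.fromstring(xml_text)
    lines = []
    for det in root.findall("TextDetections"):
        points = [_point(p) for p in det.findall("Polygon")]
        lines.append({
            "text": det.findtext("DetectedText", default=""),
            # Only the direct child Confidence is read, not the per-character one.
            "confidence": int(det.findtext("Confidence", default="0") or 0),
            "polygon": [pt for pt in points if pt is not None],
            "characters": [w.findtext("Character", default="")
                           for w in det.findall("Words")],
        })
    return lines


# Made-up, shortened body that follows the documented node layout.
sample = """<Response>
  <TextDetections>
    <DetectedText>Hello</DetectedText>
    <Confidence>99</Confidence>
    <Polygon><X>140</X><Y>167</Y></Polygon>
    <Polygon><X>263</X><Y>167</Y></Polygon>
    <Words><Character>H</Character><Confidence>98</Confidence></Words>
  </TextDetections>
</Response>"""

print(parse_text_lines(sample))
```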
Error Codes
For common error messages, see the Error Codes document.
Examples
Recognizing an image stored in COS
Request
GET /<ObjectKey>?ci-process=OCR&type=general&language-type=zh&ispdf=true&isword=true HTTP/1.1
Authorization: q-sign-algorithm=sha1&q-ak=**********************************&q-sign-time=1497530202;1497610202&q-key-time=1497530202;1497610202&q-header-list=&q-url-param-list=&q-signature=**************************************
Host: bucket-1250000000.cos.ap-beijing.myqcloud.com
Response
HTTP/1.1 200 OK
Content-Type: application/xml
Content-Length: 414641
Date: Thu, 15 Jun 2017 12:37:29 GMT
Server: tencent-ci
x-cos-request-id: NTk0MjdmODlfMjQ4OGY3XzYzYzhfMjc=
<Response>
<Angel>359.99</Angel>
<Language>mix</Language>
<PdfPageSize>0</PdfPageSize>
<RequestId>NTk0MjdmODlfMjQ4OGY3XzYzYzhfMjc=</RequestId>
<TextDetections>
<Confidence>99</Confidence>
<DetectedText>Hello</DetectedText>
<ItemPolygon>
<Height>64</Height>
<Width>123</Width>
<X>140</X>
<Y>167</Y>
</ItemPolygon>
<Polygon>
<X>140</X>
<Y>167</Y>
</Polygon>
<Polygon>
<X>263</X>
<Y>167</Y>
</Polygon>
<Polygon>
<X>263</X>
<Y>231</Y>
</Polygon>
<Polygon>
<X>140</X>
<Y>231</Y>
</Polygon>
<Words>
<Character>You</Character>
<Confidence>99</Confidence>
<WordCoordPoint>
<WordCoordinate>
<X>212</X>
<Y>167</Y>
</WordCoordinate>
<WordCoordinate>
<X>341</X>
<Y>167</Y>
</WordCoordinate>
<WordCoordinate>
<X>341</X>
<Y>231</Y>
</WordCoordinate>
<WordCoordinate>
<X>212</X>
<Y>231</Y>
</WordCoordinate>
</WordCoordPoint>
</Words>
<Words>
<Character>Good</Character>
<Confidence>99</Confidence>
<WordCoordPoint>
<WordCoordinate>
<X>341</X>
<Y>167</Y>
</WordCoordinate>
<WordCoordinate>
<X>263</X>
<Y>167</Y>
</WordCoordinate>
<WordCoordinate>
<X>263</X>
<Y>231</Y>
</WordCoordinate>
<WordCoordinate>
<X>341</X>
<Y>230</Y>
</WordCoordinate>
</WordCoordPoint>
</Words>
</TextDetections>
<TextDetections>
<Confidence>99</Confidence>
<DetectedText>Goodbye</DetectedText>
<ItemPolygon>
<Height>43</Height>
<Width>245</Width>
<X>526</X>
<Y>1444</Y>
</ItemPolygon>
<Polygon>
<X>526</X>
<Y>1444</Y>
</Polygon>
<Polygon>
<X>771</X>
<Y>1444</Y>
</Polygon>
<Polygon>
<X>771</X>
<Y>1487</Y>
</Polygon>
<Polygon>
<X>526</X>
<Y>1487</Y>
</Polygon>
<Words>
<Character>Again</Character>
<Confidence>99</Confidence>
<WordCoordPoint>
<WordCoordinate>
<X>564</X>
<Y>1444</Y>
</WordCoordinate>
<WordCoordinate>
<X>608</X>
<Y>1444</Y>
</WordCoordinate>
<WordCoordinate>
<X>608</X>
<Y>1487</Y>
</WordCoordinate>
<WordCoordinate>
<X>564</X>
<Y>1487</Y>
</WordCoordinate>
</WordCoordPoint>
</Words>
<Words>
<Character>See</Character>
<Confidence>99</Confidence>
<WordCoordPoint>
<WordCoordinate>
<X>608</X>
<Y>1444</Y>
</WordCoordinate>
<WordCoordinate>
<X>641</X>
<Y>1444</Y>
</WordCoordinate>
<WordCoordinate>
<X>641</X>
<Y>1487</Y>
</WordCoordinate>
<WordCoordinate>
<X>608</X>
<Y>1487</Y>
</WordCoordinate>
</WordCoordPoint>
</Words>
</TextDetections>
</Response>
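A response like the one above is often reduced to plain text. The Python sketch below orders the detected lines top to bottom, then left to right, using the ItemPolygon position, and joins their DetectedText values. The function name plain_text and the min_confidence argument are hypothetical, and this simple ordering is an assumption that may not suit multi-column or rotated layouts.

```python
import xml.etree.ElementTree as ET


def plain_text(xml_text, min_confidence=0):
    """Join DetectedText values in a simple top-to-bottom, left-to-right order."""
    root = ET.fromstring(xml_text)
    rows = []
    for det in root.findall("TextDetections"):
        if int(det.findtext("Confidence", default="0") or 0) < min_confidence:
            continue  # drop lines below the caller's confidence threshold
        box = det.find("ItemPolygon")
        top = int(box.findtext("Y", default="0") or 0) if box is not None else 0
        left = int(box.findtext("X", default="0") or 0) if box is not None else 0
        rows.append((top, left, det.findtext("DetectedText", default="")))
    # Tuples sort by top first, then left, which gives the reading order.
    return "\n".join(text for _, _, text in sorted(rows))
```

Applied to the sample response on this page, the function would place Hello (Y = 167) before Goodbye (Y = 1444).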