Extracting Object Content

Last updated: 2022-08-30 16:53:11

    Overview

    This document provides an overview of APIs and SDK code samples for object content extraction.

    API | Operation | Description
    SELECT Object Content | Extracting object content | Extracts content from a specified object.

    Extracting Object Content

    Feature description

    This API is used to extract content from a specific object.

    Method prototype

    select_object_content(Bucket, Key, Expression, ExpressionType, InputSerialization, OutputSerialization, RequestProgress=None, **kwargs)
    

    Sample request

    # -*- coding=utf-8
    from qcloud_cos import CosConfig
    from qcloud_cos import CosS3Client
    import sys
    import logging
    # In most cases, set the log level to INFO. If you need to debug, you can set it to DEBUG and the SDK will print the communication information of the client.
    logging.basicConfig(level=logging.INFO, stream=sys.stdout)
    # 1. Set user attributes such as `secret_id`, `secret_key`, and `region`. `Appid` has been removed from CosConfig and thus needs to be specified in `Bucket`, which is in the format of `BucketName-Appid`.
    secret_id = 'SecretId'     # Replace it with the actual SecretId, which can be viewed and managed at https://console.tencentcloud.com/cam/capi.
    secret_key = 'SecretKey'     # Replace it with the actual SecretKey, which can be viewed and managed at https://console.tencentcloud.com/cam/capi.
    region = 'ap-beijing'      # Replace it with the actual region, which can be viewed in the console at https://console.tencentcloud.com/cos5/bucket.
                             # For the list of regions supported by COS, visit https://www.tencentcloud.com/document/product/436/6224?from_cn_redirect=1.
    token = None               # Token is required for temporary keys but not permanent keys. For more information on how to generate and use a temporary key, visit https://www.tencentcloud.com/document/product/436/14048?from_cn_redirect=1.
    scheme = 'https'           # Specify whether to use HTTP or HTTPS protocol to access COS. This field is optional and is `https` by default.
    config = CosConfig(Region=region, SecretId=secret_id, SecretKey=secret_key, Token=token, Scheme=scheme)
    client = CosS3Client(config)
    response = client.select_object_content(
      Bucket='examplebucket-1250000000',
      Key='exampleobject',
      Expression='Select * from COSObject',
      ExpressionType='SQL',
      InputSerialization={
          'CompressionType': 'NONE',
          'JSON': {
              'Type': 'LINES'
          }
      },
      OutputSerialization={
          'CSV': {
              'RecordDelimiter': '\n'
          }
      }
    )
    # Get the `EventStream` instance encapsulated in the response
    event_stream = response['Payload']
    # Get all extraction results at once
    # Note that as `EventStream` fetches extraction results in a streaming manner, when you call the `get_select_result()` method again, an empty set will be returned.
    result = event_stream.get_select_result()
    print(result)
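
Because the sample requests CSV output with `'\n'` as the record delimiter, the extraction result can be split into rows with Python's standard `csv` module. The following is a minimal sketch, not part of the SDK: `parse_select_csv` is a hypothetical helper name, the sample bytes are made up for illustration, and the result is assumed to be UTF-8 encoded.

```python
import csv
import io


def parse_select_csv(result, encoding="utf-8"):
    """Split CSV extraction output into a list of rows (lists of strings)."""
    # Accept either bytes or str; bytes are decoded with the assumed encoding.
    text = result.decode(encoding) if isinstance(result, bytes) else result
    # newline="" lets the csv module handle record delimiters itself.
    reader = csv.reader(io.StringIO(text, newline=""))
    # Skip empty rows so a stray blank line does not produce an empty record.
    return [row for row in reader if row]


# Made-up sample standing in for the value returned by get_select_result().
sample = b"1,alice\n2,bob\n"
rows = parse_select_csv(sample)
print(rows)
```

With a real response, `result` from the sample above would be passed in place of `sample`.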
    

    Sample request with all parameters

    response = client.select_object_content(
      Bucket='examplebucket-1250000000',
      Key='exampleobject',
      Expression='Select * from COSObject',
      ExpressionType='SQL',
      InputSerialization={
          'CompressionType': 'GZIP',
          'JSON': {
              'Type': 'LINES'
          }
      },
      OutputSerialization={
          'CSV': {
              'RecordDelimiter': '\n'
          }
      },
      RequestProgress={
          'Enabled': 'FALSE'
      }
    )
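
The request above sets `CompressionType` to `'GZIP'`, which implies that the stored object is a gzip-compressed JSON Lines file. The following sketch shows one way such an object body could be prepared locally with the standard library before uploading it. The helper name `to_gzipped_json_lines` and the sample records are hypothetical, and the upload step itself is outside the scope of this example.

```python
import gzip
import json


def to_gzipped_json_lines(records):
    """Serialize records as JSON Lines and gzip-compress the result."""
    # One JSON document per line, matching the 'LINES' JSON type.
    lines = "".join(json.dumps(record) + "\n" for record in records)
    return gzip.compress(lines.encode("utf-8"))


# Made-up records for illustration only.
records = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
body = to_gzipped_json_lines(records)

# Round-trip check: decompressing and parsing each line yields the input records.
restored = [
    json.loads(line)
    for line in gzip.decompress(body).decode("utf-8").splitlines()
]
```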
    

    Parameter description

    Parameter | Description | Type | Required
    Bucket | Bucket name in the format of BucketName-APPID. | String | Yes
    Key | Object key, which uniquely identifies an object in a bucket. For example, if an object's access endpoint is examplebucket-1250000000.cos.ap-guangzhou.myqcloud.com/doc/pic.jpg, its key is doc/pic.jpg. | String | Yes
    Expression | SQL statement that defines the extraction to perform. | String | Yes
    ExpressionType | Expression type. This is an extension field; currently, only SQL expressions are supported, so the only accepted value is SQL. | String | Yes
    InputSerialization | Format of the object to extract. For more information, see SELECT Object Content. | Dict | Yes
    OutputSerialization | Output format of the extraction results. For more information, see SELECT Object Content. | Dict | Yes
    RequestProgress | Whether to return the query progress (QueryProgress). If this feature is enabled, COS Select returns the query progress periodically. | Dict | No
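
The required and optional parameters can be assembled in one place before calling `select_object_content`. The sketch below is illustrative only: `build_select_kwargs` is a hypothetical helper, not an SDK function, and it hard-codes the JSON Lines input and CSV output used in the samples in this document.

```python
def build_select_kwargs(bucket, key, expression,
                        compression="NONE", request_progress=None):
    """Build the keyword arguments for select_object_content."""
    # Bucket, Key, and Expression are required, so reject empty values early.
    for name, value in (("Bucket", bucket), ("Key", key), ("Expression", expression)):
        if not value:
            raise ValueError(name + " is required")
    kwargs = {
        "Bucket": bucket,
        "Key": key,
        "Expression": expression,
        "ExpressionType": "SQL",  # the only supported expression type
        "InputSerialization": {
            "CompressionType": compression,
            "JSON": {"Type": "LINES"},
        },
        "OutputSerialization": {"CSV": {"RecordDelimiter": "\n"}},
    }
    # RequestProgress is optional, so include it only when the caller supplies it.
    if request_progress is not None:
        kwargs["RequestProgress"] = request_progress
    return kwargs


kwargs = build_select_kwargs(
    "examplebucket-1250000000",
    "exampleobject",
    "Select * from COSObject",
    compression="GZIP",
    request_progress={"Enabled": "FALSE"},
)
# With a configured client: response = client.select_object_content(**kwargs)
```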

    Response description

    The extraction result is in dict format.

    {
      'Payload': EventStream()
    }
    

    The response contains a single key-value pair: the key is 'Payload' and the value is an EventStream instance, which encapsulates the extraction result of the object. You can call the next_event(), get_select_result(), or get_select_result_to_file() method to get the extraction result.
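
To process the result incrementally instead of reading it all at once, the events can be handled one by one. The sketch below rests on two assumptions that this document does not state and that should be checked against the SDK version in use: that the events are dicts, and that record data sits under `event['Records']['Payload']` as bytes. `collect_records` is a hypothetical helper, and the events here are stand-ins, not real SDK output.

```python
def collect_records(events):
    """Concatenate the payload bytes of every Records event."""
    data = b""
    for event in events:
        # Assumed shape: {'Records': {'Payload': b'...'}}; other events are skipped.
        if "Records" in event:
            data += event["Records"]["Payload"]
    return data


# Stand-in events for illustration only; real events would come from
# the EventStream instance in response['Payload'].
fake_events = [
    {"Records": {"Payload": b"1,alice\n"}},
    {"Records": {"Payload": b"2,bob\n"}},
    {"End": {}},
]
collected = collect_records(fake_events)
print(collected)
```

Handling events one at a time keeps memory use bounded for large results, since each payload can be written out or parsed as soon as it arrives.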
