Configuring Log Collection via YAML

Last updated: 2023-03-14 18:19:11
    This document describes how to use the LogConfig CRD to configure the log collection feature of a TKE Serverless cluster via YAML.

    Prerequisites

    Log in to the TKE console and enable the log collection feature for the serverless cluster. For more information, see Enabling Log Collection.

    Creating the CRD

    To create a collection configuration, you only need to define a LogConfig CRD object. The collection component watches for changes to LogConfig objects, updates the corresponding CLS log topics accordingly, and sets the bound machine group. The CRD format is as follows:
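
    Putting the pieces together, a minimal LogConfig object might look like the following sketch (the resource name and topic ID are placeholders); the `clsDetail` and `inputDetail` fields are described in the sections below. Once saved to a file, the object can be applied with `kubectl apply -f <file>.yaml` like any other Kubernetes resource.

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    metadata:
      name: test-log-config ## Hypothetical resource name
    spec:
      clsDetail: ## Target CLS log topic; see "clsDetail Field Description"
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
        logType: minimalist_log
      inputDetail: ## Collection source; see "inputDetail Field Description"
        type: container_stdout
        containerStdout:
          namespace: default
          allContainers: true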

    clsDetail Field Description

    Note:
    Once specified, the log topic of a collection configuration cannot be modified.
    If the collection type is "Container File Path", the corresponding path cannot be a symbolic link. Otherwise, the actual path the link points to will not exist in the collector's container, and log collection will fail.
    clsDetail:
      ## To create a log topic automatically, specify both the logset name and the topic name.
      logsetName: test ## CLS logset name. A logset with this name is created automatically if it does not exist; if it exists, the log topic is created under it.
      topicName: test ## CLS log topic name. A log topic with this name is created automatically if it does not exist.

      ## To use an existing logset and log topic, specify their IDs. If the logset is specified but the log topic is not, a log topic is created automatically under the logset.
      logsetId: xxxxxx-xx-xx-xx-xxxxxxxx ## ID of the CLS logset. The logset must be created in CLS in advance.
      topicId: xxxxxx-xx-xx-xx-xxxxxxxx ## ID of the CLS log topic. The log topic must be created in CLS in advance and must not be occupied by another collection configuration.

      logType: json_log ## Log parsing format. json_log: JSON format. delimiter_log: separator-based format. minimalist_log: full text in a single line. multiline_log: full text in multi lines. fullregex_log: full regex format. multiline_fullregex_log: multi-line full regex format. Default value: minimalist_log
      logFormat: xxx ## Log formatting method
      period: 30 ## Lifecycle in days. Value range: 1-3600. `3640` indicates permanent storage.
      partitionCount: ## Number (integer) of log topic partitions. Default value: `1`. Maximum value: `10`.
      tags: ## Tag list for binding tags to the log topic. Up to nine tag key-value pairs are supported, and a resource can be bound to a given tag key only once.
      - key: xxx ## Tag key
        value: xxx ## Tag value
      autoSplit: false ## Whether to enable automatic partition splitting (Boolean). Default value: `true`.
      maxSplitPartitions: ## Maximum number of partitions after splitting (integer). Takes effect only when `autoSplit` is enabled.
      storageType: hot ## Log topic storage class. Valid values: `hot` (STANDARD); `cold` (STANDARD_IA). Default value: `hot`.
      excludePaths: ## Collection path blocklist
      - type: File ## Type. Valid values: `File`, `Path`.
        value: /xx/xx/xx/xx.log ## File or path to be excluded, corresponding to `type`
      indexs: ## Custom indexing method and fields, configurable when the topic is created
      - indexName: ## Field to be indexed. For a metafield, the key does not need the `__TAG__.` prefix and should be consistent with the key in the uploaded log; the `__TAG__.` prefix is added automatically for display in the console.
        indexType: ## Field type. Valid values: `long`, `text`, `double`
        tokenizer: ## Field delimiters. Each character is treated as a delimiter. Only English symbols and \n\t\r are supported. For `long` and `double` fields, leave it empty. For `text` fields, we recommend @&?|#()='",;:<>[]{}/ \n\t\r\\ as the delimiters.
        sqlFlag: ## Whether the analysis feature is enabled for the field (Boolean)
        containZH: ## Whether the field contains Chinese characters (Boolean)
      region: ap-xxx ## Topic region for cross-region shipping
      userDefineRule: xxxxxx ## Custom collection rule, a serialized JSON string
      extractRule: {} ## Extraction and filter rule. If `ExtractRule` is set, `LogType` must be set.
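
    As a concrete illustration, the following sketch (all names and values are hypothetical) auto-creates a logset and topic, keeps logs for 30 days, and defines one indexed field:

    clsDetail:
      logsetName: my-logset ## Hypothetical logset name; created automatically if it does not exist
      topicName: my-topic ## Hypothetical topic name; created automatically if it does not exist
      logType: json_log ## Parse logs as JSON
      period: 30 ## Retain logs for 30 days
      storageType: hot ## STANDARD storage class
      indexs:
      - indexName: status ## Hypothetical log field to index
        indexType: long ## Numeric field, so `tokenizer` is left empty
        sqlFlag: true ## Enable the analysis feature for this field
        containZH: false ## The field contains no Chinese characters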
    

    inputDetail Field Description

    inputDetail:
      type: container_stdout ## Log collection type. Valid values: container_stdout (container standard output), container_file (container file), host_file (host file)

      containerStdout: ## Container standard output
        namespace: default ## Kubernetes namespace(s) of the containers to be collected. Separate multiple namespaces with commas, for example, `default,namespace`. If this field is not specified, all namespaces are included. It cannot be specified together with `excludeNamespace`.
        excludeNamespace: nm1,nm2 ## Kubernetes namespace(s) of the containers to be excluded. Separate multiple namespaces with commas, for example, `nm1,nm2`. If this field is not specified, all namespaces are included. It cannot be specified together with `namespace`.
        nsLabelSelector: environment in (production),tier in (frontend) ## Filter namespaces by namespace label
        allContainers: false ## Whether to collect the standard output of all containers in the specified namespaces. If `allContainers=true`, you cannot also specify `workloads`, `includeLabels`, or `excludeLabels`.
        container: xxx ## Name of the container to be collected. If it is empty, the logs of all matching containers are collected.
        excludeLabels: ## Pods with the specified labels are excluded. This field cannot be specified together with `workload`, `namespace`, or `excludeNamespace`.
          key2: value2 ## Multiple values of the same key can be matched. For example, `environment = production,qa` excludes Pods whose `environment` label is `production` or `qa`. Separate multiple values with commas. If `includeLabels` is also specified, Pods in the intersection are matched.
        includeLabels: ## Pods with the specified labels are collected. This field cannot be specified together with `workload`, `namespace`, or `excludeNamespace`.
          key: value1 ## The metadata is carried in the logs collected by this rule and reported to the consumer. Multiple values of the same key can be matched. For example, `environment = production,qa` matches Pods whose `environment` label is `production` or `qa`. Separate multiple values with commas. If `excludeLabels` is also specified, Pods in the intersection are matched.
        metadataLabels: ## Pod labels to be collected as metadata. If this field is not specified, all Pod labels are collected as metadata.
        - label1
        customLabels: ## Custom metadata
          label: l1
        workloads:
        - container: xxx ## Name of the container to be collected. If this parameter is not specified, all containers in the workload's Pods are collected.
          kind: deployment ## Workload type. Valid values: deployment, daemonset, statefulset, job, cronjob
          name: sample-app ## Workload name
          namespace: prod ## Workload namespace

      containerFile: ## File in the container
        namespace: default ## Kubernetes namespace of the containers to be collected. A namespace must be specified.
        excludeNamespace: nm1,nm2 ## Kubernetes namespace(s) of the containers to be excluded. Separate multiple namespaces with commas, for example, `nm1,nm2`. If this field is not specified, all namespaces are included. It cannot be specified together with `namespace`.
        nsLabelSelector: environment in (production),tier in (frontend) ## Filter namespaces by namespace label
        container: xxx ## Name of the container to be collected. `*` means the logs of all matching containers are collected.
        logPath: /var/logs ## Log folder. Wildcards are not supported.
        filePattern: app_*.log ## Log file name. The wildcards `*` (matches multiple characters) and `?` (matches a single character) are supported.
        customLabels: ## Custom metadata
          key: value
        excludeLabels: ## Pods with the specified labels are excluded. This field cannot be specified together with `workload`.
          key2: value2 ## Multiple values of the same key can be matched. For example, `environment = production,qa` excludes Pods whose `environment` label is `production` or `qa`. Separate multiple values with commas. If `includeLabels` is also specified, Pods in the intersection are matched.
        includeLabels: ## Pods with the specified labels are collected. This field cannot be specified together with `workload`.
          key: value1 ## The metadata is carried in the logs collected by this rule and reported to the consumer. Multiple values of the same key can be matched. For example, `environment = production,qa` matches Pods whose `environment` label is `production` or `qa`. Separate multiple values with commas. If `excludeLabels` is also specified, Pods in the intersection are matched.
        metadataLabels: ## Pod labels to be collected as metadata. If this field is not specified, all Pod labels are collected as metadata.
        - label1 ## Pod label
        workload:
          container: xxx ## Name of the container to be collected. If this parameter is not specified, all containers in the workload's Pods are collected.
          name: sample-app ## Workload name
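
    For example, the following sketch (the namespace label is hypothetical) collects the standard output of every container in all namespaces labeled environment=production:

    inputDetail:
      type: container_stdout
      containerStdout:
        nsLabelSelector: environment in (production) ## Only namespaces carrying this label are matched
        allContainers: true ## Collect the standard output of all containers in the matched namespaces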

    Log Parsing Format

    Full Text in a Single Line

    A log with full text in a single line means that a complete log occupies one line. When collecting logs, CLS uses the line break \n to mark the end of a log. For easier structural management, every log is given a default key __CONTENT__, but the log data itself is not structured and no log fields are extracted. The Time attribute of a log depends on the time the log is collected. For more information, see Full text in a single line.
    Assume that the raw data of a log is as follows:
    Tue Jan 22 12:08:15 CST 2019 Installed: libjpeg-turbo-static-1.2.90-6.el7.x86_64
    A sample of LogConfig configuration is as follows:
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
        # Single-line log
        logType: minimalist_log
    The data collected to CLS is as follows:
    __CONTENT__:Tue Jan 22 12:08:15 CST 2019 Installed: libjpeg-turbo-static-1.2.90-6.el7.x86_64

    Full Text in Multi Lines

    A log with full text in multi lines means that a complete log may span multiple lines (such as a Java stack trace). In this format, the line break \n cannot be used as the end mark of a log. To help CLS distinguish between logs, a first-line regular expression is used for matching: when a line matches the preset regular expression, it is considered the beginning of a new log, and everything before the next matching line belongs to the current log. A default key __CONTENT__ is also set in this format, but the log data itself is not structured and no log fields are extracted. The Time attribute of a log depends on the time the log is collected. For more information, see Full text in multi lines.
    Assume that the raw data of a multi-line log is:
    2019-12-15 17:13:06,043 [main] ERROR com.test.logging.FooFactory:
    java.lang.NullPointerException
    at com.test.logging.FooFactory.createFoo(FooFactory.java:15)
    at com.test.logging.FooFactoryTest.test(FooFactoryTest.java:11)
    
    A sample of LogConfig is as follows:
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
        # Multi-line log
        logType: multiline_log
        extractRule:
          # Only a line that starts with a datetime is considered the beginning of a new log. Otherwise, the line is appended to the current log with the line break `\n`.
          beginningRegex: \d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s.+
    
    The data collected to CLS is as follows:
    __CONTENT__:2019-12-15 17:13:06,043 [main] ERROR com.test.logging.FooFactory:\njava.lang.NullPointerException\n at com.test.logging.FooFactory.createFoo(FooFactory.java:15)\n at com.test.logging.FooFactoryTest.test(FooFactoryTest.java:11)
    
    Single Line - Full Regex

    The full regex format is often used to process structured logs. It parses a complete log by extracting multiple key-value pairs based on a regular expression. For more information, see Full RegEx. Assume that the raw data of a log is as follows:
    10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] "GET /my/course/1 HTTP/1.1" 127.0.0.1 200 782 9703 "http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0" 0.354 0.354
    
    A sample of LogConfig is as follows:
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
        # Full regex
        logType: fullregex_log
        extractRule:
          # Regular expression; the values are extracted based on the `()` capture groups
          logRegex: (\S+)[^\[]+(\[[^:]+:\d+:\d+:\d+\s\S+)\s"(\w+)\s(\S+)\s([^"]+)"\s(\S+)\s(\d+)\s(\d+)\s(\d+)\s"([^"]+)"\s"([^"]+)"\s+(\S+)\s(\S+).*
          beginningRegex: (\S+)[^\[]+(\[[^:]+:\d+:\d+:\d+\s\S+)\s"(\w+)\s(\S+)\s([^"]+)"\s(\S+)\s(\d+)\s(\d+)\s(\d+)\s"([^"]+)"\s"([^"]+)"\s+(\S+)\s(\S+).*
          # List of extracted keys, in one-to-one correspondence with the capture groups
          keys: ['remote_addr','time_local','request_method','request_url','http_protocol','http_host','status','request_length','body_bytes_sent','http_referer','http_user_agent','request_time','upstream_response_time']
    
    The data collected to CLS is as follows:
    body_bytes_sent: 9703
    http_host: 127.0.0.1
    http_protocol: HTTP/1.1
    http_referer: http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum
    http_user_agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0
    remote_addr: 10.135.46.111
    request_length: 782
    request_method: GET
    request_time: 0.354
    request_url: /my/course/1
    status: 200
    time_local: [22/Jan/2019:19:19:30 +0800]
    upstream_response_time: 0.354
    
    Multiple Lines - Full Regex

    The multi-line full regex mode extracts multiple key-value pairs, based on a regular expression, from a complete log that spans multiple lines in a log text file (such as Java program logs). If you don't need to extract key-value pairs, configure the full text in multi lines format instead. For more information, see Full Text in Multi Lines.
    Assume that the raw data of a log is as follows:
    [2018-10-01T10:30:01,000] [INFO] java.lang.Exception: exception happened
    at TestPrintStackTrace.f(TestPrintStackTrace.java:3)
    at TestPrintStackTrace.g(TestPrintStackTrace.java:7)
    at TestPrintStackTrace.main(TestPrintStackTrace.java:16)
    
    A sample of LogConfig is as follows:
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
        # Multiple lines - full regex
        logType: multiline_fullregex_log
        extractRule:
          # First-line regular expression: only a line that starts with a datetime is considered the beginning of a new log. Otherwise, the line is appended to the current log with the line break `\n`.
          beginningRegex: \[\d+-\d+-\w+:\d+:\d+,\d+\]\s\[\w+\]\s.*
          # Regular expression; the values are extracted based on the `()` capture groups
          logRegex: \[(\d+-\d+-\w+:\d+:\d+,\d+)\]\s\[(\w+)\]\s(.*)
          # List of extracted keys, in one-to-one correspondence with the capture groups
          keys:
          - time
          - level
          - msg
    
    Based on the extracted key, the data collected to CLS is as follows:
    time: 2018-10-01T10:30:01,000
    level: INFO
    msg: java.lang.Exception: exception happened
    at TestPrintStackTrace.f(TestPrintStackTrace.java:3)
    at TestPrintStackTrace.g(TestPrintStackTrace.java:7)
    at TestPrintStackTrace.main(TestPrintStackTrace.java:16)
    
    JSON Format

    A JSON log automatically extracts the keys at the first layer as field names and the values at the first layer as field values to structure the entire log. A complete log ends with a line break \n. For more information, see JSON Format.
    Assume the raw data of a JSON log is as follows:
    {"remote_ip":"10.135.46.111","time_local":"22/Jan/2019:19:19:34 +0800","body_sent":23,"responsetime":0.232,"upstreamtime":"0.232","upstreamhost":"unix:/tmp/php-cgi.sock","http_host":"127.0.0.1","method":"POST","url":"/event/dispatch","request":"POST /event/dispatch HTTP/1.1","xff":"-","referer":"http://127.0.0.1/my/course/4","agent":"Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0","response_code":"200"}
    
    A sample of LogConfig is as follows:
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
        # JSON log
        logType: json_log
    
    The data collected to CLS is as follows:
    agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0
    body_sent: 23
    http_host: 127.0.0.1
    method: POST
    referer: http://127.0.0.1/my/course/4
    remote_ip: 10.135.46.111
    request: POST /event/dispatch HTTP/1.1
    response_code: 200
    responsetime: 0.232
    time_local: 22/Jan/2019:19:19:34 +0800
    upstreamhost: unix:/tmp/php-cgi.sock
    upstreamtime: 0.232
    url: /event/dispatch
    xff: -
    
    Separator Format

    In this format, a log is structured based on the specified separator, and each complete log ends with a line break \n. You need to define a unique key for each separated field so that CLS can process separator-based logs. For more information, see Separator-based Format.
    Assume the raw log is as follows:
    10.20.20.10 ::: [Tue Jan 22 14:49:45 CST 2019 +0800] ::: GET /online/sample HTTP/1.1 ::: 127.0.0.1 ::: 200 ::: 647 ::: 35 ::: http://127.0.0.1/
    
    A sample of LogConfig is as follows:
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
        # Separator log
        logType: delimiter_log
        extractRule:
          # Separator
          delimiter: ':::'
          # List of extracted keys, in one-to-one correspondence with the separated fields
          keys: ['IP','time','request','host','status','length','bytes','referer']
    
    The data collected to CLS is as follows:
    IP: 10.20.20.10
    bytes: 35
    host: 127.0.0.1
    length: 647
    referer: http://127.0.0.1/
    request: GET /online/sample HTTP/1.1
    status: 200
    time: [Tue Jan 22 14:49:45 CST 2019 +0800]

    Log Collection Types

    Container standard output

    Sample 1: collecting the standard output of all containers in the default namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_stdout
        containerStdout:
          namespace: default
          allContainers: true
    ...

    Sample 2: collecting the container standard output of the Pods that belong to the ingress-gateway deployment in the production namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_stdout
        containerStdout:
          allContainers: false
          workloads:
          - namespace: production
            name: ingress-gateway
            kind: deployment
    ...

    Sample 3: collecting the container standard output of the Pods whose labels contain "k8s-app=nginx" under the production namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_stdout
        containerStdout:
          namespace: production
          allContainers: false
          includeLabels:
            k8s-app: nginx
    ...
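
    Conversely, the documented `excludeNamespace` field can be used to collect from every namespace except the listed ones. A sketch (the excluded namespace names are illustrative):

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_stdout
        containerStdout:
          excludeNamespace: kube-system,kube-public ## Collect from all namespaces except these
          allContainers: true
    ...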

    Container file

    Sample 1: collecting the access.log file in the /data/nginx/log/ path in the nginx container of the Pods that belong to the ingress-gateway deployment under the production namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
      inputDetail:
        type: container_file
        containerFile:
          namespace: production
          workload:
            name: ingress-gateway
            type: deployment
          container: nginx
          logPath: /data/nginx/log
          filePattern: access.log
    ...

    Sample 2: collecting the access.log file in the /data/nginx/log/ path in the nginx container of the Pods whose labels contain "k8s-app=ingress-gateway" under the production namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_file
        containerFile:
          namespace: production
          includeLabels:
            k8s-app: ingress-gateway
          container: nginx
          logPath: /data/nginx/log
          filePattern: access.log
    ...

    Metadata

    For container standard output (container_stdout) and container files (container_file), in addition to the raw log content, the container metadata (for example, the ID of the container that generated the logs) also needs to be carried and reported to CLS. In this way, when viewing logs, users can trace the log source or search based on the container identifier or characteristics (such as container name and labels).
    The following table lists the metadata:

    | Field | Description |
    | --- | --- |
    | cluster_id | ID of the cluster to which the log belongs |
    | container_name | Name of the container to which the log belongs |
    | image_name | Image name of the container to which the log belongs |
    | namespace | Namespace of the Pod to which the log belongs |
    | pod_uid | UID of the Pod to which the log belongs |
    | pod_name | Name of the Pod to which the log belongs |
    | pod_ip | IP of the Pod to which the log belongs |
    | pod_label_{label name} | Labels of the Pod to which the log belongs (for example, if a Pod has the labels app=nginx and env=prod, the reported log carries two metadata entries: pod_label_app:nginx and pod_label_env:prod) |
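
    To control which labels are reported, the `metadataLabels` and `customLabels` fields described in the inputDetail section can be combined. A sketch (label and metadata names are hypothetical):

    inputDetail:
      type: container_stdout
      containerStdout:
        namespace: production
        allContainers: true
        metadataLabels: ## Report only the `app` Pod label, as pod_label_app
        - app
        customLabels: ## Attach a custom metadata entry
          team: payments

    With this configuration, each log would carry pod_label_app plus the custom team:payments entry, in addition to the fixed fields in the table above.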
    