Scenarios
The log collection feature is a tool provided by Tencent Kubernetes Engine (TKE) for collecting logs within a cluster. It can send logs from services in a cluster, or from files at specific paths on cluster nodes, to Cloud Log Service (CLS) or TDMQ for CKafka (CKafka). This feature is suitable for users who need to store and analyze service logs in Kubernetes clusters.

Log collection must be enabled manually, and collection rules must be configured for each cluster. Once enabled, the log collection agent runs as a DaemonSet in the cluster. It collects logs from the specified sources according to the collection sources, CLS log topics, and log parsing methods configured in the collection rules, and sends the log content to the configured consumer. For the steps to enable this feature, see the documentation.

Prerequisites
Ensure that there are sufficient resources on the cluster nodes before you enable the log collection feature, as enabling it occupies some of your cluster's resources.
CPU resources occupied: 0.11–1.1 cores. If the log volume is large, raise the CPU limit as needed.
Memory resources occupied: 24–560 MB. If the log volume is large, raise the memory limit as needed.
Log length limit: 512 KB per log. If a log exceeds the limit, it will be truncated.
To use the log collection feature, ensure that nodes in the Kubernetes cluster can access the log consumer. The feature supports only Kubernetes clusters of version 1.10 or later.
Concepts
Log collection agent: The agent that TKE uses to collect log information. It is based on LogListener and runs as a DaemonSet in the cluster.
Log rule: Users can use log rules to specify log collection sources, log topics, log parsing methods, and configure filters.
The log collection agent watches for changes to log collection rules; changed rules take effect within 10 seconds.
Multiple log collection rules do not lead to multiple DaemonSets, but too many rules increase the resources occupied by the log collection agent.
Log source: Collection sources include the standard output (stdout) of specified containers, container files, and node files.
When collecting container stdout logs, users can select all containers, or the containers of a specified workload or with specified Pod labels, as the log collection source.
When collecting container file logs, users can specify file paths in the containers of a workload, or in containers with specified Pod labels, as the collection source.
When collecting node file logs, users specify a file path on the cluster nodes as the collection source.
Consumer: The user selects a CLS log topic or a CKafka topic as the consumer.
Operation Steps
Enabling Log Collection
1. Log in to the TKE console.
2. In the left sidebar, select Cluster, and click the ID of the target cluster to go to the basic cluster information page.
3. In the left sidebar, select the Log tab, and enable log collection in the upper-right corner of the Business Log module.
Configuring a Log Rule
1. At the top of the log collection page, select Business Logs and click Go to Configuration.
2. On the Create Log Collection Rule page, configure the consumer: under Consumption End > Type, select Kafka.
Users can choose to write to CKafka or to a self-built Kafka. For CKafka, enter an instance ID and a topic; for a self-built Kafka, enter a broker address and a topic as required.
Note:
If the Kafka instance is not in the same Virtual Private Cloud (VPC) network as the node, you will be prompted to create an access point for the Kafka instance before log shipping.
In the cluster's DaemonSet resources, you can query the logs of the Kafka collector by selecting the kube-system namespace and locating the kafkalistener container under the tke-log-agent Pod.
You can ship logs to the specified partition by specifying a key value in Advanced Settings. This feature is disabled by default, with logs shipped randomly. After you enable this feature, logs with the same key value will be shipped to the same partition. You can enter TimestampKey (default: @timestamp) and specify the timestamp format.
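The effect of enabling a key can be sketched in a few lines. This is a minimal illustration of "same key, same partition" routing; the hash function and partition count here are invented, and the agent's real partitioner is internal to LogListener:

```python
import hashlib
import random

NUM_PARTITIONS = 6  # hypothetical partition count of the target topic

def pick_partition(key=None, num_partitions=NUM_PARTITIONS):
    """Route a log to a partition: same key -> same partition; no key -> random."""
    if key is None:
        # Default behavior with the feature disabled: logs are shipped randomly.
        return random.randrange(num_partitions)
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Two logs carrying the same key always land in the same partition.
assert pick_partition("pod-7f9c") == pick_partition("pod-7f9c")
```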
3. On the Create Log Collection Rule page, select a collection type and configure the log source. Currently supported collection types include Container Stdout, Container File Path, and Node File Path.
Set the collection type to Container Stdout and configure the log source as needed. This log source type supports selecting workloads from multiple namespaces at a time.
Set the collection type to Container File Path and configure the log source.
The collection file path supports both file paths and wildcard rules. For example, if the container file path is /opt/logs/*.log, you can specify the collection path as /opt/logs and the file name as *.log.
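The split between collection path and file-name wildcard can be approximated with Python's `fnmatch` (the helper and paths below are illustrative, not part of the product):

```python
import fnmatch
import os

collection_path = "/opt/logs"  # directory part of the rule
file_pattern = "*.log"         # file-name part of the rule

def matches(candidate):
    """Check whether a container file path falls under the rule."""
    directory, name = os.path.split(candidate)
    return directory == collection_path and fnmatch.fnmatch(name, file_pattern)

assert matches("/opt/logs/access.log")
assert not matches("/opt/logs/readme.txt")
```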
Note
The container file path cannot be a symbolic link or a hard link. Otherwise, the actual target of the link will not exist in the collector's container, causing log collection to fail.
Set the collection type to Node File Path.
The collection path supports file paths and wildcard rules. For example, to collect files such as /opt/logs/service1/*.log and /opt/logs/service2/*.log, specify the collection path folder as /opt/logs/service* and the file name as *.log.
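The folder wildcard in the example above can be approximated with `fnmatch` on the full path (note that `fnmatch`'s `*` also crosses `/`, so this is a slight over-approximation of shell-style globbing; paths are illustrative):

```python
from fnmatch import fnmatch

rule = "/opt/logs/service*/*.log"  # folder wildcard plus file-name wildcard

# Both service directories from the example match the one rule.
assert fnmatch("/opt/logs/service1/error.log", rule)
assert fnmatch("/opt/logs/service2/access.log", rule)

# A directory that does not match the folder pattern is not collected.
assert not fnmatch("/opt/logs/other/app.log", rule)
```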
You can customize and add metadata in key-value format based on actual needs. The metadata will be added to log records.
Note:
The node file path cannot be a symbolic link or a hard link. Otherwise, the actual target of the link will not exist in the collector, causing log collection to fail.
A node log file can only be collected by one log topic.
Note:
Ensure that the Pod's labels are configured correctly: use the labels of the Pod, not the labels of the workload. You can check them by following the steps below:
1.1 Log in to the TKE console. In the left sidebar, select Cluster.
1.2 On the cluster management page, select the ID of a cluster to go to the basic cluster information page.
1.3 In Workload, select the name of a specific workload to go to the workload details page.
1.4 In the YAML tab, view the labels under template.
For container stdout and container files (excluding node file paths, namely hostPath mounts), in addition to the original log content, metadata related to containers or Kubernetes (such as the IDs of containers that generate the logs) will also be reported to CLS. This allows users to trace the source when viewing logs or perform searches based on container identifiers and characteristics (such as container names or labels).
For metadata related to containers or Kubernetes, see the table below:
| Field | Description |
| --- | --- |
| container_id | ID of the container to which the log belongs. |
| container_name | Name of the container to which the log belongs. |
| image_name | Image name/address of the container to which the log belongs. |
| namespace | Namespace of the Pod to which the log belongs. |
| pod_uid | UID of the Pod to which the log belongs. |
| pod_name | Name of the Pod to which the log belongs. |
| pod_label_{label name} | Labels of the Pod to which the log belongs. For example, if a Pod has two labels, app=nginx and env=prod, the uploaded log carries two metadata entries: pod_label_app:nginx and pod_label_env:prod. |
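Putting the table together, a reported record might look like the sketch below. Every value is an invented example; only the metadata key names follow the table above:

```python
# One stdout log line as it might be reported alongside its metadata.
log_record = {
    "content": "GET /index.html 200",   # the original log content
    "container_id": "c0ffee123456",
    "container_name": "nginx",
    "image_name": "nginx:1.25",
    "namespace": "default",
    "pod_uid": "0d9f4e2a-example",
    "pod_name": "nginx-7f9c-abcde",
    # Each Pod label becomes one pod_label_{label name} metadata entry:
    "pod_label_app": "nginx",
    "pod_label_env": "prod",
}
assert log_record["pod_label_app"] == "nginx"
```

Searching by container name or by a `pod_label_*` field is what lets you trace a log line back to the Pod that produced it.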
4. Configure a collection policy. You can select Full or Incremental.
Full: Full collection starts from the beginning of the log file.
Incremental: Incremental collection starts 1 MB before the end of the log file (for files smaller than 1 MB, incremental collection is equivalent to full collection).
Note:
Full collection scenario: collection rules take effect with a delay. To ensure that all critical logs produced during Pod startup are collected, configure a full collection policy. This policy collects all log data written before the rules take full effect, preventing log loss.
Incremental collection scenario: Cloud Block Storage (CBS) as the collection path. When the collection path is on CBS, configure an incremental collection policy to prevent duplicate collection caused by Pod recreation.
Abnormal scenario: Cloud File Storage (CFS) as the collection path. Full collection may cause duplicate collection; in addition, multiple Pods may report identical logs simultaneously, leading to duplicate CLS billing. It is recommended not to use CFS as the log source.
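The difference between the two policies comes down to the starting offset. A minimal sketch, assuming the "1 MB before EOF" rule stated above (the helper name and file are illustrative):

```python
import os
import tempfile

ONE_MB = 1024 * 1024

def incremental_start(path):
    """Offset where incremental collection begins: 1 MB before EOF, floor 0.

    Full collection always starts at offset 0 instead.
    """
    return max(0, os.path.getsize(path) - ONE_MB)

# A file smaller than 1 MB is read in full, just like full collection.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"short log line\n")
assert incremental_start(f.name) == 0
os.unlink(f.name)
```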
5. Click Next to select the log parsing method.
Encoding mode: UTF-8 and GBK are supported.
Extraction mode: Multiple types of extraction modes are supported. The details are as follows:
| Extraction Mode | Description |
| --- | --- |
| Single-line full-text | Each log contains only one line; the line break `\n` is the end marker of a log. Each log is parsed as one complete string with the key CONTENT. With indexing enabled, log content can be searched via full-text search. The log time is the collection time. |
| Multi-line full-text | A complete log spans multiple lines. A first-line regular expression is used for matching: a line that matches the preset regular expression is treated as the start of a log, and the next matching line marks the start of the following log. The default key CONTENT is also used. The log time is the collection time. The regular expression can be generated automatically. |
| Single-line full regular expression | Multiple key-value pairs are extracted from each one-line log using a regular expression. Enter a log sample and then a custom regular expression; the system extracts key-value pairs based on the capture groups in the regular expression. The regular expression can be generated automatically. |
| Multi-line full regular expression | Suitable for logs where a complete entry spans multiple lines (such as Java program logs). Multiple key-value pairs are extracted using a regular expression. Enter a log sample and then a custom regular expression; the system extracts key-value pairs based on the capture groups in the regular expression. The regular expression can be generated automatically. |
| JSON | For logs in JSON format, each top-level key is automatically extracted as a field name and its value as the field value. The entire log is structured this way, with the line break `\n` as the end marker of each complete log. |
| Delimiter | A log is structured by splitting on a specified delimiter, with the line break `\n` as the end marker of each complete log. Define a unique key for each separated field. Fields that do not need to be collected may be left blank, but you cannot leave all fields blank. |
| Combined parsing | When the log structure is too complex for a single parsing mode (such as Nginx mode, full regular expression mode, or JSON mode), use the LogListener combined parsing format. In this mode, you enter code (in JSON format) in the console to define the pipeline logic for log parsing. You can add one or more LogListener plugin processing configurations, and LogListener executes them one by one in the configured order. |
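Three of the extraction modes in the table can be illustrated with a short sketch. The regular expressions and log samples here are invented; the real patterns are whatever you configure in the rule:

```python
import json
import re

# Multi-line full-text: group physical lines into logs by a first-line regex.
first_line = re.compile(r"^\d{4}-\d{2}-\d{2} ")  # illustrative first-line pattern
raw = [
    "2024-01-01 ERROR something failed",
    "  at Foo.bar(Foo.java:42)",  # continuation line of the first log
    "2024-01-02 INFO recovered",
]
logs, current = [], []
for line in raw:
    if first_line.match(line) and current:
        logs.append("\n".join(current))
        current = []
    current.append(line)
logs.append("\n".join(current))
assert len(logs) == 2  # the stack-trace line was merged into the first log

# Full regular expression: capture groups become the extracted key-value pairs.
pattern = re.compile(r"(?P<date>\S+) (?P<level>\S+) (?P<msg>.*)")
fields = pattern.match(logs[1]).groupdict()
assert fields["level"] == "INFO"

# JSON mode: top-level keys become field names, values become field values.
json_fields = json.loads('{"level": "ERROR", "msg": "disk full"}')
assert json_fields["msg"] == "disk full"
```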
Filter: LogListener collects only logs that match the filter rules. Keys support exact matching, and filter rules support regular expression matching. For example, you can configure the filter to collect only logs where ErrorCode is 404. Enable the filter and configure rules as needed.
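The filter semantics (exact key match, regex value match) can be sketched as follows; the rule set and field names are invented examples:

```python
import re

# Collect only logs whose ErrorCode field is exactly 404.
rules = {"ErrorCode": re.compile(r"^404$")}

def passes_filter(fields):
    """A log is collected only if every filter key exists and its value matches."""
    return all(k in fields and r.match(fields[k]) for k, r in rules.items())

assert passes_filter({"ErrorCode": "404", "path": "/index.html"})
assert not passes_filter({"ErrorCode": "500"})
assert not passes_filter({"path": "/index.html"})  # key missing: not collected
```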
Note:
A single log topic currently supports only one collection configuration. Ensure that logs from all containers using this topic can be parsed with the selected method. If you create a different collection configuration under the same log topic, the old collection configuration is overwritten.
6. Click Complete to finish creating the log collection rule for shipping to CKafka.