tencent cloud

Feedback

Combined Parsing Format

Last updated: 2024-01-20 17:14:28

    Overview

    If your log structure is too complex and involves multiple log parsing modes, and a single parsing mode (such as the NGINX mode, full regex mode, or JSON mode) cannot meet log parsing requirements, you can use LogListener to parse logs in combined parsing mode. You can enter code (in JSON format) in the console to define the pipeline logic for log parsing. You can add one or more LogListener plugins to process configurations, and the LogListener plugins are executed in the configuration processing order.

    Prerequisites

    Assume that the raw data of a log is as follows:
    1571394459,http://127.0.0.1/my/course/4|10.135.46.111|200,status:DEAD,
    The content of a custom plugin is as follows:
    {
    "processors": [
    {
    "type": "processor_split_delimiter",
    "detail": {
    "Delimiter": ",",
    "ExtractKeys": [ "time", "msg1","msg2"]
    },
    "processors": [
    {
    "type": "processor_timeformat",
    "detail": {
    "KeepSource": true,
    "TimeFormat": "%s",
    "SourceKey": "time"
    }
    },
    {
    "type": "processor_split_delimiter",
    "detail": {
    "KeepSource": false,
    "Delimiter": "|",
    "SourceKey": "msg1",
    "ExtractKeys": [ "submsg1","submsg2","submsg3"]
    },
    "processors": []
    },
    {
    "type": "processor_split_key_value",
    "detail": {
    "KeepSource": false,
    "Delimiter": ":",
    "SourceKey": "msg2"
    }
    }
    ]
    }
    ]
    }
    After being structured by CLS, the log is changed to the following:
    time: 1571394459
    submsg1: http://127.0.0.1/my/course/4
    submsg2: 10.135.46.111
    submsg3: 200
    status: DEAD

    Configuration Instructions

    Custom plugin types

    Plugin Feature
    Plugin Name
    Feature Description
    Field extraction
    processor_log_string
    Performs multi-character (line breaks) parsing of fields, typically for single-line logs.
    Field extraction
    processor_multiline
    Performs first-line regex parsing of fields (full regex mode), typically for multi-line logs.
    Field extraction
    processor_multiline_fullregex
    Performs first-line regex parsing of fields (full regex mode), typically for multi-line logs; extracts regexes from multi-line logs.
    Field extraction
    processor_fullregex
    Extracts fields (full regex mode) from single-line logs.
    Field extraction
    processor_json
    Expands field values in JSON format.
    Field extraction
    processor_split_delimiter
    Extracts fields (single-/multi-character separator mode).
    Field extraction
    processor_split_key_value
    Extracts fields (key-value pair mode).
    Field processing
    processor_drop
    Discards fields.
    Field processing
    processor_timeformat
    Parses time fields in raw logs to convert time formats and set parsing results as log time.

    Custom plugin parameters

    Plugin Name
    Support Subitem Parsing
    Plugin Parameter
    Required
    Feature Description
    processor_multiline
    No
    BeginRegex
    Yes
    Defines the first-line matching regex for multi-line logs.
    processor_multiline_fullregex
    Yes
    BeginRegex
    Yes
    Defines the first-line matching regex for multi-line logs.
    ExtractRegex
    Yes
    Defines the extraction regex after multi-line logs are extracted.
    ExtractKeys
    Yes
    Defines the extraction keys.
    processor_fullregex
    Yes
    ExtractRegex
    Yes
    Defines the extraction regex.
    ExtractKeys
    Yes
    Defines the extraction keys.
    processor_json
    Yes
    SourceKey
    No
    Defines the name of the upper-level processor key processed by the current processor.
    KeepSource
    No
    Defines whether to retain `SourceKey` in the final key name.
    processor_split_delimiter
    Yes
    SourceKey
    No
    Defines the name of the upper-level processor key processed by the current processor.
    KeepSource
    No
    Defines whether to retain `SourceKey` in the final key name.
    Delimiter
    Yes
    Defines the separator (single or multiple characters).
    ExtractKeys
    Yes
    Defines the extraction keys after separator splitting.
    processor_split_key_value
    No
    SourceKey
    No
    Defines the name of the upper-level processor key processed by the current processor.
    KeepSource
    No
    Defines whether to retain `SourceKey` in the final key name.
    Delimiter
    Yes
    Defines the separator between the `Key` and `Value` in a string.
    processor_drop
    No
    SourceKey
    Yes
    Defines the name of the upper-level processor key processed by the current processor.
    processor_timeformat
    No
    SourceKey
    Yes
    Defines the name of the upper-level processor key processed by the current processor.
    TimeFormat
    Yes
    Defines the time parsing format for the `SourceKey` value (time data string in logs).

    Directions

    Logging in to the console

    1. Log in to the CLS console.
    2. On the left sidebar, click Log Topic to go to the log topic management page.

    Creating a log topic

    1. Click Create Log Topic.
    2. In the pop-up dialog box, enter define-log as Log Topic Name and click Confirm.

    Managing the machine group

    1. After the log topic is created successfully, click its name to go to the log topic management page.
    2. Click the Collection Configuration tab, click Add in the LogListener Collection Configuration area, and select the format in which you need to collect logs.
    3. On the Machine Group Management page, select the machine group to which to bind the current log topic and click Next to proceed to collection configuration. For more information, see Machine Group Management.

    Configuring collection

    Configuring the log file collection path

    On the Collection Configuration page, set Collection Path according to the log collection path format. Log collection path format: [directory prefix expression]/**/[filename expression].
    After the log collection path is entered, LogListener will match all common prefix paths that meet the [directory prefix expression] rule and listen for all log files in the directories (including subdirectories) that meet the [filename expression] rule. The parameters are as detailed below:
    Parameter
    Description
    Directory Prefix
    Directory prefix for log files, which supports only the wildcard characters * and ?.
    \\* indicates to match any multiple characters.
    ? indicates to match any single character.
    /**/
    Current directory and all its subdirectories.
    File Name
    Log file name, which supports only the wildcard characters * and ?.
    \\* indicates to match any multiple characters.
    ? indicates to match any single character.
    Common configuration modes are as follows:
    [Common directory prefix]/**/[common filename prefix]*
    [Common directory prefix]/**/*[common filename suffix]
    [Common directory prefix]/**/[common filename prefix]*[common filename suffix]
    [Common directory prefix]/**/*[common string]*
    Below are examples:
    No.
    Directory Prefix Expression
    Filename Expression
    Description
    1.
    /var/log/nginx
    access.log
    In this example, the log path is configured as /var/log/nginx/**/access.log. LogListener will listen for log files named access.log in all subdirectories in the /var/log/nginx prefix path.
    2.
    /var/log/nginx
    *.log
    In this example, the log path is configured as /var/log/nginx/**/*.log. LogListener will listen for log files suffixed with .log in all subdirectories in the /var/log/nginx prefix path.
    3.
    /var/log/nginx
    error*
    In this example, the log path is configured as /var/log/nginx/**/error*. LogListener will listen for log files prefixed with error in all subdirectories in the /var/log/nginx prefix path.
    Note:
    Only LogListener 2.3.9 and later support adding multiple collection paths.
    The system does not support uploading logs with contents in multiple text formats, which may cause write failures, such as key:"{"substream":XXX}".
    We recommend you configure the collection path as log/*.log and rename the old file after log rotation as log/*.log.xxxx.
    By default, a log file can only be collected by one log topic. If you want to have multiple collection configurations for the same file, add a soft link to the source file and add it to another collection configuration.

    Configuring the combined parsing mode

    On the Collection Configuration page, select Combined Parsing as the Extraction Mode.

    Configuring the collection policy

    Full collection: When LogListener collects a file, it starts reading data from the beginning of the file.
    Incremental collection: When LogListener collects a file, it collects only the newly added content in the file.

    Use Limits

    If the combined parsing mode is used for data parsing, LogListener will consume more resources. We recommend you not use overly complex plug-in combinations to process data.
    If the combined parsing mode is used, the collection and filter features of the text mode will become invalid, but some of these features can be implemented through relevant user-defined plug-ins.
    If the combined parsing mode is used, the feature of uploading logs that fail to be parsed is enabled by default. For logs that fail to be parsed, the input name is the Key and the original log content is the Value for log uploading.
    1. Log in to the CLS console.
    2. On the left sidebar, click Search and Analysis to go to the search and analysis page.
    3. Select the region, logset, and log topic as needed, and click Search and Analysis to search for logs according to the set query rules.
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support