Collecting logs is easy. Analyzing them is where the value lives. If you're sitting on gigabytes of OpenClaw DingTalk bot logs and only ever grep for "ERROR", you're leaving insights on the table.
Let's turn your raw logs into actionable intelligence.
Every log line from your OpenClaw DingTalk bot contains a story. The trick is knowing which stories matter.
Before you can analyze, you need structured data. Configure JSON logging on your Tencent Cloud Lighthouse instance:
# /opt/clawdbot/config/dingtalk.yaml
logging:
  format: json
  level: info
  output: /var/log/clawdbot/output.log
  fields:
    - timestamp
    - level
    - event_type
    - user_id
    - group_id
    - model
    - tokens_input
    - tokens_output
    - response_time_ms
    - skill_used
    - status
This gives you log lines like:
{"timestamp":"2026-03-06T10:23:45Z","level":"info","event_type":"msg_processed","user_id":"dingtalk_user_123","group_id":"group_eng_456","model":"claude-sonnet","tokens_input":150,"tokens_output":320,"response_time_ms":1847,"skill_used":"code-review","status":"success"}
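To check that the fields come through as expected, a line like this can be flattened into columns with jq. This is a minimal sketch: the function name `summarize_line` is hypothetical, and it assumes jq is installed and that each line carries the fields configured above.

```shell
#!/bin/bash
# summarize_line (hypothetical helper): read JSON log lines on stdin and print
# user, skill, total tokens, and response time as tab-separated columns.
summarize_line() {
  jq -r '[.user_id, .skill_used, (.tokens_input + .tokens_output), .response_time_ms] | @tsv'
}

# Demo on the sample line shown above.
SAMPLE='{"timestamp":"2026-03-06T10:23:45Z","level":"info","event_type":"msg_processed","user_id":"dingtalk_user_123","group_id":"group_eng_456","model":"claude-sonnet","tokens_input":150,"tokens_output":320,"response_time_ms":1847,"skill_used":"code-review","status":"success"}'
echo "$SAMPLE" | summarize_line
```

Tab-separated output pipes cleanly into `awk -F'\t'`, `sort`, and `cut`, which is the pattern the analysis scripts rely on.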
Build a comprehensive daily analysis:
#!/bin/bash
# /opt/clawdbot/daily-analysis.sh
LOG="/var/log/clawdbot/output.log"
TODAY=$(date +%Y-%m-%d)
echo "========================================="
echo " DingTalk Bot Daily Analysis: $TODAY"
echo "========================================="
echo ""
echo "--- Traffic ---"
echo "Total messages: $(grep "$TODAY" "$LOG" | grep -c 'msg_processed')"
echo "Unique users: $(grep "$TODAY" "$LOG" | jq -r '.user_id // empty' 2>/dev/null | sort -u | wc -l)"
echo "Active groups: $(grep "$TODAY" "$LOG" | jq -r '.group_id // empty' 2>/dev/null | sort -u | wc -l)"
echo ""
echo "--- Performance ---"
echo "Avg response time: $(grep "$TODAY" "$LOG" | jq -r '.response_time_ms // empty' 2>/dev/null | awk '{sum+=$1;n++} END{if(n) printf "%.0fms\n", sum/n; else print "n/a"}')"
echo "P95 response time: $(grep "$TODAY" "$LOG" | jq -r '.response_time_ms // empty' 2>/dev/null | sort -n | awk '{v[NR]=$1} END{if(NR){i=int((NR*95+99)/100); print v[i] "ms"} else print "n/a"}')"
echo "Timeouts: $(grep "$TODAY" "$LOG" | grep -c 'timeout')"
echo ""
echo "--- Token Usage ---"
TOTAL_IN=$(grep "$TODAY" "$LOG" | jq -r '.tokens_input // empty' 2>/dev/null | awk '{sum+=$1} END{print sum+0}')
TOTAL_OUT=$(grep "$TODAY" "$LOG" | jq -r '.tokens_output // empty' 2>/dev/null | awk '{sum+=$1} END{print sum+0}')
echo "Input tokens: $TOTAL_IN"
echo "Output tokens: $TOTAL_OUT"
echo "Total tokens: $((TOTAL_IN + TOTAL_OUT))"
echo ""
echo "--- Top Skills ---"
grep "$TODAY" "$LOG" | jq -r '.skill_used' 2>/dev/null | sort | uniq -c | sort -rn | head -5
echo ""
echo "--- Errors ---"
echo "Total errors: $(grep "$TODAY" "$LOG" | grep -c '"level":"error"')"
echo "Error types:"
grep "$TODAY" "$LOG" | jq -r 'select(.level=="error") | .error_type' 2>/dev/null | sort | uniq -c | sort -rn | head -5
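Percentiles are easy to get subtly wrong in a one-liner, so it can help to pull the calculation into a reusable function. The sketch below uses the nearest-rank method; the name `percentile` is hypothetical, and it assumes an integer percentile argument and one number per input line.

```shell
#!/bin/bash
# percentile (hypothetical helper): nearest-rank percentile of the numbers on stdin.
# $1 is an integer percentile such as 50, 95, or 99. Prints nothing for empty input.
percentile() {
  sort -n | awk -v p="$1" '
    NF { v[++n] = $1 }                # keep non-blank lines, already sorted
    END {
      if (n == 0) exit
      i = int((p * n + 99) / 100)     # ceil(p * n / 100) for integer p
      if (i < 1) i = 1
      print v[i]
    }'
}

# Demo: P95 of the integers 1..20.
seq 1 20 | percentile 95
```

With this in place, a pipeline such as `grep "$TODAY" "$LOG" | jq -r '.response_time_ms // empty' | percentile 95` would report the P95 latency, and swapping in `50` or `99` gives the median or the tail.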
Understand when your bot is busiest:
#!/bin/bash
# /opt/clawdbot/hourly-heatmap.sh
LOG="/var/log/clawdbot/output.log"
TODAY=$(date +%Y-%m-%d)
echo "Hour | Messages | ████████████████"
for hour in $(seq -w 0 23); do
  COUNT=$(grep "${TODAY}T${hour}:" "$LOG" | grep -c 'msg_processed')
  # One block per 5 messages; sed is used because GNU tr works on bytes and would mangle the multibyte █
  BAR=$(printf '%*s' $((COUNT / 5)) '' | sed 's/ /█/g')
  printf "%s | %4d | %s\n" "$hour" "$COUNT" "$BAR"
done
This reveals patterns such as "most usage happens between 10-11 AM and 2-3 PM", which helps you plan maintenance windows and capacity.
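Picking a maintenance window from the hourly counts can also be automated. The following is a rough sketch, not part of the bot's tooling: `hour_extremes` is a hypothetical helper that reads timestamps in the format shown in the sample log line and reports the busiest and quietest hour inside a window you choose. Ties go to the earliest hour.

```shell
#!/bin/bash
# hour_extremes (hypothetical helper): read ISO 8601 timestamps on stdin
# (e.g. 2026-03-06T10:23:45Z) and report the busiest and quietest hour
# within the inclusive window [$1, $2], given as hours 0-23.
hour_extremes() {
  awk -v lo="$1" -v hi="$2" '
    NF { count[substr($1, 12, 2) + 0]++ }   # characters 12-13 hold the hour
    END {
      lo += 0; hi += 0
      busiest = lo; quietest = lo
      for (h = lo; h <= hi; h++) {
        if (count[h] + 0 > count[busiest] + 0) busiest = h
        if (count[h] + 0 < count[quietest] + 0) quietest = h
      }
      printf "busiest=%02d quietest=%02d\n", busiest, quietest
    }'
}

# Demo with a handful of made-up timestamps.
printf '%s\n' 2026-03-06T09:05:00Z 2026-03-06T10:15:00Z 2026-03-06T10:45:00Z \
  2026-03-06T14:30:00Z | hour_extremes 9 14
```

Against real logs, the input could come from `jq -r '.timestamp' /var/log/clawdbot/output.log`. If the timestamps are in UTC, as the `Z` suffix in the sample suggests, shift the window to match your team's local working hours.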
Find your most expensive users and skills:
#!/bin/bash
# /opt/clawdbot/token-analysis.sh
echo "=== Token Cost Analysis ==="
echo ""
echo "Top 10 most expensive users (by total tokens):"
grep "$(date +%Y-%m-%d)" /var/log/clawdbot/output.log | \
  jq -r 'select(.user_id != null) | [.user_id, (.tokens_input + .tokens_output)] | @tsv' 2>/dev/null | \
  awk -F'\t' '{a[$1]+=$2} END{for(k in a) print a[k], k}' | sort -rn | head -10
echo ""
echo "Top 5 most expensive skills:"
grep "$(date +%Y-%m-%d)" /var/log/clawdbot/output.log | \
  jq -r 'select(.skill_used != null) | [.skill_used, (.tokens_input + .tokens_output)] | @tsv' 2>/dev/null | \
  awk -F'\t' '{a[$1]+=$2} END{for(k in a) print a[k], k}' | sort -rn | head -5
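Token totals become easier to act on once they are expressed as money. The sketch below shows the arithmetic only: `estimate_cost` is a hypothetical helper, and the per-million-token prices in the demo are placeholders rather than any provider's real rates.

```shell
#!/bin/bash
# estimate_cost (hypothetical helper): convert token totals into a cost figure.
# $1 = input tokens, $2 = output tokens,
# $3 = price per million input tokens, $4 = price per million output tokens.
# Prices are placeholders; substitute your provider's actual rates.
estimate_cost() {
  awk -v tin="$1" -v tout="$2" -v pin="$3" -v pout="$4" \
    'BEGIN { printf "%.6f\n", (tin * pin + tout * pout) / 1000000 }'
}

# Demo: the sample log line's token counts with made-up prices of 3 and 15
# per million tokens.
estimate_cost 150 320 3 15
```

Input and output tokens are priced separately because providers commonly charge different rates for each, so a skill that generates long replies can cost more than its total token count suggests.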
All of this requires a stable server with persistent logs and enough compute to run analysis scripts. Tencent Cloud Lighthouse provides exactly that.
Don't just analyze — alert on anomalies:
#!/bin/bash
# /opt/clawdbot/anomaly-check.sh
AVG_RESPONSE=$(grep "$(date +%Y-%m-%d)" /var/log/clawdbot/output.log | \
  jq -r '.response_time_ms // empty' 2>/dev/null | \
  awk '{sum+=$1;n++} END{if(n) printf "%.0f", sum/n}')
# Skip the check when there is no data yet; otherwise compare the rounded average
if [ -n "$AVG_RESPONSE" ] && [ "$AVG_RESPONSE" -gt 5000 ]; then
  curl -X POST "https://oapi.dingtalk.com/robot/send?access_token=YOUR_TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"msgtype\":\"text\",\"text\":{\"content\":\"[ALERT] Avg response time is ${AVG_RESPONSE}ms (threshold: 5000ms)\"}}"
fi
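Average latency is only one signal. Error rate is another that is cheap to check from the same logs. The following is a minimal sketch under stated assumptions: `error_rate_exceeded` is a hypothetical helper, and the 5% threshold in the demo is an arbitrary example value.

```shell
#!/bin/bash
# error_rate_exceeded (hypothetical helper): succeed (exit 0) when
# errors / total is strictly above the threshold percentage.
# $1 = error count, $2 = total events, $3 = threshold in whole percent.
error_rate_exceeded() {
  local errors=$1 total=$2 threshold=$3
  # No traffic means nothing to alert on, and avoids dividing by zero.
  [ "$total" -gt 0 ] || return 1
  # Cross-multiply to stay in integer arithmetic: errors/total > threshold/100.
  [ $((errors * 100)) -gt $((threshold * total)) ]
}

# Demo: 6 errors out of 100 events against a 5% threshold.
if error_rate_exceeded 6 100 5; then
  echo "[ALERT] Error rate above 5%"
fi
```

The two counts can come from the same `grep -c` expressions the daily analysis uses, and the `echo` can be replaced with the webhook call from the anomaly check.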
Log analysis isn't a one-time activity — it's a feedback loop. Analyze weekly, identify trends, make changes, and measure the impact.
What you measure, you can improve.