This document describes how to use an SCF function with Puppeteer to run scheduled tasks on webpage content, such as data collection and storage. The same approach also supports more complex scheduled web tasks such as data crawling, scheduled sign-in, and webpage inspection.
- Log in to the SCF console and select Function Service on the left sidebar.
- At the top of the Function Service page, select the Beijing region and click Create to open the function creation page. Configure the function as follows:
  - Creation method: select Template.
  - Fuzzy search: enter TimerTask and search.
  Click Learn More in the template to view its details in the Template Details pop-up window, where the template can also be downloaded.
- Click Next. The function name is generated automatically by default. To modify the function code, expand the Function Code block and make changes as described in Modifying function template.
- In the Trigger Configurations section, select Automatic creation. By default, this creates a scheduled trigger that runs on the hour.
  - To adjust the trigger configuration, select Custom.
  - To create the scheduled trigger only after the test succeeds, select Create Later.
- Click Complete.
- At the bottom of the function code page, click Test to view the execution log of the function.
- After the test passes, configure the scheduled trigger on the Trigger Method tab as needed, and check the Base64 value printed in the function log.
Modifying function template
The current template function uses Puppeteer to take a screenshot of the webpage and convert it to a Base64 value, which is printed to the function log. You can modify the template to suit your own scheduled tasks.
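The screenshot-to-Base64 step described above might look like the following minimal sketch. It is not the template's actual code: it assumes `page` is a Puppeteer `Page` object created elsewhere, and the helper name `captureAsBase64` and its `url` parameter are hypothetical.

```javascript
// Minimal sketch, not the template's actual code.
// Assumption: `page` is a Puppeteer Page created elsewhere (for example via browser.newPage()).
// `captureAsBase64` is a hypothetical helper name.
async function captureAsBase64(page, url) {
  // Load the target page before capturing it.
  await page.goto(url);
  // With `encoding: 'base64'`, page.screenshot() resolves to a Base64 string.
  const base64 = await page.screenshot({ encoding: 'base64' });
  // Print the value so that it appears in the function log.
  console.log(base64);
  return base64;
}
```

Passing the page in as a parameter keeps the helper independent of how the browser is launched, which may differ between a local machine and the SCF runtime.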
For example, use the following code to get the page title:
const title = await page.title();
You can also add code that clicks an element on the page.
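A click on a page element could be sketched as follows. This assumes `page` is a Puppeteer `Page` that has already loaded the target URL; the helper name `clickElement` and any selector passed to it are hypothetical and must be adapted to the actual page.

```javascript
// Minimal sketch under stated assumptions.
// Assumption: `page` is a Puppeteer Page that has already navigated to the target URL.
// `clickElement` is a hypothetical helper name.
async function clickElement(page, selector) {
  // Wait until the element exists in the DOM so the click does not fail on slow pages.
  await page.waitForSelector(selector);
  // Scroll the element into view if needed and click it.
  await page.click(selector);
}
```

For example, `await clickElement(page, '#sign-in');` would click an element with the id `sign-in`, where `#sign-in` is a placeholder selector.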
For more information on how to use Puppeteer, see the Puppeteer documentation. With this tool, you can access page content on a schedule and perform operations on the page, such as data crawling and sign-in.
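Putting these pieces together, one run of a scheduled task could be structured roughly as in the sketch below. It assumes `browser` is a Puppeteer `Browser` instance obtained elsewhere (for example from `puppeteer.launch()`); the function name `runScheduledTask` and the shape of the returned object are hypothetical.

```javascript
// Minimal sketch of one scheduled run.
// Assumption: `browser` is a Puppeteer Browser obtained elsewhere.
// `runScheduledTask` and the returned object shape are hypothetical.
async function runScheduledTask(browser, url) {
  const page = await browser.newPage();
  try {
    // Wait until network activity has mostly settled before reading the page.
    await page.goto(url, { waitUntil: 'networkidle2' });
    const title = await page.title();
    return { url, title };
  } finally {
    // Close the page even if navigation or extraction throws.
    await page.close();
  }
}
```

Closing the page in a `finally` block is a deliberate choice here: a function that runs on a schedule should release the page on every run, including runs that fail.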