ichbinwilly/Cht-Mod-Program-Schedule

Cht Mod Program Schedule

The Cht Mod Program Schedule is a sample application that shows how to run a web crawler periodically. It is a simple app built with the Express framework and Bootstrap and deployed to AWS Elastic Beanstalk. It is based on an AWS sample project. I also recommend following Getting Started with Node.js on Elastic Beanstalk to set up the system.

Requirements:

  • express
  • cheerio
  • node-cron

Installation

Execute this command to install the project:

npm install

Run

Execute this command to run the project:

node app

Live Demo

Live Demo on AWS Elastic Beanstalk. Sorry, the live demo is no longer available :( (updated on 2018/12/02)

Explanation

We use the request module to make HTTP calls.

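request(url, (err, res, body) => {
  // err:  set when the request itself fails
  // res:  the HTTP response object
  // body: the response body (HTML); process it here
});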
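The callback above receives `err`, `res`, and `body`. As a minimal sketch of handling failures before any parsing, the callback can be wrapped in a Promise. `fetchBody` is a hypothetical helper, not part of this project; the request function is passed in as a parameter so the sketch assumes only the `(err, res, body)` callback shape shown above.

```javascript
// Hypothetical helper: wraps a callback-style request function
// (same (err, res, body) signature as above) in a Promise.
function fetchBody(requestFn, url) {
  return new Promise((resolve, reject) => {
    requestFn(url, (err, res, body) => {
      if (err) {
        return reject(err); // the request itself failed
      }
      if (!res || res.statusCode !== 200) {
        // Assumption: anything other than 200 is treated as a failure.
        return reject(new Error('Unexpected status: ' + (res && res.statusCode)));
      }
      resolve(body); // raw HTML, ready to be parsed
    });
  });
}
```

With the request module this could be called as `fetchBody(request, url).then(body => { /* parse body */ })`.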

Load the HTML returned by the crawler into cheerio:

const $ = cheerio.load(body)

Finally, we analyze and break down the DOM to fetch the data. The program information is wrapped in the class wrapper, so the first level to select is the class rowat. We also need to fetch the class rowat_gray, which marks the rows highlighted in grey.

$('.wrapper .rowat, .rowat_gray').each(function(i, elem) {
  tvshows.push(
    $(this).text().split('\n')
  )
})
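Splitting a row's text on `'\n'` usually leaves empty or whitespace-only entries between the real fields. A minimal sketch of tidying one row follows; `cleanRow` is a hypothetical helper and the sample text is made up, since the real page markup is not shown here.

```javascript
// Hypothetical helper: turn the raw text of one row into a clean list of fields.
function cleanRow(text) {
  return text
    .split('\n')                           // same split as in the snippet above
    .map((field) => field.trim())          // drop indentation around each field
    .filter((field) => field.length > 0);  // drop lines that were only whitespace
}

// Made-up sample text; the real rows may contain different fields.
const sample = '\n   20:00\n   Evening News\n\n';
console.log(cleanRow(sample)); // [ '20:00', 'Evening News' ]
```

Inside the `.each` callback this would be used as `tvshows.push(cleanRow($(this).text()))`.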

Update the latest program listing every hour:

cron.schedule('0 0 */1 * * *', function(){
  // fields: second minute hour day-of-month month day-of-week
  // runs at second 0, minute 0 of every hour (*/1 = every hour)
});
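The expression passed to `cron.schedule` uses node-cron's six-field form, where the first field is seconds. As a small illustration of how the fields line up, the sketch below labels each one; `labelCronFields` is a hypothetical helper for reading expressions and is not part of node-cron.

```javascript
// Field order in node-cron's six-field format (the seconds field comes first).
const FIELD_NAMES = ['second', 'minute', 'hour', 'dayOfMonth', 'month', 'dayOfWeek'];

// Hypothetical helper: pair each field of a six-field expression with its name.
function labelCronFields(expression) {
  const parts = expression.trim().split(/\s+/);
  if (parts.length !== FIELD_NAMES.length) {
    throw new Error('Expected 6 fields, got ' + parts.length);
  }
  const labeled = {};
  FIELD_NAMES.forEach((name, i) => {
    labeled[name] = parts[i];
  });
  return labeled;
}

console.log(labelCronFields('0 0 */1 * * *'));
```

Since a step of 1 matches every value, `*/1` in the hour field means the same as `*`.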

Updates

I'm going to turn this project into a Line bot that provides a program query service. The user can enter keywords and query the most recent programs that contain those keywords.

  • Line bot
  • Carousel
  • Buttons
  • TODO: add entertainment and news
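The keyword query described above could be sketched as a simple filter over the crawled rows. Everything here is an assumption for illustration: `searchPrograms` is a hypothetical name, each row is assumed to be an array of strings like the entries pushed into `tvshows`, and the rows are assumed to be already ordered with the most recent first.

```javascript
// Hypothetical helper: return up to `limit` rows in which any field contains the keyword.
function searchPrograms(rows, keyword, limit = 5) {
  const needle = keyword.trim().toLowerCase();
  if (needle === '') {
    return []; // an empty keyword matches nothing
  }
  return rows
    .filter((row) => row.some((field) => field.toLowerCase().includes(needle)))
    .slice(0, limit);
}

// Made-up sample rows, only to show the expected shape.
const sampleRows = [
  ['20:00', 'Evening News'],
  ['21:00', 'Movie Night'],
  ['22:00', 'Late News'],
];
console.log(searchPrograms(sampleRows, 'news'));
```

The matching rows could then be formatted into the bot's reply, for example as carousel entries.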
