Python scraping and repository for nba data and statistics based on basketball reference

marcosgabbardo/nbaPredictor


nbaPredictor

nbaPredictor (Python scraper for http://www.basketball-reference.com/leagues/NBA_2017_games.html)

Python scraper and repository for NBA data and statistics based on Basketball Reference (http://www.basketball-reference.com/leagues/NBA_2017_games.html). This code is under heavy development.

db folder: contains the scripts from a MySQL dump that define the structure for the scraped data.

  • db_tables.sql - the full table structure you need to store the scraped data.
  • nba_teams_data.sql - a table with all NBA team names and abbreviations.

lib folder: contains all source code for scraping and saving data.

  • nba_game.py - core script that scrapes basic and analytic data from the website and saves it.
  • nba_statistics.py - core script that calculates statistical indicators such as averages and sums.
  • nba_csvgenerator.py - under construction (generates CSV files for Amazon Machine Learning).

TEST

  • test_nba_server.py - sample code to use as a base for your own tests and for understanding nbaPredictor.
  • amazonML.py - under construction; intended to communicate with AWS and generate predictions.

Data Structure

  • NBA_TEAM: table with all NBA team names and abbreviations.
  • NBA_GAME: analytic game table with scores and game statistics.
  • NBA_PLAYBYPLAY: analytic game table with play-by-play data.
  • NBA_TEAM_HISTORY: table with processed data and statistics, round by round (day by day). It accumulates the data from all previous games up to the most recent one and produces a synthetic summary for the next game.
  • NBA_PONDERATION: under construction; will be used for user guesses.
  • ODD: under construction; will store handicaps and odds from sports betting websites so you can compare them with your data.
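
The accumulation idea behind NBA_TEAM_HISTORY can be sketched as follows. This is a minimal illustration, not the code in nba_statistics.py: the function name `build_team_history`, the `(date, team, points)` tuple layout, and the output keys are all hypothetical, and using 0.0 when a team has no earlier games is an assumption.

```python
from collections import defaultdict


def build_team_history(games):
    """Return one row per (team, date) built only from that team's earlier games.

    `games` is an iterable of (date, team, points) tuples with ISO dates.
    This layout is hypothetical and is not the real NBA_GAME schema.
    """
    totals = defaultdict(lambda: [0, 0])  # team -> [points_sum, games_played]
    history = []
    for date, team, points in sorted(games):
        points_sum, played = totals[team]
        # Take the snapshot BEFORE adding the current game, so the row
        # holds only information that was available ahead of the game.
        average = points_sum / played if played else 0.0
        history.append(
            {"team": team, "date": date, "games_played": played, "pointavg": average}
        )
        totals[team][0] += points
        totals[team][1] += 1
    return history


if __name__ == "__main__":
    sample = [
        ("2017-01-01", "BOS", 100),
        ("2017-01-03", "BOS", 110),
        ("2017-01-05", "BOS", 90),
    ]
    for row in build_team_history(sample):
        print(row)
```

Taking the snapshot before the update is the key design choice: each row describes what was known going into a game, which is what a predictor for that game is allowed to use.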

How to use

  • With nbaPredictor you can test your handicapping strategies.

Example:

  1. First, import all data from the last 3 seasons.
  2. Write a SQL query that selects the data and builds a handicap prediction indicator, like this one:
SELECT 
    nba_game.id2,
    nba_game.season,
    nba_game.date,
    home.team_name home_team_name,
    away.team_name away_team_name,
    ROUND(((((away.efg_avg - home.efg_avg) * 100 * 1.5 + 
    (away.ftfga_avg - home.ftfga_avg) * 100 * 0.3 + 
    (away.tov_avg - home.tov_avg) * - 1.5 + 
    (away.orb_avg - home.orb_avg) * 0.5) * 1.16 + 
    (away.day_diff - home.day_diff) * 0.5 + 
    (away.win_streak - home.win_streak) * - 0.1 + 
    (away.loss_streak - home.loss_streak) * 0.15 + 
    (away.win - home.win) * 0.15 + 
    (away.pointavg - home.pointavg) * - 0.14 + 
    (away.pointavg10 - home.pointavg10) * - 0.02 + 
    (away.last5 - home.last5) * 0.15 + 
    (away.last3 - home.last3) * - 0.1 + 
    (away.last10 - home.last10) * 0.15 + 
    (away.last1 - home.last1) * - 0.6 + 
    (away.efg_avg1 - home.efg_avg1) * 100 * - 0.02 + 
    (away.tov_avg1 - home.tov_avg1) * 0.13 + 
    (away.orb_avg1 - home.orb_avg1) * - 0.01 + 
    (away.ftfga_avg1 - home.ftfga_avg1) * - 0.4 + 
    (away.pointavg1a - home.pointavg1a) * - 0.04 + 
    (away.ortg_avg3 - home.ortg_avg3) * - 0.03 + 
    (away.ortg_avg5 - home.ortg_avg5) * - 0.03 + 
    (away.ortg_avg10 - home.ortg_avg10) * - 0.02 + 
    (away.ortg_avg - home.ortg_avg) * 0.05)) + 2,
    0) predictor,
  --  odd.handicap,        
  --  nba_ponderation.value ponderation,
    (home.pointavga - home.pointavg) media_diff_pontos,
    (nba_game.away_point - nba_game.home_point) decisor
FROM
    nba_game, -- left outer join odd on (odd.id = nba_game.id2),
    nba_team_history home,
    nba_team_history away
   -- ,nba_ponderation
WHERE
    nba_game.home_name = home.team_name
        AND nba_game.away_name = away.team_name
        AND home.date = nba_game.date
        AND away.date = nba_game.date
        AND nba_game.season IN (2017)
        AND nba_game.date = '2017-01-07'
        AND home.efg_avg10 <> 0
      --  AND nba_ponderation.date = nba_game.date
      --  AND nba_ponderation.home_name = home.team_name
      --  AND nba_ponderation.away_name = away.team_name
    --  AND nba_ponderation.value > 60
ORDER BY nba_game.date
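
The predictor column in the query above is a weighted sum of away-minus-home differences plus a constant offset of 2. The same structure can be sketched in Python. This is only an illustration: the function name `weighted_predictor` and the dictionary layout are hypothetical, and only three of the query's terms are reproduced (with the weights the query gives them).

```python
# Subset of the weights used in the SQL query (column name -> weight).
WEIGHTS = {
    "day_diff": 0.5,
    "win_streak": -0.1,
    "loss_streak": 0.15,
}

# Constant the query adds after summing the weighted terms.
OFFSET = 2


def weighted_predictor(home, away, weights=WEIGHTS, offset=OFFSET):
    """Sum weight * (away - home) over the chosen columns, then add the offset.

    `home` and `away` are dicts keyed by column name (hypothetical layout).
    The query also wraps the result in ROUND(..., 0); rounding is left out here.
    """
    total = sum(w * (away[name] - home[name]) for name, w in weights.items())
    return total + offset


if __name__ == "__main__":
    home = {"day_diff": 1, "win_streak": 3, "loss_streak": 0}
    away = {"day_diff": 2, "win_streak": 0, "loss_streak": 2}
    print(weighted_predictor(home, away))
```

Keeping the weights in a dictionary makes it easy to try a different indicator without rewriting the formula.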

Note that you can create your own indicator, compare it with decisor (the real handicap result, computed in the query as away points minus home points) in a simple SQL query, and evaluate it with a standard deviation analysis. Remember: the smaller the standard deviation between decisor and your prediction, the better your indicator. (Handicap predictions from sports betting houses have an average standard deviation of 12 points on full-time handicap scores.)
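
One way to run that evaluation outside SQL is sketched below. It reads "standard deviation between decisor and prediction" as the standard deviation of the error (decisor minus predictor), which is an interpretation rather than something the project defines. The function name `prediction_error_std` and the sample numbers are made up for illustration.

```python
import statistics


def prediction_error_std(predictions, decisors):
    """Population standard deviation of the error (decisor - prediction).

    Both arguments are equal-length sequences of point margins
    (away points minus home points).
    """
    if len(predictions) != len(decisors):
        raise ValueError("predictions and decisors must have the same length")
    errors = [d - p for p, d in zip(predictions, decisors)]
    return statistics.pstdev(errors)


if __name__ == "__main__":
    # Made-up margins, not real game results.
    predictor = [3, -5, 2, 7]
    decisor = [10, -12, -4, 9]
    print(prediction_error_std(predictor, decisor))
```

A lower value means the indicator tracks the real margins more closely; comparing it with the 12-point figure quoted above gives a rough benchmark.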

Note: This source code was written only as a way to learn the Python programming language, and many improvements are still needed.
