This is the official repository of NLP Young OG's group project in FINA 4350.
Chen Kangyi, Kevin (3035447776)
Cui Jinze, Corey (3035447922)
Li Jiaying, Minnie (3035447348)
Qu Yiyang, Yvonne (3035447623)
Wu Shaoyi, Sophie (3035330179)
- scraping contains the code for web-scraping
  - get_files.py: download financial reports and earnings transcripts from Capital IQ
  - get_location.py: get the location of companies from Yahoo Finance
- preprocessing contains the code for preprocessing
  - convert_format.py: convert the files from .doc to .txt
  - preprocess_transcript.py: preprocess the earnings transcripts
  - preprocess_report.py: preprocess the financial reports
- NLP_analysis contains the code for calculating the features using NLP
  - dictionary_analysis.py: calculate the dictionary-based features
  - textblob_vader_analysis.py: calculate the TextBlob (TB) and VADER (VD) scores
  - finbert_analysis.py: calculate the FinBERT score
- result_analysis.py: contains the code for analyzing the NLP results
  - company_price.py: process the company stock price
  - company_analysis.py: conduct the company-level analysis
  - industry_analysis.py: conduct the industry-level analysis
- Google Drive link to our dataset
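
For readers unfamiliar with dictionary-based features, the following is a minimal sketch of the general idea, not the actual contents of dictionary_analysis.py. The word lists, the function name `dictionary_features`, and the feature names are all hypothetical and chosen only for illustration; a real analysis would load a full financial sentiment dictionary instead of these tiny sets.

```python
import re
from collections import Counter

# Hypothetical mini word lists (assumption): stand-ins for a full
# financial sentiment dictionary, which this sketch does not include.
POSITIVE = {"growth", "profit", "strong", "improve"}
NEGATIVE = {"loss", "decline", "weak", "risk"}


def dictionary_features(text: str) -> dict:
    """Count dictionary hits in a text and turn them into simple ratios."""
    # Lowercase and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)

    pos = sum(counts[w] for w in POSITIVE)
    neg = sum(counts[w] for w in NEGATIVE)
    total = len(tokens)

    return {
        "n_tokens": total,
        # Share of all tokens that are positive / negative words.
        "positive_ratio": pos / total if total else 0.0,
        "negative_ratio": neg / total if total else 0.0,
        # Net tone in [-1, 1]; 0.0 when no dictionary word is found.
        "net_tone": (pos - neg) / (pos + neg) if (pos + neg) else 0.0,
    }


if __name__ == "__main__":
    sample = "Strong revenue growth offset the loss in the retail segment."
    print(dictionary_features(sample))
```

Ratios are normalized by document length so that long financial reports and shorter earnings transcripts remain comparable under this sketch.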