
dice-group/DBpedia-Chatlog-Analysis


DBpedia-Chatlog-Analysis

Discourse analysis for the DBpedia chatbot: http://chat.dbpedia.org/


Description of notebooks:

  1. data_exploration.ipynb contains code for grouping chats by user_id and for preliminary analysis, such as finding the average conversation length and the number of users.

  2. analysis.ipynb examines:

    • the most used channel (web, Slack, or Facebook Messenger)
    • the number of failed responses per conversation and the number of questions whose answers did not satisfy users
    • the conversation length after negative feedback
    • the character length of user requests
    • commonly asked topics, found via named entity recognition (NER)
    • whether coreferences exist
    • the language of user requests
  3. Use dependency_parsing.ipynb to estimate the number of complex questions asked and to prepare the input (candidate pairs) for intent clustering.

  4. The clustering folder contains two implementations (KMeans and HDBSCAN) for finding latent intents in utterance representations. Use get_sentence_embeddings.ipynb, preferably on Google Colab, to fetch the sentence embeddings used to cluster user requests by their semantics.
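
As a rough illustration of the preliminary statistics listed for data_exploration.ipynb and analysis.ipynb, the sketch below groups chat records by user_id and derives a few summary numbers. It is not the notebooks' code: the record fields (`user_id`, `channel`, `text`), the sample data, and the helper names `group_by_user` and `summarize` are all hypothetical, and it assumes that all messages from one user_id form a single conversation.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical chat log records; the real log schema may differ.
chat_log = [
    {"user_id": "u1", "channel": "web", "text": "Who is Albert Einstein?"},
    {"user_id": "u1", "channel": "web", "text": "Where was he born?"},
    {"user_id": "u2", "channel": "slack", "text": "What is DBpedia?"},
    {"user_id": "u3", "channel": "web", "text": "hi"},
]


def group_by_user(records):
    """Collect each user's messages, keeping the original order."""
    conversations = defaultdict(list)
    for record in records:
        conversations[record["user_id"]].append(record)
    return dict(conversations)


def summarize(records):
    """Compute a few preliminary statistics over the chat log."""
    conversations = group_by_user(records)
    channel_counts = Counter(record["channel"] for record in records)
    return {
        # Assumption: one user_id corresponds to one conversation.
        "num_users": len(conversations),
        "avg_conversation_length": mean(
            len(messages) for messages in conversations.values()
        ),
        "most_used_channel": channel_counts.most_common(1)[0][0],
        "avg_request_chars": mean(len(record["text"]) for record in records),
    }


summary = summarize(chat_log)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Only the standard library is used here so the example runs anywhere; on the full chat log, a pandas `groupby` on user_id would likely be the more convenient choice.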