multimodel
Here are 30 public repositories matching this topic...
This project provides a web-based interface to generate test instructions using Visual Question Answering (VQA). Upload screenshots or images, provide an optional context, and let the model do the rest!
Updated Sep 11, 2024 - Python
End-to-End AI Voice Assistant pipeline with Whisper for Speech-to-Text, Hugging Face LLM for response generation, and Edge-TTS for Text-to-Speech. Features include Voice Activity Detection (VAD), tunable parameters for pitch, gender, and speed, and real-time response with latency optimization.
Updated Sep 2, 2024 - Jupyter Notebook
VyomAI: state-of-the-art NLP, LLM, Vision, and MultiModel transformer implementations in PyTorch
Updated Aug 24, 2024 - Python
Simplify time-consuming coding for the data scientist. Create beautiful charts and pandas transformers, and find the best model with the best parameters for your data.
Updated Aug 20, 2024 - Jupyter Notebook
Awesome_Multimodel is a curated GitHub repository that provides a comprehensive collection of resources for Multimodal Large Language Models (MLLM). It covers datasets, tuning techniques, in-context learning, visual reasoning, foundational models, and more. Stay updated with the latest advancements.
Updated Aug 18, 2024
Implementation of multi-vector retrieval RAG for text summarization and question-answering tasks
Updated Aug 5, 2024 - Jupyter Notebook
Updated Jul 24, 2024 - Jupyter Notebook
Advanced multilingual invoice reader powered by Google Gemini Pro multimodal AI
Updated May 24, 2024 - Python
🧘🏻‍♂️ KarmaVLM (相生): A family of high-efficiency, powerful visual language models.
Updated Apr 29, 2024 - Python
Arogya-Shree-Medi-Bot
Updated Apr 8, 2024 - Python
This project combines multiple models into a single multimodal system that accepts audio, images, and text as inputs and generates corresponding audio, image, and text outputs.
Updated Feb 26, 2024 - Python
ArangoGraph is the easiest way to run ArangoDB. Available on AWS, Google Cloud & Azure.
Updated Feb 26, 2024
📄 SemEval 2024 Task 8: Artificial Intelligence Text Detection System using Natural Language Processing and Neural Network techniques.
Updated Feb 17, 2024 - Jupyter Notebook
YOLOv5, YOLOv8, segmentation, face, pose, and keypoints on DeepStream
Updated Dec 12, 2023 - Jupyter Notebook
Large Model Zoo: a collection of various large-scale models, including CV, NLP, multiModel, etc.
Updated Jul 8, 2023
RMDL: Random Multimodel Deep Learning for Classification
Updated May 16, 2023 - Python
🐝 Research on the Control and Prediction of the North American Bumblebee Using Multimodal Algorithms.
Updated Apr 27, 2023 - Python