A web-based application that utilizes artificial intelligence to generate summaries of PDF and Word documents, as well as create interactive quizzes based on their content. This application operates completely offline, utilizing Hugging Face Transformers models.
- Upload PDF and Word documents
- Extract text from documents
- Generate AI-powered summaries using Hugging Face Transformers
- Create interactive quizzes based on document content
- Clean, responsive user interface
- No API keys or internet connection required after initial setup
- Customizable quiz settings (Coming soon!!)
- Option to view answers
- Option to download summary
- Python 3.7 or higher
- Sufficient disk space for model files (approximately 1.5GB)
- Clone or download this repository
- Install the required dependencies:
pip install -r requirements.txt
- The first time you run the application, it will automatically download the necessary model files from Hugging Face. After that, the application will work offline.
- Start the Flask server:
python app.py
- Open your web browser and navigate to:
http://127.0.0.1:5000
- On the homepage, click the "Choose File" button to select a PDF or Word document
- Click "Upload & Process" to submit the document
- Wait for the AI to process your document (this may take a few moments)
- View the summary tab to see a concise overview of your document
- Switch to the quiz tab to test your knowledge with AI-generated questions
- Use the "Show/Hide Answers" button to toggle answer visibility
- Flask: Web framework
- PyPDF2: PDF text extraction
- python-docx: Word document text extraction
- Hugging Face Transformers: AI-powered summarization and text processing
- PyTorch: Deep learning framework for running the AI models
- Bootstrap: Frontend styling
- Markdown: For formatting the summary
The application uses the facebook/bart-large-cnn
model from Hugging Face, which is specifically trained for text summarization. For longer documents, the text is split into chunks and each chunk is summarized separately, then combined.
The application uses the facebook/bart-large-cnn
model from Hugging Face, which is specifically trained for text summarization. For longer documents, the text is split into chunks and each chunk is summarized separately, then combined.
The application uses a custom algorithm to:
- Extract important sentences from the document
- Identify keywords to create fill-in-the-blank style questions
- Generate plausible distractors (wrong answers) from other parts of the text
- Format everything into a clean multiple-choice quiz
- No API costs or rate limits
- Works offline without internet connection after initial model download
- More privacy since your documents aren't sent to external servers
- Fully customizable - you can swap in different models if needed
- Initial download of models may take some time depending on your internet connection
- Processing very large documents may be slower compared to cloud-based solutions
- Requires more system resources (RAM and disk space) than API-based alternatives
- Support for more document formats
- Custom quiz generation options
- Document comparison functionality
- Save and export summaries and quizzes
- Option to use smaller, faster models for resource-constrained systems
This project is licensed under the MIT License - see the LICENSE file for details.
- Hugging Face for providing the
facebook/bart-large-cnn
model