research-paper-popularity-tracker-aws

Research Paper Popularity Tracker

Project Overview

The Research Paper Popularity Tracker tracks the popularity of research papers in real time. A Python scraper extracts key details such as the paper’s title, author name(s), and abstract from various research sources. These details are retrieved on demand through an HTTP API on AWS API Gateway, which triggers AWS Lambda functions that run the backend logic. Scraping is done with BeautifulSoup, and because the page to scrape is supplied with each request, the same function can gather data from multiple sources. The frontend is built with Streamlit and gives users an interface for visualizing a paper’s popularity trends and tracking metrics such as citation count, views, and downloads. Because the architecture is serverless, the backend handles real-time requests and scales with request volume without dedicated servers to manage.

Technology Stack

Backend: Python, AWS Lambda

Frontend: Streamlit

Cloud Services: AWS Lambda, API Gateway

API Integration: arXiv

Architecture

Features

Installation

Setup Instructions

#### 1. Clone the repository:

git clone https://github.com/chinholla/research-paper-popularity-tracker-aws.git

#### 2. Check API access: The public arXiv API can be queried without signing up or supplying an API key, so no credentials are needed for arXiv itself.

#### 3. Deploy the Python code to AWS Lambda: Package your Python code and deploy it using the AWS CLI or the AWS Console.

#### 4. Set up AWS API Gateway:

#### 5. Run Streamlit for the Frontend:

streamlit run app.py

Usage

Track Research Papers:

Interface:

Search Functionality:

Screenshots

Lambda Function:

Lambda Function - Backend Logic

The AWS Lambda function is the core backend of the Research Paper Popularity Tracker. It receives each request forwarded by API Gateway, runs the Python scraping logic, and returns the extracted paper data.
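A minimal sketch of what such a handler might look like, assuming the paper URL arrives as a `url` query-string parameter. The parameter name, the `scrape_paper` placeholder, and the status codes are illustrative assumptions, not the project's actual code.

```python
import json


def scrape_paper(url):
    """Hypothetical placeholder for the scraping step; should return a dict of paper details."""
    raise NotImplementedError("plug in the real scraping logic here")


def _response(status, payload):
    # Response shape that API Gateway turns into an HTTP response.
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event, context, scraper=scrape_paper):
    # API Gateway passes query-string values under "queryStringParameters";
    # the key may be missing or None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    url = params.get("url")  # assumed parameter name
    if not url:
        return _response(400, {"error": "missing 'url' query parameter"})
    try:
        details = scraper(url)
    except Exception as exc:
        # Report scraping failures to the caller instead of crashing the function.
        return _response(502, {"error": f"scraping failed: {exc}"})
    return _response(200, details)
```

The `scraper` argument exists only so the handler can be exercised locally with a stub, for example `lambda_handler({"queryStringParameters": {"url": "https://example.org"}}, None, scraper=lambda u: {"title": "T"})`. Lambda itself calls the handler with just `event` and `context`.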

Web Scraping with BeautifulSoup

BeautifulSoup is used to scrape relevant details from the research paper’s webpage, such as the paper’s title, author name(s), and abstract.

These details are gathered from the website URL provided in the request, making the backend function dynamic and scalable for multiple research sources.
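The extraction step could be sketched roughly as follows. This assumes the page exposes its metadata in `citation_title`, `citation_author`, and `citation_abstract` meta tags, which many scholarly sites do; the tag names, the `extract_paper_details` helper, and the sample HTML are assumptions for illustration, and a real source may need different selectors.

```python
from bs4 import BeautifulSoup


def extract_paper_details(html):
    """Pull title, authors, and abstract out of a paper page's meta tags."""
    soup = BeautifulSoup(html, "html.parser")

    def meta(name):
        # Collect the content of every <meta name="..."> tag with this name.
        tags = soup.find_all("meta", attrs={"name": name})
        return [tag["content"].strip() for tag in tags if tag.get("content")]

    titles = meta("citation_title")
    abstracts = meta("citation_abstract")
    return {
        "title": titles[0] if titles else None,
        "authors": meta("citation_author"),
        "abstract": abstracts[0] if abstracts else None,
    }


# Made-up sample page, used instead of a live request.
SAMPLE_HTML = """
<html><head>
<meta name="citation_title" content="An Example Paper">
<meta name="citation_author" content="Doe, Jane">
<meta name="citation_author" content="Roe, Richard">
<meta name="citation_abstract" content="A short abstract.">
</head><body></body></html>
"""

print(extract_paper_details(SAMPLE_HTML))
```

Returning `None` or an empty list for missing fields keeps the function usable across sources that expose only part of this metadata.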

API Gateway - HTTP API

Overall Backend Workflow:

The backend processes are neatly tied together: the API Gateway handles HTTP requests, AWS Lambda triggers the scraping function written in Python, and the scraped data is then processed and returned. The GET API method ensures that only data retrieval operations occur, maintaining simplicity and speed.
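From the client side, the workflow above amounts to a single GET request. The sketch below shows one way a caller such as the frontend might build and send it; the endpoint URL, the `url` parameter name, and both helper functions are hypothetical and should be replaced with the values from your own API Gateway deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint; use the invoke URL of your own HTTP API.
API_BASE = "https://example.execute-api.us-east-1.amazonaws.com/papers"


def build_request_url(paper_url, api_base=API_BASE):
    # Percent-encode the paper URL so it travels safely as a query parameter.
    return f"{api_base}?{urlencode({'url': paper_url})}"


def fetch_paper_details(paper_url, api_base=API_BASE, timeout=10):
    # Send the GET request and decode the JSON body returned by the backend.
    with urlopen(build_request_url(paper_url, api_base), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


print(build_request_url("https://arxiv.org/abs/0000.00000"))
```

Only `build_request_url` runs here; `fetch_paper_details` needs a deployed endpoint to return anything.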

Code Overview - app.py

Authors