29 changes: 29 additions & 0 deletions docs/README.md
@@ -0,0 +1,29 @@
# AIHawk - Technical Documentation

Welcome to the developer documentation for **Jobs_Applier_AI_Agent_AIHawk**. This documentation is designed to help developers understand the architecture, core modules, and workflows of the application.

## 📚 Table of Contents

### [Architecture & Flows](architecture/overview.md)
- **[System Overview](architecture/overview.md)**: High-level architecture, tech stack, and component interactions.
- **[Application Flows](architecture/flows.md)**: Visual diagrams (Mermaid) of startup, resume generation, and parsing flows.

### [Module Documentation](modules/core_logic.md)
- **[Core Logic](modules/core_logic.md)**: Entry points, orchestration, and main application logic.
- **[LLM Integration](modules/llm_integration.md)**: AI model adapters, prompt engineering, and LLM management.
- **[Data Models](modules/data_models.md)**: Resume structures, profile schemas, and data validation.
- **[Configuration](modules/configuration.md)**: Configuration handling, validation, and secrets management.
- **[Utilities](modules/utils.md)**: Shared utility functions and helpers.

## 🚀 Quick Start

Ensure you have the required dependencies and configuration files set up as per the main project [README](../README.md).

```bash
# Run the application
python main.py
```

## 🤝 Contribution

Please refer to [CONTRIBUTING.md](../CONTRIBUTING.md) for guidelines on how to contribute to this project.
98 changes: 98 additions & 0 deletions docs/architecture/flows.md
@@ -0,0 +1,98 @@
# Application Flows

This document details the key workflows within the AIHawk application using Mermaid diagrams.

## 1. App Startup Flow

The application initialization process ensures all configurations and dependencies are ready before user interaction.

```mermaid
graph TD
Start([Start main.py]) --> ValidateData[Validate Data Folder & Files]
ValidateData -->|Check| Secrets[secrets.yaml]
ValidateData -->|Check| Config[config.yaml]
ValidateData -->|Check| Resume[plain_text_resume.yaml]

Secrets -->|Validate| LoadSecrets[Load API Keys]
Config -->|Validate| LoadConfig[Load User Preferences]

LoadSecrets --> PromptUser[Prompt User for Action]
LoadConfig --> PromptUser

PromptUser -->|Select Action| HandleInquiries[Handle Inquiries]
```

## 2. Resume Parsing & Tailoring Flow

How the system takes a specific job URL and tailors a resume for it.

```mermaid
sequenceDiagram
participant User
participant Facade as ResumeFacade
participant Browser as Selenium Browser
participant Parser as LLMJobParser
participant LLM as LLM Service
participant Generator as ResumeGenerator

User->>Facade: Select "Tailor Resume"
Facade->>User: Request Job URL
User->>Facade: Provide URL

Facade->>Browser: Navigate to Job URL
Browser->>Facade: Return Page HTML

Facade->>Parser: Parse HTML
Parser->>LLM: Extract Role, Company, Description
LLM-->>Parser: Structured Job Data

Facade->>Generator: Generate Tailored Resume
Generator->>LLM: Compare Resume vs Job Desc
LLM-->>Generator: Contextual Suggestions

Generator->>Browser: Render HTML Template
Browser->>Facade: Return PDF Bytes
Facade->>User: Save PDF to Output
```

## 3. General Resume Generation Flow

Generating a generic resume without specific job tailoring.

```mermaid
graph TD
User["User Input"] -->|Select Style| StyleManager["Style Manager"]
StyleManager -->|Template Path| Generator["Resume Generator"]

subgraph Generation Process
ResumeData["Load Resume Data"] -->|Inject| Generator
Generator -->|Render| HTML["HTML Resume"]
HTML -->|Convert| PDF["PDF Generator (Selenium)"]
end

PDF --> Output["Output Folder"]
```

## 4. LLM Request Lifecycle

How the system handles requests to the Large Language Model, including logging and error handling.

```mermaid
graph LR
Request[App Request] --> Adapter[AI Adapter]
Adapter -->|Select Provider| ModelFactory{Provider?}

ModelFactory -->|OpenAI| OpenAI[OpenAI Model]
ModelFactory -->|Claude| Claude[Claude Model]
ModelFactory -->|Ollama| Ollama[Ollama Model]

OpenAI --> API[External API]
Claude --> API
Ollama --> Local[Local Inference]

API -->|Response| Logger[LLM Logger]
Local -->|Response| Logger

Logger -->|Log Token Usage| LogFile[open_ai_calls.json]
Logger -->|Return Content| App[Application Logic]
```
66 changes: 66 additions & 0 deletions docs/architecture/overview.md
@@ -0,0 +1,66 @@
# System Architecture Overview

## Introduction
**Jobs_Applier_AI_Agent_AIHawk** is an automated tool designed to streamline the job application process. It leverages Large Language Models (LLMs) to parse job descriptions, tailor resumes and cover letters, and automate interactions via a web browser.

## High-Level Architecture

The system operates on a modular architecture where the **Core Controller** orchestrates interactions between the **User**, **LLM Service**, **Browser Automation**, and **Data Layer**.

```mermaid
graph TD
User[User] -->|Config & Commands| CLI["CLI Entry Point (main.py)"]

subgraph Core Application
CLI --> Facade[ResumeFacade]
Facade --> Generator[ResumeGenerator]
Facade --> Parser[LLMJobParser]
end

subgraph Services
Facade -->|Controls| Browser["Selenium / Chrome Driver"]
Parser -->|Queries| LLM["LLM Manager (OpenAI/Claude/Ollama)"]
Generator -->|Queries| LLM
end

subgraph Data Layer
CLI -->|Reads| ConfigFiles["YAML Config & Secrets"]
Facade -->|Reads| ResumeData["Plain Text Resume"]
Facade -->|Writes| Output["PDF Output"]
end
```

## Core Components

### 1. Entry Point & Configuration (`main.py`)
- **Responsibilities**:
- Handles user input via CLI.
- Validates configuration (`secrets.yaml`, `config.yaml`).
- Initializes the application environment.
- **Key Classes**: `ConfigValidator`, `FileManager`.

### 2. Logic Orchestration (`src/libs/resume_and_cover_builder/resume_facade.py`)
- **Responsibilities**:
- Acts as the central hub connecting the UI (CLI) with backend logic.
- Manages the flow of parsing job descriptions and generating documents.
- **Key Classes**: `ResumeFacade`.

### 3. LLM Integration (`src/libs/llm_manager.py`)
- **Responsibilities**:
- Abstracts interactions with various AI providers (OpenAI, Claude, Ollama, Gemini, etc.).
- Manages prompt templates and chains for specific tasks (e.g., summarizing skills, generating cover letters).
- **Key Classes**: `GPTAnswerer`, `AIAdapter`.

### 4. Resume Generation (`src/libs/resume_and_cover_builder/resume_generator.py`)
- **Responsibilities**:
- Fills HTML templates with tailored content.
- Converts HTML to PDF.
- **Key Classes**: `ResumeGenerator`.
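
As a rough illustration of how these components fit together, the sketch below drives the facade by hand. The method names come from the module documentation; the constructor, parameters, and return values are assumptions, not the project's actual signatures.

```python
# Hedged end-to-end sketch of the orchestration shown in the diagram above.
# The facade's return values and exact signatures are assumptions.
from pathlib import Path

def tailor_resume_for_job(facade, job_url: str, output_dir: Path) -> Path:
    """Drive the facade through the documented steps: scrape, parse, generate, save."""
    facade.link_to_job(job_url)                           # navigate, extract HTML, LLM-parse the posting
    pdf_bytes = facade.create_resume_pdf_job_tailored()   # style + HTML render + PDF conversion
    output_path = output_dir / "tailored_resume.pdf"
    output_path.write_bytes(pdf_bytes)                    # persist the rendered PDF
    return output_path
```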

## Tech Stack

- **Language**: Python 3.10+
- **Browser Automation**: Selenium WebDriver, ChromeDriverManager
- **LLM Orchestration**: LangChain
- **Configuration**: YAML
- **Data Validation**: Pydantic, Dataclasses
29 changes: 29 additions & 0 deletions docs/modules/configuration.md
@@ -0,0 +1,29 @@
# Configuration & Validation

The application relies on three main YAML configuration files located in the `data_folder`.

## Configuration Files

1. **`secrets.yaml`**: Stores sensitive API keys (e.g., `llm_api_key`).
2. **`config.yaml`**: General settings like `remote`, `experience_level`, `locations`, `blacklists`.
3. **`plain_text_resume.yaml`**: The user's resume data in YAML format.

## Config Validator (`main.py`)

The `ConfigValidator` class ensures that `config.yaml` contains valid settings before the app runs; a simplified sketch of these checks follows the rules below.

### Validation Rules
- **Required Keys**: Checks for existence of keys like `positions`, `locations`, `distance`.
- **Type Checking**: Ensures values are of correct types (list, bool, int).
- **Enums**: Validates against allowed values:
- `EXPERIENCE_LEVELS`: internship, entry, associate, etc.
- `JOB_TYPES`: full_time, contract, part_time, etc.
- `DATE_FILTERS`: all_time, month, week, 24_hours.
- `APPROVED_DISTANCES`: 0, 5, 10, 25, 50, 100.
- **Email Validation**: Regex checking for email formats.
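
The sketch below mirrors these rules in trimmed-down form. It is illustrative only: key names and allowed values follow the list above, but the real `ConfigValidator` in `main.py` checks more fields and reports errors differently.

```python
# Hypothetical, trimmed-down version of the checks described above.
import re
from pathlib import Path

import yaml

APPROVED_DISTANCES = {0, 5, 10, 25, 50, 100}
EMAIL_REGEX = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_config(config_path: Path) -> dict:
    """Load config.yaml and raise a descriptive error if a rule is violated."""
    config = yaml.safe_load(config_path.read_text())

    # Required keys
    for key in ("positions", "locations", "distance"):
        if key not in config:
            raise ValueError(f"Missing required key '{key}' in {config_path}")

    # Type checks
    if not isinstance(config["positions"], list):
        raise TypeError("'positions' must be a list")
    if not isinstance(config.get("remote", False), bool):
        raise TypeError("'remote' must be a boolean")

    # Allowed values
    if config["distance"] not in APPROVED_DISTANCES:
        raise ValueError(f"'distance' must be one of {sorted(APPROVED_DISTANCES)}")

    # Email format check (the exact key is illustrative), if present
    email = config.get("email")
    if email and not EMAIL_REGEX.match(email):
        raise ValueError(f"Invalid email address: {email}")

    return config
```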

## File Manager (`main.py`)

The `FileManager` class handles filesystem interactions; a short sketch follows the points below.
- **`validate_data_folder`**: Ensures `data_folder` exists and contains all required YAML files.
- Creates the `output` directory if it doesn't exist.
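
A minimal sketch of that folder check, with file names taken from the list of configuration files above; the real implementation in `main.py` may return different values or raise different exception types.

```python
# Hedged sketch of FileManager.validate_data_folder.
from pathlib import Path

REQUIRED_FILES = ("secrets.yaml", "config.yaml", "plain_text_resume.yaml")

def validate_data_folder(data_folder: Path) -> Path:
    """Ensure the data folder and required YAML files exist; return the output dir."""
    if not data_folder.is_dir():
        raise FileNotFoundError(f"Data folder not found: {data_folder}")
    missing = [name for name in REQUIRED_FILES if not (data_folder / name).exists()]
    if missing:
        raise FileNotFoundError(f"Missing required files: {', '.join(missing)}")
    output_dir = data_folder / "output"
    output_dir.mkdir(exist_ok=True)  # created on demand, as documented above
    return output_dir
```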
41 changes: 41 additions & 0 deletions docs/modules/core_logic.md
@@ -0,0 +1,41 @@
# Core Logic & Entry Point

## Main Application Entry (`main.py`)

The `main.py` file serves as the CLI entry point for the application.

### Key Functions

- **`main()`**: The primary execution function.
- Initializes `FileManager` to validate data directories.
- Calls `ConfigValidator` to ensure all YAML configs are correct.
- Invokes `prompt_user_action()` to determine the user's intent.
- Delegates execution to `handle_inquiries()`.

- **`handle_inquiries(selected_actions, parameters, llm_api_key)`**:
- Routes the user's selection to the appropriate `create_*` function.
- Supports: "Generate Resume", "Generate Tailored Resume", "Generate Cover Letter".

- **`prompt_user_action()`**:
- Uses the `inquirer` library to present an interactive CLI selection menu (a condensed sketch of this flow follows below).
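
The menu choices in the sketch follow the list above; the `create_*` handlers are placeholders for the project's real generation functions, and the actual functions in `main.py` carry more parameters and error handling.

```python
# Hedged sketch of prompt_user_action() and handle_inquiries().
import inquirer

def create_resume(parameters: dict, llm_api_key: str) -> None: ...           # placeholder
def create_resume_tailored(parameters: dict, llm_api_key: str) -> None: ...  # placeholder
def create_cover_letter(parameters: dict, llm_api_key: str) -> None: ...     # placeholder

def prompt_user_action() -> str:
    """Show an interactive menu and return the selected action."""
    questions = [
        inquirer.List(
            "action",
            message="What would you like to do?",
            choices=["Generate Resume", "Generate Tailored Resume", "Generate Cover Letter"],
        )
    ]
    answers = inquirer.prompt(questions)
    return answers["action"] if answers else ""

def handle_inquiries(selected_action: str, parameters: dict, llm_api_key: str) -> None:
    """Route the selection to the matching create_* handler."""
    handlers = {
        "Generate Resume": create_resume,
        "Generate Tailored Resume": create_resume_tailored,
        "Generate Cover Letter": create_cover_letter,
    }
    handler = handlers.get(selected_action)
    if handler is None:
        raise ValueError(f"Unknown action: {selected_action}")
    handler(parameters, llm_api_key)
```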

## Resume Facade (`src/libs/resume_and_cover_builder/resume_facade.py`)

The `ResumeFacade` class implements the Facade pattern to simplify the interface for resume generation operations.

### Responsibilities
- **Initialization**: Sets up the environment, including API keys, style paths, and log output.
- **Job Parsing**: Coordinates with `LLMJobParser` to extract structured data from a raw job URL.
- **Browser Control**: Manages the Selenium driver instance for scraping and PDF generation.

### Key Methods

- **`create_resume_pdf_job_tailored()`**:
- Fetches the selected style.
- Generates HTML using `ResumeGenerator`.
- Converts HTML to PDF via `HTML_to_PDF` utility.

- **`link_to_job(job_url)`**:
- Navigates the browser to the provided URL.
- Extracts the HTML body.
- Initializes `LLMJobParser` to interpret the page content.
33 changes: 33 additions & 0 deletions docs/modules/data_models.md
@@ -0,0 +1,33 @@
# Data Models & Schemas

The application uses rigorous data validation to ensure that resume data and job application profiles are well-structured.

## Resume Schema (`src/resume_schemas/resume.py`)

The `Resume` class is defined using `Pydantic` models, ensuring type safety and validation for user-provided data; a trimmed-down sketch appears after the lists below.

### Key Classes
- **`Resume`**: The root model containing all sections.
- **`PersonalInformation`**: Name, email, phone, location, links.
- **`EducationDetails`**: List of education records.
- **`ExperienceDetails`**: List of work experience records.
- **`Project`**, **`Achievement`**, **`Certifications`**, **`Language`**.

**Validation Features:**
- Email format validation (`EmailStr`).
- URL validation for links (`HttpUrl`).
- `normalize_exam_format`: Helper to handle inconsistent data formats in YAML.
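
The sketch below shows the style of these models. Field names are illustrative; the real schema in `src/resume_schemas/resume.py` defines many more fields and validators.

```python
# Hedged sketch in the Pydantic style described above; not the full schema.
from typing import List, Optional

from pydantic import BaseModel, EmailStr, HttpUrl

class PersonalInformation(BaseModel):
    name: str
    surname: str
    email: EmailStr                     # format-validated email
    phone: Optional[str] = None
    linkedin: Optional[HttpUrl] = None  # URL-validated link

class ExperienceDetails(BaseModel):
    position: str
    company: str
    employment_period: str
    key_responsibilities: List[str] = []

class Resume(BaseModel):
    personal_information: PersonalInformation
    experience_details: List[ExperienceDetails] = []
```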

## Job Application Profile (`src/resume_schemas/job_application_profile.py`)

Defined as a Python `dataclass`, this model holds user preferences and legal/demographic information often required by job boards.

### Sections
- **`SelfIdentification`**: Gender, veteran status, disability, ethnicity.
- **`LegalAuthorization`**: Work authorization status for US, EU, Canada, UK.
- **`WorkPreferences`**: Remote/hybrid preferences, relocation.
- **`SalaryExpectations`**: Desired salary range.
- **`Availability`**: Notice period.

## Data Loading
The `Resume` and `JobApplicationProfile` classes both include `__init__` methods that accept a YAML string, parsing it into the object structure and raising detailed errors (`ValueError`, `TypeError`) if the input validation fails.
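
Hypothetical usage, assuming the YAML-string constructor described above (the import path mirrors the documented file location and may differ depending on packaging):

```python
# Assumes Resume(yaml_string) as described above; paths are illustrative.
from pathlib import Path

from src.resume_schemas.resume import Resume

yaml_str = Path("data_folder/plain_text_resume.yaml").read_text(encoding="utf-8")
try:
    resume = Resume(yaml_str)
except (ValueError, TypeError) as exc:
    print(f"Invalid resume data: {exc}")
```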
38 changes: 38 additions & 0 deletions docs/modules/llm_integration.md
@@ -0,0 +1,38 @@
# LLM Integration

The application relies heavily on Large Language Models (LLMs) for understanding job descriptions and generating human-like text for resumes and cover letters.

## LLM Manager (`src/libs/llm_manager.py`)

This module handles the abstraction layer for different AI providers.

### AI Model Adapter Pattern

The `AIAdapter` class acts as a factory, instantiating the correct model class based on the configuration (`LLM_MODEL_TYPE`); a simplified sketch follows the provider list below.

Supported Providers:
- **OpenAI** (`OpenAIModel`)
- **Claude** (`ClaudeModel`)
- **Ollama** (`OllamaModel`) - for local inference
- **Gemini** (`GeminiModel`)
- **HuggingFace** (`HuggingFaceModel`)
- **Perplexity** (`PerplexityModel`)
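
The sketch below captures the factory idea in miniature. Only two providers are shown and the constructor arguments are assumptions; the real `AIAdapter` wraps LangChain chat models for every provider listed above.

```python
# Hedged sketch of the adapter/factory pattern; illustrative names only.
from abc import ABC, abstractmethod

class AIModel(ABC):
    @abstractmethod
    def invoke(self, prompt: str) -> str: ...

class OpenAIModel(AIModel):
    def __init__(self, api_key: str, model_name: str):
        self.api_key, self.model_name = api_key, model_name
    def invoke(self, prompt: str) -> str:
        raise NotImplementedError("would call the OpenAI chat API here")

class OllamaModel(AIModel):
    def __init__(self, api_key: str, model_name: str):
        self.model_name = model_name  # local inference, no API key required
    def invoke(self, prompt: str) -> str:
        raise NotImplementedError("would call the local Ollama server here")

class AIAdapter:
    """Factory that picks a model class based on LLM_MODEL_TYPE."""
    _registry = {"openai": OpenAIModel, "ollama": OllamaModel}

    def __init__(self, llm_model_type: str, api_key: str, model_name: str):
        try:
            self.model = self._registry[llm_model_type](api_key, model_name)
        except KeyError:
            raise ValueError(f"Unsupported LLM provider: {llm_model_type}")

    def invoke(self, prompt: str) -> str:
        return self.model.invoke(prompt)
```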

### GPTAnswerer

The `GPTAnswerer` class is a high-level service that uses the configured LLM to answer specific questions related to the resume or job application.

**Key Features:**
- **`answer_question_textual_wide_range`**: Determines which section of the resume (e.g., Experience, Education) is relevant to a question and uses an appropriate prompt chain to generate an answer.
- **`is_job_suitable`**: Analyzes the job description against the resume to calculate a suitability score.
- **`summarize_job_description`**: Compresses long job descriptions into concise summaries.

### Logging (`LLMLogger`)
All LLM requests and responses are logged to `open_ai_calls.json` for debugging and cost tracking. It captures:
- Model Name
- Token Usage (Input/Output/Total)
- Estimated Cost
- Prompts and Replies

## Prompt Engineering
Prompts are stored in `src/libs/llm/prompts.py` (referenced in `llm_manager.py`). The application uses `LangChain` templates to structure these prompts dynamically with input variables like `{resume_section}` or `{job_description}`.
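
For illustration, a LangChain template in the style the docs describe might look like this; the template text is invented, not copied from the project.

```python
# Illustrative LangChain prompt template using the documented input variables.
from langchain_core.prompts import PromptTemplate

tailor_prompt = PromptTemplate.from_template(
    "You are an expert resume writer.\n\n"
    "Resume section:\n{resume_section}\n\n"
    "Job description:\n{job_description}\n\n"
    "Rewrite the section so it emphasizes the most relevant experience."
)

# Filling the variables yields the final prompt string sent to the model.
prompt_text = tailor_prompt.format(
    resume_section="Senior backend engineer, 5 years of Python...",
    job_description="We are hiring a Python developer for...",
)
print(prompt_text)
```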
17 changes: 17 additions & 0 deletions docs/modules/utils.md
@@ -0,0 +1,17 @@
# Utilities and Helpers

Common utility functions used throughout the application.

## Chrome Utils (`src/utils/chrome_utils.py`)

Handles interactions with the Chrome browser via Selenium.

- **`init_browser()`**: Initializes a Selenium Chrome driver with specific options (headless mode, user-agent spoofing, window size).
- **`HTML_to_PDF(html_content, driver)`**: Uses the browser's print-to-PDF capability to convert a rendered HTML string into a PDF byte stream.
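
A hedged sketch of that conversion using Chrome's DevTools print command; the real helper may use different driver options and error handling.

```python
# Hedged sketch of HTML_to_PDF; assumes a Chrome/Chromium WebDriver instance.
import base64
from urllib.parse import quote

from selenium import webdriver

def html_to_pdf(html_content: str, driver: webdriver.Chrome) -> bytes:
    """Render an HTML string in Chrome and return the printed PDF as raw bytes."""
    # Load the HTML via a data URL so no temporary file is needed.
    driver.get("data:text/html;charset=utf-8," + quote(html_content))
    result = driver.execute_cdp_cmd("Page.printToPDF", {"printBackground": True})
    return base64.b64decode(result["data"])
```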

## Constants (`src/utils/constants.py`)

Central repository for string constants used in prompts and configuration keys.
- **LLM Command Keys**: `PERSONAL_INFORMATION`, `SELF_IDENTIFICATION`, `EXPERIENCE_DETAILS`, etc.
- **File Names**: `SECRETS_YAML`, `WORK_PREFERENCES_YAML`.
- **Model Aliases**: `OPENAI`, `CLAUDE`, `GEMINI`.