29 changes: 29 additions & 0 deletions docs/README.md
@@ -0,0 +1,29 @@
# AIHawk - Technical Documentation

Welcome to the developer documentation for **Jobs_Applier_AI_Agent_AIHawk**. This documentation is designed to help developers understand the architecture, core modules, and workflows of the application.

## 📚 Table of Contents

### [Architecture & Flows](architecture/overview.md)
- **[System Overview](architecture/overview.md)**: High-level architecture, tech stack, and component interactions.
- **[Application Flows](architecture/flows.md)**: Visual diagrams (Mermaid) of startup, resume generation, and parsing flows.

### [Module Documentation](modules/core_logic.md)
- **[Core Logic](modules/core_logic.md)**: Entry points, orchestration, and main application logic.
- **[LLM Integration](modules/llm_integration.md)**: AI model adapters, prompt engineering, and LLM management.
- **[Data Models](modules/data_models.md)**: Resume structures, profile schemas, and data validation.
- **[Configuration](modules/configuration.md)**: Configuration handling, validation, and secrets management.
- **[Utilities](modules/utils.md)**: Shared utility functions and helpers.

## 🚀 Quick Start

Ensure you have the required dependencies and configuration files set up as per the main project [README](../README.md).

```bash
# Run the application
python main.py
```

## 🤝 Contribution

Please refer to [CONTRIBUTING.md](../CONTRIBUTING.md) for guidelines on how to contribute to this project.
98 changes: 98 additions & 0 deletions docs/architecture/flows.md
@@ -0,0 +1,98 @@
# Application Flows

This document details the key workflows within the AIHawk application using Mermaid diagrams.

## 1. App Startup Flow

The application initialization process ensures all configurations and dependencies are ready before user interaction.

```mermaid
graph TD
Start([Start main.py]) --> ValidateData[Validate Data Folder & Files]
ValidateData -->|Check| Secrets[secrets.yaml]
ValidateData -->|Check| Config[config.yaml]
ValidateData -->|Check| Resume[plain_text_resume.yaml]

Secrets -->|Validate| LoadSecrets[Load API Keys]
Config -->|Validate| LoadConfig[Load User Preferences]

LoadSecrets --> PromptUser[Prompt User for Action]
LoadConfig --> PromptUser

PromptUser -->|Select Action| HandleInquiries[Handle Inquiries]
```

## 2. Resume Parsing & Tailoring Flow

How the system takes a specific job URL and tailors a resume for it.

```mermaid
sequenceDiagram
participant User
participant Facade as ResumeFacade
participant Browser as Selenium Browser
participant Parser as LLMJobParser
participant LLM as LLM Service
participant Generator as ResumeGenerator

User->>Facade: Select "Tailor Resume"
Facade->>User: Request Job URL
User->>Facade: Provide URL

Facade->>Browser: Navigate to Job URL
Browser->>Facade: Return Page HTML

Facade->>Parser: Parse HTML
Parser->>LLM: Extract Role, Company, Description
LLM-->>Parser: Structured Job Data

Facade->>Generator: Generate Tailored Resume
Generator->>LLM: Compare Resume vs Job Desc
LLM-->>Generator: Contextual Suggestions

Generator->>Browser: Render HTML Template
Browser->>Facade: Return PDF Bytes
Facade->>User: Save PDF to Output
```

## 3. General Resume Generation Flow

Generating a generic resume without specific job tailoring.

```mermaid
graph TD
User["User Input"] -->|Select Style| StyleManager["Style Manager"]
StyleManager -->|Template Path| Generator["Resume Generator"]

subgraph Generation Process
ResumeData["Load Resume Data"] -->|Inject| Generator
Generator -->|Render| HTML["HTML Resume"]
HTML -->|Convert| PDF["PDF Generator (Selenium)"]
end

PDF --> Output["Output Folder"]
```

## 4. LLM Request Lifecycle

How the system handles requests to the Large Language Model, including logging and error handling.

```mermaid
graph LR
Request[App Request] --> Adapter[AI Adapter]
Adapter -->|Select Provider| ModelFactory{Provider?}

ModelFactory -->|OpenAI| OpenAI[OpenAI Model]
ModelFactory -->|Claude| Claude[Claude Model]
ModelFactory -->|Ollama| Ollama[Ollama Model]

OpenAI --> API[External API]
Claude --> API
Ollama --> Local[Local Inference]

API -->|Response| Logger[LLM Logger]
Local -->|Response| Logger

Logger -->|Log Token Usage| LogFile[open_ai_calls.json]
Logger -->|Return Content| App[Application Logic]
```
66 changes: 66 additions & 0 deletions docs/architecture/overview.md
@@ -0,0 +1,66 @@
# System Architecture Overview

## Introduction
**Jobs_Applier_AI_Agent_AIHawk** is an automated tool designed to streamline the job application process. It leverages Large Language Models (LLMs) to parse job descriptions, tailor resumes and cover letters, and automate interactions via a web browser.

## High-Level Architecture

The system operates on a modular architecture where the **Core Controller** orchestrates interactions between the **User**, **LLM Service**, **Browser Automation**, and **Data Layer**.

```mermaid
graph TD
User[User] -->|Config & Commands| CLI["CLI Entry Point (main.py)"]

subgraph Core Application
CLI --> Facade[ResumeFacade]
Facade --> Generator[ResumeGenerator]
Facade --> Parser[LLMJobParser]
end

subgraph Services
Facade -->|Controls| Browser["Selenium / Chrome Driver"]
Parser -->|Queries| LLM["LLM Manager (OpenAI/Claude/Ollama)"]
Generator -->|Queries| LLM
end

subgraph Data Layer
CLI -->|Reads| ConfigFiles["YAML Config & Secrets"]
Facade -->|Reads| ResumeData["Plain Text Resume"]
Facade -->|Writes| Output["PDF Output"]
end
```

## Core Components

### 1. Entry Point & Configuration (`main.py`)
- **Responsibilities**:
- Handles user input via CLI.
- Validates configuration (`secrets.yaml`, `config.yaml`).
- Initializes the application environment.
- **Key Classes**: `ConfigValidator`, `FileManager`.

### 2. Logic Orchestration (`src/libs/resume_and_cover_builder/resume_facade.py`)
- **Responsibilities**:
- Acts as the central hub connecting the UI (CLI) with backend logic.
- Manages the flow of parsing job descriptions and generating documents.
- **Key Classes**: `ResumeFacade`.

### 3. LLM Integration (`src/libs/llm_manager.py`)
- **Responsibilities**:
- Abstracts interactions with various AI providers (OpenAI, Claude, Ollama, Gemini, etc.).
- Manages prompt templates and chains for specific tasks (e.g., summarizing skills, generating cover letters).
- **Key Classes**: `GPTAnswerer`, `AIAdapter`.

### 4. Resume Generation (`src/libs/resume_and_cover_builder/resume_generator.py`)
- **Responsibilities**:
- Fills HTML templates with tailored content.
- Converts HTML to PDF.
- **Key Classes**: `ResumeGenerator`.
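
As a rough illustration of how these components fit together, the sketch below drives the facade by hand. The method names come from the module documentation; the constructor, parameters, and return values are assumptions, not the project's actual signatures.

```python
# Hedged end-to-end sketch of the orchestration shown in the diagram above.
# The facade's return values and exact signatures are assumptions.
from pathlib import Path

def tailor_resume_for_job(facade, job_url: str, output_dir: Path) -> Path:
    """Drive the facade through the documented steps: scrape, parse, generate, save."""
    facade.link_to_job(job_url)                           # navigate, extract HTML, LLM-parse the posting
    pdf_bytes = facade.create_resume_pdf_job_tailored()   # style + HTML render + PDF conversion
    output_path = output_dir / "tailored_resume.pdf"
    output_path.write_bytes(pdf_bytes)                    # persist the rendered PDF
    return output_path
```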

## Tech Stack

- **Language**: Python 3.10+
- **Browser Automation**: Selenium WebDriver, ChromeDriverManager
- **LLM Orchestration**: LangChain
- **Configuration**: YAML
- **Data Validation**: Pydantic, Dataclasses
29 changes: 29 additions & 0 deletions docs/modules/configuration.md
@@ -0,0 +1,29 @@
# Configuration & Validation

The application relies on three main YAML configuration files located in the `data_folder`.

## Configuration Files

1. **`secrets.yaml`**: Stores sensitive API keys (e.g., `llm_api_key`).
2. **`config.yaml`**: General settings like `remote`, `experience_level`, `locations`, `blacklists`.
3. **`plain_text_resume.yaml`**: The user's resume data in YAML format.

## Config Validator (`main.py`)

The `ConfigValidator` class ensures that `config.yaml` contains valid settings before the app runs; a simplified sketch of these checks follows the rules below.

### Validation Rules
- **Required Keys**: Checks for existence of keys like `positions`, `locations`, `distance`.
- **Type Checking**: Ensures values are of correct types (list, bool, int).
- **Enums**: Validates against allowed values:
- `EXPERIENCE_LEVELS`: internship, entry, associate, etc.
- `JOB_TYPES`: full_time, contract, part_time, etc.
- `DATE_FILTERS`: all_time, month, week, 24_hours.
- `APPROVED_DISTANCES`: 0, 5, 10, 25, 50, 100.
- **Email Validation**: Regex checking for email formats.
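
The sketch below mirrors these rules in trimmed-down form. It is illustrative only: key names and allowed values follow the list above, but the real `ConfigValidator` in `main.py` checks more fields and reports errors differently.

```python
# Hypothetical, trimmed-down version of the checks described above.
import re
from pathlib import Path

import yaml

APPROVED_DISTANCES = {0, 5, 10, 25, 50, 100}
EMAIL_REGEX = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_config(config_path: Path) -> dict:
    """Load config.yaml and raise a descriptive error if a rule is violated."""
    config = yaml.safe_load(config_path.read_text())

    # Required keys
    for key in ("positions", "locations", "distance"):
        if key not in config:
            raise ValueError(f"Missing required key '{key}' in {config_path}")

    # Type checks
    if not isinstance(config["positions"], list):
        raise TypeError("'positions' must be a list")
    if not isinstance(config.get("remote", False), bool):
        raise TypeError("'remote' must be a boolean")

    # Allowed values
    if config["distance"] not in APPROVED_DISTANCES:
        raise ValueError(f"'distance' must be one of {sorted(APPROVED_DISTANCES)}")

    # Email format check (the exact key is illustrative), if present
    email = config.get("email")
    if email and not EMAIL_REGEX.match(email):
        raise ValueError(f"Invalid email address: {email}")

    return config
```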

## File Manager (`main.py`)

The `FileManager` class handles filesystem interactions; a short sketch follows the points below.
- **`validate_data_folder`**: Ensures `data_folder` exists and contains all required YAML files.
- Creates the `output` directory if it doesn't exist.
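
A minimal sketch of that folder check, with file names taken from the list of configuration files above; the real implementation in `main.py` may return different values or raise different exception types.

```python
# Hedged sketch of FileManager.validate_data_folder.
from pathlib import Path

REQUIRED_FILES = ("secrets.yaml", "config.yaml", "plain_text_resume.yaml")

def validate_data_folder(data_folder: Path) -> Path:
    """Ensure the data folder and required YAML files exist; return the output dir."""
    if not data_folder.is_dir():
        raise FileNotFoundError(f"Data folder not found: {data_folder}")
    missing = [name for name in REQUIRED_FILES if not (data_folder / name).exists()]
    if missing:
        raise FileNotFoundError(f"Missing required files: {', '.join(missing)}")
    output_dir = data_folder / "output"
    output_dir.mkdir(exist_ok=True)  # created on demand, as documented above
    return output_dir
```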
41 changes: 41 additions & 0 deletions docs/modules/core_logic.md
@@ -0,0 +1,41 @@
# Core Logic & Entry Point

## Main Application Entry (`main.py`)

The `main.py` file serves as the CLI entry point for the application.

### Key Functions

- **`main()`**: The primary execution function.
- Initializes `FileManager` to validate data directories.
- Calls `ConfigValidator` to ensure all YAML configs are correct.
- Invokes `prompt_user_action()` to determine the user's intent.
- Delegates execution to `handle_inquiries()`.

- **`handle_inquiries(selected_actions, parameters, llm_api_key)`**:
- Routes the user's selection to the appropriate `create_*` function.
- Supports: "Generate Resume", "Generate Tailored Resume", "Generate Cover Letter".

- **`prompt_user_action()`**:
- Uses the `inquirer` library to present an interactive CLI selection menu (a condensed sketch of this flow follows below).
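
The menu choices in the sketch follow the list above; the `create_*` handlers are placeholders for the project's real generation functions, and the actual functions in `main.py` carry more parameters and error handling.

```python
# Hedged sketch of prompt_user_action() and handle_inquiries().
import inquirer

def create_resume(parameters: dict, llm_api_key: str) -> None: ...           # placeholder
def create_resume_tailored(parameters: dict, llm_api_key: str) -> None: ...  # placeholder
def create_cover_letter(parameters: dict, llm_api_key: str) -> None: ...     # placeholder

def prompt_user_action() -> str:
    """Show an interactive menu and return the selected action."""
    questions = [
        inquirer.List(
            "action",
            message="What would you like to do?",
            choices=["Generate Resume", "Generate Tailored Resume", "Generate Cover Letter"],
        )
    ]
    answers = inquirer.prompt(questions)
    return answers["action"] if answers else ""

def handle_inquiries(selected_action: str, parameters: dict, llm_api_key: str) -> None:
    """Route the selection to the matching create_* handler."""
    handlers = {
        "Generate Resume": create_resume,
        "Generate Tailored Resume": create_resume_tailored,
        "Generate Cover Letter": create_cover_letter,
    }
    handler = handlers.get(selected_action)
    if handler is None:
        raise ValueError(f"Unknown action: {selected_action}")
    handler(parameters, llm_api_key)
```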

## Resume Facade (`src/libs/resume_and_cover_builder/resume_facade.py`)

The `ResumeFacade` class implements the Facade pattern to simplify the interface for resume generation operations.

### Responsibilities
- **Initialization**: Sets up the environment, including API keys, style paths, and log output.
- **Job Parsing**: Coordinates with `LLMJobParser` to extract structured data from a raw job URL.
- **Browser Control**: Manages the Selenium driver instance for scraping and PDF generation.

### Key Methods

- **`create_resume_pdf_job_tailored()`**:
- Fetches the selected style.
- Generates HTML using `ResumeGenerator`.
- Converts HTML to PDF via `HTML_to_PDF` utility.

- **`link_to_job(job_url)`**:
- Navigates the browser to the provided URL.
- Extracts the HTML body.
- Initializes `LLMJobParser` to interpret the page content.
33 changes: 33 additions & 0 deletions docs/modules/data_models.md
@@ -0,0 +1,33 @@
# Data Models & Schemas

The application uses rigorous data validation to ensure that resume data and job application profiles are well-structured.

## Resume Schema (`src/resume_schemas/resume.py`)

The `Resume` class is defined using `Pydantic` models, ensuring type safety and validation for user-provided data; a trimmed-down sketch appears after the lists below.

### Key Classes
- **`Resume`**: The root model containing all sections.
- **`PersonalInformation`**: Name, email, phone, location, links.
- **`EducationDetails`**: List of education records.
- **`ExperienceDetails`**: List of work experience records.
- **`Project`**, **`Achievement`**, **`Certifications`**, **`Language`**.

**Validation Features:**
- Email format validation (`EmailStr`).
- URL validation for links (`HttpUrl`).
- `normalize_exam_format`: Helper to handle inconsistent data formats in YAML.
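
The sketch below shows the style of these models. Field names are illustrative; the real schema in `src/resume_schemas/resume.py` defines many more fields and validators.

```python
# Hedged sketch in the Pydantic style described above; not the full schema.
from typing import List, Optional

from pydantic import BaseModel, EmailStr, HttpUrl

class PersonalInformation(BaseModel):
    name: str
    surname: str
    email: EmailStr                     # format-validated email
    phone: Optional[str] = None
    linkedin: Optional[HttpUrl] = None  # URL-validated link

class ExperienceDetails(BaseModel):
    position: str
    company: str
    employment_period: str
    key_responsibilities: List[str] = []

class Resume(BaseModel):
    personal_information: PersonalInformation
    experience_details: List[ExperienceDetails] = []
```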

## Job Application Profile (`src/resume_schemas/job_application_profile.py`)

Defined as a Python `dataclass`, this model holds user preferences and legal/demographic information often required by job boards.

### Sections
- **`SelfIdentification`**: Gender, veteran status, disability, ethnicity.
- **`LegalAuthorization`**: Work authorization status for US, EU, Canada, UK.
- **`WorkPreferences`**: Remote/hybrid preferences, relocation.
- **`SalaryExpectations`**: Desired salary range.
- **`Availability`**: Notice period.

## Data Loading
The `Resume` and `JobApplicationProfile` classes both include `__init__` methods that accept a YAML string, parsing it into the object structure and raising detailed errors (`ValueError`, `TypeError`) if the input validation fails.
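
Hypothetical usage, assuming the YAML-string constructor described above (the import path mirrors the documented file location and may differ depending on packaging):

```python
# Assumes Resume(yaml_string) as described above; paths are illustrative.
from pathlib import Path

from src.resume_schemas.resume import Resume

yaml_str = Path("data_folder/plain_text_resume.yaml").read_text(encoding="utf-8")
try:
    resume = Resume(yaml_str)
except (ValueError, TypeError) as exc:
    print(f"Invalid resume data: {exc}")
```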
38 changes: 38 additions & 0 deletions docs/modules/llm_integration.md
@@ -0,0 +1,38 @@
# LLM Integration

The application relies heavily on Large Language Models (LLMs) for understanding job descriptions and generating human-like text for resumes and cover letters.

## LLM Manager (`src/libs/llm_manager.py`)

This module handles the abstraction layer for different AI providers.

### AI Model Adapter Pattern

The `AIAdapter` class acts as a factory, instantiating the correct model class based on the configuration (`LLM_MODEL_TYPE`); a simplified sketch follows the provider list below.

Supported Providers:
- **OpenAI** (`OpenAIModel`)
- **Claude** (`ClaudeModel`)
- **Ollama** (`OllamaModel`) - for local inference
- **Gemini** (`GeminiModel`)
- **HuggingFace** (`HuggingFaceModel`)
- **Perplexity** (`PerplexityModel`)
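
The sketch below captures the factory idea in miniature. Only two providers are shown and the constructor arguments are assumptions; the real `AIAdapter` wraps LangChain chat models for every provider listed above.

```python
# Hedged sketch of the adapter/factory pattern; illustrative names only.
from abc import ABC, abstractmethod

class AIModel(ABC):
    @abstractmethod
    def invoke(self, prompt: str) -> str: ...

class OpenAIModel(AIModel):
    def __init__(self, api_key: str, model_name: str):
        self.api_key, self.model_name = api_key, model_name
    def invoke(self, prompt: str) -> str:
        raise NotImplementedError("would call the OpenAI chat API here")

class OllamaModel(AIModel):
    def __init__(self, api_key: str, model_name: str):
        self.model_name = model_name  # local inference, no API key required
    def invoke(self, prompt: str) -> str:
        raise NotImplementedError("would call the local Ollama server here")

class AIAdapter:
    """Factory that picks a model class based on LLM_MODEL_TYPE."""
    _registry = {"openai": OpenAIModel, "ollama": OllamaModel}

    def __init__(self, llm_model_type: str, api_key: str, model_name: str):
        try:
            self.model = self._registry[llm_model_type](api_key, model_name)
        except KeyError:
            raise ValueError(f"Unsupported LLM provider: {llm_model_type}")

    def invoke(self, prompt: str) -> str:
        return self.model.invoke(prompt)
```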

### GPTAnswerer

The `GPTAnswerer` class is a high-level service that uses the configured LLM to answer specific questions related to the resume or job application.

**Key Features:**
- **`answer_question_textual_wide_range`**: Determines which section of the resume (e.g., Experience, Education) is relevant to a question and uses an appropriate prompt chain to generate an answer.
- **`is_job_suitable`**: Analyzes the job description against the resume to calculate a suitability score.
- **`summarize_job_description`**: Compresses long job descriptions into concise summaries.

### Logging (`LLMLogger`)
All LLM requests and responses are logged to `open_ai_calls.json` for debugging and cost tracking. It captures:
- Model Name
- Token Usage (Input/Output/Total)
- Estimated Cost
- Prompts and Replies

## Prompt Engineering
Prompts are stored in `src/libs/llm/prompts.py` (referenced in `llm_manager.py`). The application uses `LangChain` templates to structure these prompts dynamically with input variables like `{resume_section}` or `{job_description}`.
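
For illustration, a LangChain template in the style the docs describe might look like this; the template text is invented, not copied from the project.

```python
# Illustrative LangChain prompt template using the documented input variables.
from langchain_core.prompts import PromptTemplate

tailor_prompt = PromptTemplate.from_template(
    "You are an expert resume writer.\n\n"
    "Resume section:\n{resume_section}\n\n"
    "Job description:\n{job_description}\n\n"
    "Rewrite the section so it emphasizes the most relevant experience."
)

# Filling the variables yields the final prompt string sent to the model.
prompt_text = tailor_prompt.format(
    resume_section="Senior backend engineer, 5 years of Python...",
    job_description="We are hiring a Python developer for...",
)
print(prompt_text)
```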
17 changes: 17 additions & 0 deletions docs/modules/utils.md
@@ -0,0 +1,17 @@
# Utilities and Helpers

Common utility functions used throughout the application.

## Chrome Utils (`src/utils/chrome_utils.py`)

Handles interactions with the Chrome browser via Selenium.

- **`init_browser()`**: Initializes a Selenium Chrome driver with specific options (headless mode, user-agent spoofing, window size).
- **`HTML_to_PDF(html_content, driver)`**: Uses the browser's print-to-PDF capability to convert a rendered HTML string into a PDF byte stream.
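
A hedged sketch of that conversion using Chrome's DevTools print command; the real helper may use different driver options and error handling.

```python
# Hedged sketch of HTML_to_PDF; assumes a Chrome/Chromium WebDriver instance.
import base64
from urllib.parse import quote

from selenium import webdriver

def html_to_pdf(html_content: str, driver: webdriver.Chrome) -> bytes:
    """Render an HTML string in Chrome and return the printed PDF as raw bytes."""
    # Load the HTML via a data URL so no temporary file is needed.
    driver.get("data:text/html;charset=utf-8," + quote(html_content))
    result = driver.execute_cdp_cmd("Page.printToPDF", {"printBackground": True})
    return base64.b64decode(result["data"])
```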

## Constants (`src/utils/constants.py`)

Central repository for string constants used in prompts and configuration keys.
- **LLM Command Keys**: `PERSONAL_INFORMATION`, `SELF_IDENTIFICATION`, `EXPERIENCE_DETAILS`, etc.
- **File Names**: `SECRETS_YAML`, `WORK_PREFERENCES_YAML`.
- **Model Aliases**: `OPENAI`, `CLAUDE`, `GEMINI`.