Merged
43 commits
d128775
resolves #7
czajkub Aug 25, 2025
c36d342
changed xml.py name to avoid name conflicts
czajkub Sep 3, 2025
a215131
added clickhouse as an alternative to elasticsearch
czajkub Sep 3, 2025
ea780d4
Merge branch 'the-momentum:main' into main
czajkub Sep 3, 2025
186bba0
deleted the old xml.py file (name change)
czajkub Sep 3, 2025
9dc9aeb
changed project files and gitignore to reflect clickhouse addition
czajkub Sep 3, 2025
d66ad02
fixed search_health_records tool
czajkub Sep 3, 2025
91955a7
Merge branch 'main' of https://github.com/czajkub/apple-health-mcp-se…
czajkub Sep 3, 2025
311e9fd
pretty printing time results
czajkub Sep 3, 2025
323c117
brushed up on every tool, fixed naming issues, completed sql queries
czajkub Sep 4, 2025
01f3660
changed return types for linter
czajkub Sep 4, 2025
8c4567c
changed update_database comment for LLM
czajkub Sep 4, 2025
e9bf8e1
Update README to include ClickHouse
czajkub Sep 4, 2025
738a4fa
update makefile for clickhouse
czajkub Sep 4, 2025
45e4597
Update README.md for windows usage of clickhouse
czajkub Sep 4, 2025
d1f55e8
clickhouse support for windows and changing variables to more legible…
czajkub Sep 4, 2025
45c662f
Merge branch 'main' of https://github.com/czajkub/apple-health-mcp-se…
czajkub Sep 4, 2025
2013b17
added makefile comments and improved windows functionality
czajkub Sep 5, 2025
9b8e8ab
add more settings and change variable names for readability
czajkub Sep 5, 2025
2495b10
added chunk_size to settings
czajkub Sep 5, 2025
d1068cf
minor fixes
czajkub Sep 5, 2025
b2af44d
comment for clarification and changed inequality sign to point in the…
czajkub Sep 5, 2025
32dcf01
updating records more elegant now
czajkub Sep 5, 2025
1430f34
moved column_names tuple to broader scope
czajkub Sep 5, 2025
777322f
moving ch.py into scripts and splitting it into two files
czajkub Sep 5, 2025
a929b1e
fixed relative import and dockerfile for windows
czajkub Sep 5, 2025
92e19f9
Update README.md to include new env variables
czajkub Sep 5, 2025
d781104
change docker launch
czajkub Sep 5, 2025
d882907
improved readme comment about docker deployment
czajkub Sep 5, 2025
2bb0d54
FINALLY working windows clickhouse support with docker
czajkub Sep 5, 2025
fd76d48
Merge branch 'main' of https://github.com/czajkub/apple-health-mcp-se…
czajkub Sep 5, 2025
549c645
removal of docker volume after make chwin
czajkub Sep 5, 2025
4361ff3
fixed makefile for ch linux
czajkub Sep 5, 2025
b4af660
added db dirname to config
czajkub Sep 5, 2025
253a15c
separating ch client and indexer
czajkub Sep 5, 2025
215ca16
calling inits explicitly
czajkub Sep 5, 2025
a35dace
super ultra fix
czajkub Sep 5, 2025
b692083
removed update db mcp tool
czajkub Sep 5, 2025
fc2e165
Update README.md
czajkub Sep 8, 2025
9cd48f8
changed ch dockerfile name
czajkub Sep 8, 2025
27a9959
removed applehealth db
czajkub Sep 8, 2025
a717404
added chdb back to gitignore
czajkub Sep 8, 2025
4b53fbe
Merge branch 'main' of https://github.com/czajkub/apple-health-mcp-se…
czajkub Sep 8, 2025
1 change: 1 addition & 0 deletions .dockerignore
@@ -25,3 +25,4 @@ Makefile
.*
README.md
*.xml
*.chdb
5 changes: 4 additions & 1 deletion .gitignore
@@ -148,4 +148,7 @@ docker/volumes/
volumes

# Data Source
*.xml
*.xml

# ClickHouse database
*.chdb
31 changes: 31 additions & 0 deletions Dockerfile.ch
@@ -0,0 +1,31 @@
FROM ghcr.io/astral-sh/uv:python3.13-bookworm

WORKDIR /app

ENV UV_COMPILE_BYTECODE=1
ENV UV_LINK_MODE=copy
ENV UV_TOOL_BIN_DIR=/usr/local/bin

RUN --mount=type=cache,target=/root/.cache/uv \
--mount=type=bind,source=uv.lock,target=uv.lock \
--mount=type=bind,source=pyproject.toml,target=pyproject.toml \
uv sync --locked --no-install-project --no-dev

COPY . /app
RUN mv /app/xmltemp123 /app/scripts/raw.xml
RUN --mount=type=cache,target=/root/.cache/uv \
uv sync --locked --no-dev

ENV PATH="/app/.venv/bin:$PATH"

RUN echo '#!/bin/bash\n\
set -e\n\
echo "Running clickhouse importer..."\n\
uv run --directory /app/scripts/ clickhouse_importer.py && \
echo "Copying applehealth.chdb to volume..." && \
cp -r /app/scripts/applehealth.chdb /volume/applehealth.chdb && \
echo "Complete!"' > /app/entrypoint.sh

RUN chmod +x /app/entrypoint.sh

ENTRYPOINT ["/app/entrypoint.sh"]
12 changes: 12 additions & 0 deletions Makefile
@@ -41,5 +41,17 @@ es: ## Run Elasticsearch and import Apple Health XML data into ES for Apple Hea
./scripts/run_elasticsearch.sh
$(UV) python scripts/xml2es.py

ch: ## Import Apple Health XML data into a local ClickHouse (chdb) database
$(UV) scripts/clickhouse_importer.py

chwin: ## Import Apple Health XML data into a docker volume for ClickHouse (for Windows users)
move *.xml xmltemp123
docker volume create applehealth-data
docker build . --file Dockerfile.ch -t uvcopier
docker run --rm -v applehealth-data:/volume uvcopier
docker run --rm -v applehealth-data:/source -v $pwd/:/dest alpine cp -r /source/applehealth.chdb /dest/
move xmltemp123 raw.xml
docker volume rm applehealth-data

downgrade: ## Revert the last migration
$(ALEMBIC_CMD) downgrade -1
29 changes: 24 additions & 5 deletions README.md
@@ -21,22 +21,22 @@
- [🚀 Getting Started](#-getting-started)
- [📝 Usage](#-usage)
- [🔧 Configuration](#-configuration)
- [🐳 Docker Setup](#-docker-setup)
- [🐳 Docker Setup](#-docker-mcp)
- [🛠️ MCP Tools](#️-mcp-tools)
- [🗺️ Roadmap](#️-roadmap)
- [👥 Contributors](#-contributors)
- [📄 License](#-license)

## 🔍 About The Project

**Apple Health MCP Server** implements a Model Context Protocol (MCP) server designed for seamless interaction between LLM-based agents and Apple Health data. It provides a standardized interface for querying, analyzing, and managing Apple Health records—imported from XML exports and indexed in Elasticsearch—through a comprehensive suite of tools. These tools are accessible from MCP-compatible clients (such as Claude Desktop), enabling users to explore, search, and analyze personal health data using natural-language prompts and advanced filtering, all without requiring direct knowledge of the underlying data formats or Elasticsearch queries.
**Apple Health MCP Server** implements a Model Context Protocol (MCP) server designed for seamless interaction between LLM-based agents and Apple Health data. It provides a standardized interface for querying, analyzing, and managing Apple Health records—imported from XML exports and indexed in Elasticsearch or ClickHouse—through a comprehensive suite of tools. These tools are accessible from MCP-compatible clients (such as Claude Desktop), enabling users to explore, search, and analyze personal health data using natural-language prompts and advanced filtering, all without requiring direct knowledge of the underlying data formats or Elasticsearch/ClickHouse queries.

### ✨ Key Features

- **🚀 FastMCP Framework**: Built on FastMCP for high-performance MCP server capabilities
- **🍏 Apple Health Data Management**: Import, parse, and analyze Apple Health XML exports
- **🔎 Powerful Search & Filtering**: Query and filter health records using natural language and advanced parameters
- **📦 Elasticsearch Integration**: Index and search health data efficiently at scale
- **📦 Elasticsearch and ClickHouse Integration**: Index and search health data efficiently at scale
- **🛠️ Modular MCP Tools**: Tools for structure analysis, record search, type-based extraction, and more
- **📈 Data Summaries & Trends**: Generate statistics and trend analyses from your health data
- **🐳 Container Ready**: Docker support for easy deployment and scaling
@@ -49,6 +49,7 @@ The Apple Health MCP Server is built with a modular, extensible architecture des
- **MCP Tools**: Dedicated tools for Apple Health XML structure analysis, record search, type-based extraction, and statistics/trend generation. Each tool is accessible via the MCP protocol for natural language and programmatic access.
- **XML Import & Parsing**: Efficient streaming and parsing of large Apple Health XML exports, extracting records, workouts, and metadata for further analysis.
- **Elasticsearch Backend**: All health records are indexed in Elasticsearch, enabling fast, scalable search, filtering, and aggregation across large datasets.
- **ClickHouse Backend**: Health records can also be indexed into a ClickHouse database, which simplifies deployment for the end user by using an embedded, in-process database stored on disk instead of a separate server.
- **Service Layer**: Business logic for XML and Elasticsearch operations is encapsulated in dedicated service modules, ensuring separation of concerns and easy extensibility.
- **FastMCP Framework**: Provides the MCP server interface, routing, and tool registration, making the system compatible with LLM-based agents and MCP clients (e.g., Claude Desktop).
- **Configuration & Deployment**: Environment-based configuration and Docker support for easy setup and deployment in various environments.
@@ -96,6 +97,10 @@ Follow these steps to set up Apple Health MCP Server in your environment.
```sh
uv run python scripts/xml2es.py --delete-all
```
3. If you choose to use ClickHouse instead of Elasticsearch:
- Run `make ch` to create a database with your exported XML data
- **Note: If you are using Windows, Docker is the only way to integrate ClickHouse into this MCP Server.**
- On Windows: Run `mingw32-make chwin` (or any other version of `make` available on Windows)

### Configuration Files

@@ -126,6 +131,8 @@ You can run the MCP Server in your LLM Client in two ways:
"type=bind,source=<project-path>/app,target=/root_project/app", // optional
"--mount",
"type=bind,source=<project-path>/config/.env,target=/root_project/config/.env",
"--mount", // optional - only include this if you use clickhouse
"type=bind,source=<project-path>/applehealth.chdb,target=/root_project/applehealth.chdb", // optional
"-e",
"ES_HOST=host.docker.internal",
"mcp-server:latest"
@@ -192,14 +199,17 @@ After completing the above steps, restart your MCP Client to apply the changes.
| ES_USER | Elasticsearch username | `elastic` | ❌ |
| ES_PASSWORD | Elasticsearch password | `elastic` | ❌ |
| ES_INDEX | Elasticsearch index name | `apple_health_data` | ❌ |
| CH_DB_NAME | ClickHouse database name | `applehealth` | ❌ |
| CH_TABLE_NAME | ClickHouse table name | `data` | ❌ |
| CHUNK_SIZE | Number of records indexed into CH at once | `10000` | ❌ |
| XML_SAMPLE_SIZE | Number of XML records to sample | `1000` | ❌ |
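
Reviewer sketch: the three new ClickHouse variables in the table fall back to their defaults when unset. The snippet below is a minimal stdlib illustration of that lookup, assuming the defaults listed in the table. The project itself declares these fields on a pydantic `Settings(BaseSettings)` class; `load_ch_settings` is a hypothetical helper name used only here.

```python
import os


def load_ch_settings(env=None):
    """Return ClickHouse-related settings, falling back to the documented defaults."""
    # `env` is injectable so the lookup can be exercised without touching os.environ.
    env = os.environ if env is None else env
    return {
        "CH_DB_NAME": env.get("CH_DB_NAME", "applehealth"),
        "CH_TABLE_NAME": env.get("CH_TABLE_NAME", "data"),
        # Environment values are always text, so CHUNK_SIZE needs an explicit cast.
        "CHUNK_SIZE": int(env.get("CHUNK_SIZE", "10000")),
    }


if __name__ == "__main__":
    print(load_ch_settings({"CHUNK_SIZE": "500"}))
```

Passing a plain dict instead of `os.environ` keeps the example side-effect free.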

<p align="right">(<a href="#readme-top">back to top</a>)</p>


## 🛠️ MCP Tools

The Apple Health MCP Server provides a suite of tools for exploring, searching, and analyzing your Apple Health data, both at the raw XML level and in Elasticsearch:
The Apple Health MCP Server provides a suite of tools for exploring, searching, and analyzing your Apple Health data, both at the raw XML level and in Elasticsearch/ClickHouse:

### XML Tools (`xml_reader`)

@@ -218,6 +228,15 @@ The Apple Health MCP Server provides a suite of tools for exploring, searching,
| `get_statistics_by_type_es` | Get comprehensive statistics (count, min, max, avg, sum) for a specific health record type. |
| `get_trend_data_es` | Analyze trends for a health record type over time (daily, weekly, monthly, yearly aggregations). |

### ClickHouse Tools (`ch_reader`)

| Tool | Description |
|-----------------------------|-----------------------------------------------------------------------------------------------------|
| `get_health_summary_ch` | Get a summary of all Apple Health data in ClickHouse (total count, type breakdown, etc.). |
| `search_health_records_ch` | Flexible search for health records in ClickHouse with advanced filtering and query options. |
| `get_statistics_by_type_ch` | Get comprehensive statistics (count, min, max, avg, sum) for a specific health record type. |
| `get_trend_data_ch` | Analyze trends for a health record type over time (daily, weekly, monthly, yearly aggregations). |

All tools are accessible via MCP-compatible clients and can be used with natural language or programmatic queries to explore and analyze your Apple Health data.
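
Reviewer sketch: the statistics and trend tools in the ClickHouse table come down to SQL aggregations over one record type. The example below illustrates the shape of those queries with the stdlib `sqlite3` module as a stand-in, since the PR's actual ClickHouse queries are not part of this diff. The table layout (`type`, `value`, `startDate`), the sample rows, and both function names are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema and sample rows; the real table layout is not shown in this diff.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (type TEXT, value REAL, startDate TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        ("HKQuantityTypeIdentifierStepCount", 100.0, "2024-01-05"),
        ("HKQuantityTypeIdentifierStepCount", 300.0, "2024-01-20"),
        ("HKQuantityTypeIdentifierStepCount", 200.0, "2024-02-03"),
        ("HKQuantityTypeIdentifierHeartRate", 60.0, "2024-01-05"),
    ],
)


def statistics_by_type(record_type: str) -> dict:
    """Return count/min/max/avg/sum of `value` for one record type."""
    row = conn.execute(
        "SELECT COUNT(value), MIN(value), MAX(value), AVG(value), SUM(value) "
        "FROM data WHERE type = ?",
        (record_type,),  # bound parameter instead of string interpolation
    ).fetchone()
    return dict(zip(("count", "min", "max", "avg", "sum"), row))


def monthly_trend(record_type: str) -> list[dict]:
    """Return one bucket per calendar month, oldest first."""
    rows = conn.execute(
        "SELECT strftime('%Y-%m', startDate) AS month, AVG(value), MIN(value), "
        "MAX(value), COUNT(*) FROM data WHERE type = ? GROUP BY month ORDER BY month",
        (record_type,),
    ).fetchall()
    keys = ("date", "avg_value", "min_value", "max_value", "count")
    return [dict(zip(keys, row)) for row in rows]
```

The bucket keys mirror the field names listed in the `get_trend_data_ch` docstring. In ClickHouse the month bucket would be computed with a different date function than SQLite's `strftime`.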

<p align="right">(<a href="#readme-top">back to top</a>)</p>
@@ -250,4 +269,4 @@ Distributed under the MIT License. See [MIT License](LICENSE) for more informati

<div align="center">
<p><em>Built with ❤️ by <a href="https://themomentum.ai">Momentum</a> • Transforming healthcare data management with AI</em></p>
</div>
</div>
5 changes: 5 additions & 0 deletions app/config.py
@@ -28,6 +28,11 @@ class Settings(BaseSettings):
    ES_PASSWORD: SecretStr = SecretStr("elastic")
    ES_INDEX: str = "apple_health_data"

    CH_DIRNAME: str = "applehealth.chdb"
    CH_DB_NAME: str = "applehealth"
    CH_TABLE_NAME: str = "data"
    CHUNK_SIZE: int = 10_000

    RAW_XML_PATH: str = "raw.xml"
    XML_SAMPLE_SIZE: int = 1000
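
Reviewer sketch: `CHUNK_SIZE` caps how many records are written to ClickHouse in one batch. The importer script is not included in this diff, so the following is only an assumed illustration of batching a record stream; `chunked` is a hypothetical helper name.

```python
from itertools import islice
from typing import Iterable, Iterator


def chunked(records: Iterable[dict], chunk_size: int = 10_000) -> Iterator[list[dict]]:
    """Yield successive lists of at most `chunk_size` records."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(records)
    # islice pulls the next batch lazily, so the full export never sits in memory.
    while batch := list(islice(iterator, chunk_size)):
        yield batch


if __name__ == "__main__":
    for batch in chunked(({"i": i} for i in range(25)), chunk_size=10):
        print(len(batch))
```

Working from a generator matters for large Apple Health exports, where loading every record before inserting would be costly.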

3 changes: 2 additions & 1 deletion app/mcp/v1/mcp.py
@@ -1,8 +1,9 @@
from fastmcp import FastMCP

from app.mcp.v1.tools import es_reader, xml_reader
from app.mcp.v1.tools import es_reader, xml_reader, ch_reader

mcp_router = FastMCP(name="Main MCP")

mcp_router.mount(es_reader.es_reader_router)
mcp_router.mount(xml_reader.xml_reader_router)
mcp_router.mount(ch_reader.ch_reader_router)
125 changes: 125 additions & 0 deletions app/mcp/v1/tools/ch_reader.py
@@ -0,0 +1,125 @@
from typing import Any
from fastmcp import FastMCP

from app.schemas.record import RecordType, IntervalType, HealthRecordSearchParams
from app.services.health.clickhouse import (
    get_health_summary_from_ch,
    search_health_records_from_ch,
    get_statistics_by_type_from_ch,
    get_trend_data_from_ch,
)

ch_reader_router = FastMCP(name="CH Reader MCP")

@ch_reader_router.tool
def get_health_summary_ch() -> dict[str, Any]:
    """
    Get a summary of Apple Health data from ClickHouse.
    The function returns total record count, record type breakdown, and (optionally) a date range aggregation.

    Notes for LLMs:
    - IMPORTANT - Do not guess, autofill, or assume any missing data.
    - When asked for medical advice, try to use my data from ClickHouse first.
    """
    try:
        return get_health_summary_from_ch()
    except Exception as e:
        return {'error': str(e)}

@ch_reader_router.tool
def search_health_records_ch(params: HealthRecordSearchParams) -> dict[str, Any]:
    """
    Search health records in ClickHouse with flexible query building.

    Parameters:
    - params: HealthRecordSearchParams object containing all search/filter parameters.

    Notes for LLMs:
    - This function returns the health records matching the search criteria.
    - Each returned record represents a single health record as stored in ClickHouse.
    - If an error occurs, the function returns a dict containing an 'error' key and the error message.
    - Use this to retrieve structured health data for further analysis, filtering, or display.
    - Example source_name: "Rob’s iPhone", "Polar Flow", "Sync Solver".
    - Example date_from/date_to: "2020-01-01T00:00:00+00:00"
    - Example value_min/value_max: "10", "100.5"
    - IMPORTANT - Do not guess, autofill, or assume any missing data.
    - When asked for medical advice, try to use my data from ClickHouse first.
    """
    try:
        return search_health_records_from_ch(params)
    except Exception as e:
        return {'error': str(e)}

@ch_reader_router.tool
def get_statistics_by_type_ch(record_type: RecordType | str) -> dict[str, Any]:
    """
    Get comprehensive statistics for a specific health record type from ClickHouse.

    Parameters:
    - record_type: The type of health record to analyze. Use RecordType for most frequent types. Use str if that type is beyond RecordType scope.

    Returns:
    - record_type: The analyzed record type
    - total_count: Total number of records of this type in the table
    - value_statistics: Statistical summary of the 'value' field including:
        * count: Number of records with values
        * min: Minimum value recorded
        * max: Maximum value recorded
        * avg: Average value across all records
        * sum: Sum of all values
    - sources: Breakdown of records by source device/app (e.g., "Rob's iPhone", "Polar Flow")

    Notes for LLMs:
    - This function provides comprehensive statistical analysis for any health record type.
    - The value_statistics object contains all basic statistics (count, min, max, avg, sum) for the 'value' field.
    - The sources breakdown shows which devices/apps contributed data for this record type.
    - Example types: "HKQuantityTypeIdentifierStepCount", "HKQuantityTypeIdentifierBodyMassIndex", "HKQuantityTypeIdentifierHeartRate", etc.
    - Use this function to understand the distribution, range, and trends of specific health metrics.
    - The function is useful for health analysis, identifying outliers, and understanding data quality.
    - date_range key for query is commented, since it contained hardcoded from date, but you can use it anyway if you replace startDate with your data.
    - IMPORTANT - Do not guess, autofill, or assume any missing data.
    - When asked for medical advice, try to use my data from ClickHouse first.
    """
    try:
        return get_statistics_by_type_from_ch(record_type)
    except Exception as e:
        return {"error": f"Failed to get statistics: {str(e)}"}


@ch_reader_router.tool
def get_trend_data_ch(
    record_type: RecordType | str,
    interval: IntervalType = "month",
    date_from: str | None = None,
    date_to: str | None = None,
) -> dict[str, Any]:
    """
    Get trend data for a specific health record type over time, aggregated into time buckets in ClickHouse.

    Parameters:
    - record_type: The type of health record to analyze (e.g., "HKQuantityTypeIdentifierStepCount")
    - interval: Time interval for aggregation.
    - date_from, date_to: Optional ISO8601 date strings for filtering date range

    Returns:
    - record_type: The analyzed record type
    - interval: The time interval used
    - trend_data: List of time buckets with statistics for each period:
        * date: The time period (ISO string)
        * avg_value: Average value for the period
        * min_value: Minimum value for the period
        * max_value: Maximum value for the period
        * count: Number of records in the period

    Notes for LLMs:
    - Use this to analyze trends, patterns, and seasonal variations in health data
    - The function automatically handles date filtering if date_from/date_to are provided
    - IMPORTANT - interval must be one of: "day", "week", "month", or "year". Do not use other values.
    - Do not guess, autofill, or assume any missing data.
    - When asked for medical advice, try to use my data from ClickHouse first.
    """
    try:
        return get_trend_data_from_ch(record_type, interval, date_from, date_to)
    except Exception as e:
        return {"error": f"Failed to get trend data: {str(e)}"}
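
Reviewer sketch: all four tools in this file repeat the same `try`/`except` that converts an exception into an `{'error': ...}` dict. One way to factor that out is a small decorator. `errors_as_dict` and `flaky_statistics` are hypothetical names, and whether the wrapped function registers cleanly with FastMCP's `tool` decorator has not been verified here.

```python
from functools import wraps
from typing import Any, Callable


def errors_as_dict(prefix: str = "") -> Callable:
    """Decorator factory: turn any raised exception into {'error': message}."""

    def decorator(func: Callable[..., dict[str, Any]]) -> Callable[..., dict[str, Any]]:
        @wraps(func)  # keeps the name, docstring and signature visible to introspection
        def wrapper(*args: Any, **kwargs: Any) -> dict[str, Any]:
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                return {"error": f"{prefix}{exc}"}

        return wrapper

    return decorator


@errors_as_dict("Failed to get statistics: ")
def flaky_statistics(record_type: str) -> dict[str, Any]:
    # Stand-in for a service call such as get_statistics_by_type_from_ch.
    if not record_type:
        raise ValueError("record_type is empty")
    return {"record_type": record_type, "total_count": 0}
```

The prefix argument reproduces the two message styles already used in the file: a bare `str(e)` and a "Failed to get ..." prefix.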

2 changes: 1 addition & 1 deletion app/mcp/v1/tools/xml_reader.py
@@ -3,7 +3,7 @@
from fastmcp import FastMCP

from app.schemas.record import RecordType
from app.services.health.xml import analyze_xml_structure, search_xml, get_records_by_type
from app.services.health.direct_xml import analyze_xml_structure, search_xml, get_records_by_type

xml_reader_router = FastMCP(name="XML Reader MCP")

35 changes: 35 additions & 0 deletions app/services/ch.py
@@ -0,0 +1,35 @@
import json
from dataclasses import dataclass
from json import JSONDecodeError
from pathlib import Path
from typing import Any

import chdb

from app.config import settings


@dataclass
class CHClient:
    # NOTE: __init__ is written by hand, so @dataclass does not generate one and
    # __post_init__ is NOT invoked automatically; callers must call it explicitly.
    def __init__(self):
        self.session = chdb.session.Session(settings.CH_DIRNAME)
        self.db_name: str = settings.CH_DB_NAME
        self.table_name: str = settings.CH_TABLE_NAME
        self.path: Path = Path(settings.RAW_XML_PATH)

    def __post_init__(self):
        if not self.path.exists():
            raise FileNotFoundError(f"XML file not found: {self.path}")
        self.session.query(f"CREATE DATABASE IF NOT EXISTS {self.db_name}")

    def inquire(self, query: str) -> dict[str, Any]:
        """
        Run an SQL query against the database.
        :return: the query result decoded from JSON, or a dict with an 'error' key
        """
        # str() on the query result yields the JSON text requested via fmt='JSON'
        response: str = str(self.session.query(query, fmt='JSON'))
        try:
            return json.loads(response)
        except JSONDecodeError as e:
            return {'error': str(e)}
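
Reviewer sketch: `inquire` amounts to decoding the JSON text returned by the engine and falling back to an error dict when decoding fails. The stdlib-only example below shows that step in isolation, and also that a `json.dumps` followed by two `json.loads` calls ends at the same dict as a single `json.loads`. `decode_result` and the sample payload are made up for illustration.

```python
import json
from json import JSONDecodeError
from typing import Any


def decode_result(raw: str) -> dict[str, Any]:
    """Decode JSON text into a dict, or report the decode failure."""
    try:
        return json.loads(raw)
    except JSONDecodeError as exc:
        return {"error": str(exc)}


sample = '{"data": [{"total": 3}], "rows": 1}'  # made-up payload for illustration
# Dumping the text and loading it twice is a round trip that ends at the same dict.
assert json.loads(json.loads(json.dumps(sample))) == decode_result(sample)
```

An empty response, such as the one a statement with no result set might produce, takes the error branch rather than raising.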