Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
89 changes: 78 additions & 11 deletions app/mcp/v1/tools/ch_reader.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
get_statistics_by_type_from_ch,
get_trend_data_from_ch,
search_health_records_from_ch,
search_values_from_ch,
)

ch_reader_router = FastMCP(name="CH Reader MCP")
Expand All @@ -22,8 +23,15 @@ def get_health_summary_ch() -> dict[str, Any]:

Notes for LLM:
- IMPORTANT - Do not guess, autofill, or assume any missing data.
- When asked for medical advice, ask the user whether he wants to use DuckDB, ClickHouse or
Elasticsearch.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return get_health_summary_from_ch()
Expand All @@ -44,14 +52,21 @@ def search_health_records_ch(params: HealthRecordSearchParams) -> dict[str, Any]
matching the search criteria.
- Each document in the list should represent a single health record as stored in ClickHouse.
- If an error occurs, the function should return a list with a single dict
containing an 'error' key and the error message.
containing an 'error' key and the error message.
- Use this to retrieve structured health data for further analysis, filtering, or display.
- Example source_name: "Rob’s iPhone", "Polar Flow", "Sync Solver".
- Example date_from/date_to: "2020-01-01T00:00:00+00:00"
- Example value_min/value_max: "10", "100.5"
- IMPORTANT - Do not guess, autofill, or assume any missing data.
- When asked for medical advice, ask the user whether he wants to use DuckDB, ClickHouse or
Elasticsearch.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return search_health_records_from_ch(params)
Expand Down Expand Up @@ -85,16 +100,23 @@ def get_statistics_by_type_ch(record_type: RecordType | str) -> dict[str, Any]:
max, avg, sum) for the 'value' field.
- The sources breakdown shows which devices/apps contributed data for this record type.
- Example types: "HKQuantityTypeIdentifierStepCount",
"HKQuantityTypeIdentifierBodyMassIndex", "HKQuantityTypeIdentifierHeartRate", etc.
"HKQuantityTypeIdentifierBodyMassIndex", "HKQuantityTypeIdentifierHeartRate", etc.
- Use this function to understand the distribution, range, and trends of
specific health metrics.
- The function is useful for health analysis, identifying outliers, and
understanding data quality.
- date_range key for query is commented, since it contained hardcoded from
date, but you can use it anyway if you replace startDate with your data.
- IMPORTANT - Do not guess, autofill, or assume any missing data.
- When asked for medical advice, ask the user whether he wants to use DuckDB, ClickHouse or
Elasticsearch.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return get_statistics_by_type_from_ch(record_type)
Expand All @@ -116,7 +138,7 @@ def get_trend_data_ch(
Parameters:
- record_type: The type of health record to analyze (e.g., "HKQuantityTypeIdentifierStepCount")
- interval: Time interval for aggregation.
- date_from, date_to: Optional ISO8601 date strings for filtering date range
- date_from, date_to: Optional date strings for filtering date range

Returns:
- record_type: The analyzed record type
Expand All @@ -134,10 +156,55 @@ def get_trend_data_ch(
- IMPORTANT - interval must be one of: "day", "week", "month", or "year".
Do not use other values.
- Do not guess, autofill, or assume any missing data.
- When asked for medical advice, ask the user whether he wants to use DuckDB, ClickHouse or
Elasticsearch.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return get_trend_data_from_ch(record_type, interval, date_from, date_to)
except Exception as e:
return {"error": f"Failed to get trend data: {str(e)}"}


@ch_reader_router.tool
def search_values_ch(
record_type: RecordType | str | None,
value: str,
date_from: str | None = None,
date_to: str | None = None,
) -> dict[str, Any]:
"""
Search for records with exactly matching values (including text) using ClickHouse.

Parameters:
- record_type: The type of health record to analyze (e.g., "HKQuantityTypeIdentifierStepCount")
- value: Value to search for in the data
- date_from, date_to: Optional date strings for filtering date range

Notes for LLMs:
- Use this to search for specific values (for example statistical outliers) in health data
- It can also be used for text values: e.g.
you can search for "HKCategoryTypeIdentifierSleepAnalysis"
records with the value of "HKCategoryValueSleepAnalysisAsleepDeep"
- The function automatically handles date filtering if date_from/date_to are provided
- Do not guess, autofill, or assume any missing data.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return search_values_from_ch(record_type, value, date_from, date_to)
except Exception as e:
return {"error": f"Failed to search for values: {str(e)}"}
89 changes: 78 additions & 11 deletions app/mcp/v1/tools/duckdb_reader.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
get_statistics_by_type_from_duckdb,
get_trend_data_from_duckdb,
search_health_records_from_duckdb,
search_values_from_duckdb,
)

duckdb_reader_router = FastMCP(name="CH Reader MCP")
Expand All @@ -18,12 +19,19 @@ def get_health_summary_duckdb() -> list[dict[str, Any]]:
"""
Get a summary of Apple Health data from DuckDB.
The function returns total record count, record type breakdown, and
(optionally) a date range aggregation.
(optionally) a date range aggregation.

Notes for LLM:
- IMPORTANT - Do not guess, autofill, or assume any missing data.
- When asked for medical advice, ask the user whether he wants to use DuckDB, ClickHouse or
Elasticsearch.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return get_health_summary_from_duckdb()
Expand All @@ -50,8 +58,15 @@ def search_health_records_duckdb(params: HealthRecordSearchParams) -> list[dict[
- Example date_from/date_to: "2020-01-01T00:00:00+00:00"
- Example value_min/value_max: "10", "100.5"
- IMPORTANT - Do not guess, autofill, or assume any missing data.
- When asked for medical advice, ask the user whether he wants to use DuckDB, ClickHouse or
Elasticsearch.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return search_health_records_from_duckdb(params)
Expand Down Expand Up @@ -87,14 +102,21 @@ def get_statistics_by_type_duckdb(record_type: RecordType | str) -> list[dict[st
- Example types: "HKQuantityTypeIdentifierStepCount",
"HKQuantityTypeIdentifierBodyMassIndex", "HKQuantityTypeIdentifierHeartRate", etc.
- Use this function to understand the distribution, range, and trends of
specific health metrics.
specific health metrics.
- The function is useful for health analysis, identifying outliers, and
understanding data quality.
- date_range key for query is commented, since it contained hardcoded from
date, but you can use it anyway if you replace startDate with your data.
- IMPORTANT - Do not guess, autofill, or assume any missing data.
- When asked for medical advice, ask the user whether he wants to use DuckDB, ClickHouse or
Elasticsearch.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return get_statistics_by_type_from_duckdb(record_type)
Expand All @@ -111,7 +133,7 @@ def get_trend_data_duckdb(
) -> list[dict[str, Any]]:
"""
Get trend data for a specific health record type over time using DuckDB
date histogram aggregation.
date histogram aggregation.

Parameters:
- record_type: The type of health record to analyze (e.g., "HKQuantityTypeIdentifierStepCount")
Expand All @@ -134,10 +156,55 @@ def get_trend_data_duckdb(
- IMPORTANT - interval must be one of: "day", "week", "month", or "year".
Do not use other values.
- Do not guess, autofill, or assume any missing data.
- When asked for medical advice, ask the user whether he wants to use DuckDB, ClickHouse or
Elasticsearch.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return get_trend_data_from_duckdb(record_type, interval, date_from, date_to)
except Exception as e:
return [{"error": f"Failed to get trend data: {str(e)}"}]


@duckdb_reader_router.tool
def search_values_duckdb(
record_type: RecordType | str | None,
value: str,
date_from: str | None = None,
date_to: str | None = None,
) -> list[dict[str, Any]]:
"""
Search for records with exactly matching values (including text) using DuckDB.

Parameters:
- record_type: The type of health record to analyze (e.g., "HKQuantityTypeIdentifierStepCount")
- value: Value to search for in the data
- date_from, date_to: Optional ISO8601 date strings for filtering date range

Notes for LLMs:
- Use this to search for specific values (for example statistical outliers) in health data
- It can also be used for text values: e.g.
you can search for "HKCategoryTypeIdentifierSleepAnalysis"
records with the value of "HKCategoryValueSleepAnalysisAsleepDeep"
- The function automatically handles date filtering if date_from/date_to are provided
- Do not guess, autofill, or assume any missing data.
- If there are multiple databases available (DuckDB, ClickHouse, Elasticsearch):
first, ask the user which one he wants to use. DO NOT call any tools before
the user specifies his intent.
- If the user decides on an option, only use tools from this database,
do not switch over to another until the user specifies that he wants
to use a different one. You do not have to keep asking whether
the user wants to use the same database that he used before.
- If there is only one database available (DuckDB, ClickHouse, Elasticsearch):
you can use the tools from this database without the user specifying it.
"""
try:
return search_values_from_duckdb(record_type, value, date_from, date_to)
except Exception as e:
return [{"error": f"Failed to get trend data: {str(e)}"}]
Loading