1 change: 1 addition & 0 deletions .gitignore
@@ -357,6 +357,7 @@ MigrationBackup/
cscope*

build/
build-*/
build_linux/
!.github/actions/build

123 changes: 123 additions & 0 deletions README.md
@@ -48,6 +48,129 @@ sudo sh l_BaseKit_p_2022.1.2.146.sh -a --components intel.oneapi.lin.mkl.devel -
mkdir build && cd build && cmake -DCMAKE_BUILD_TYPE=Release .. && make -j
```

### AVX-512 BF16 (optional acceleration)

DiskANN includes an optional AVX-512 BF16-accelerated kernel for `bf16` distance computations.

- Compile-time: the AVX-512 BF16 kernel is built only when the compiler supports the required flags. Those flags are applied to a single source file (`src/bf16_simd_kernels.cpp`), so the rest of the project is not forced to use AVX-512.
- Runtime: `bf16` distance code dispatches to the AVX-512 BF16 kernel only when the running CPU and OS support AVX-512 BF16; otherwise it falls back to the scalar implementation.
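
The runtime half of this can be sketched as follows. This is a minimal illustration under stated assumptions, not DiskANN's actual dispatcher: the names `cpu_supports_avx512_bf16`, `dot_bf16_scalar` and `select_dot_kernel` are hypothetical, bf16 values are carried as raw `uint16_t`, and the accelerated kernel is passed in as a function pointer because in the real build it lives in its own translation unit. The CPUID and XCR0 bit positions follow the Intel SDM; the detection code assumes GCC or Clang on x86 and reports "unsupported" elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define SKETCH_X86_GNU 1
#endif

// bf16 is the upper 16 bits of an IEEE-754 binary32 value.
inline float bf16_to_float(std::uint16_t v)
{
    const std::uint32_t bits = static_cast<std::uint32_t>(v) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

using DotFn = float (*)(const std::uint16_t *, const std::uint16_t *, std::size_t);

// Scalar fallback: always available.
inline float dot_bf16_scalar(const std::uint16_t *a, const std::uint16_t *b, std::size_t n)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        sum += bf16_to_float(a[i]) * bf16_to_float(b[i]);
    return sum;
}

// True only if both the CPU and the OS expose AVX-512 BF16.
inline bool cpu_supports_avx512_bf16()
{
#ifdef SKETCH_X86_GNU
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    // CPUID.1:ECX[27] = OSXSAVE, i.e. XGETBV may be executed.
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)))
        return false;
    // XCR0 bits 1,2 (SSE/AVX) and 5,6,7 (opmask and ZMM state) must be enabled by the OS.
    unsigned xcr0_lo = 0, xcr0_hi = 0;
    __asm__ volatile("xgetbv" : "=a"(xcr0_lo), "=d"(xcr0_hi) : "c"(0));
    if ((xcr0_lo & 0xE6u) != 0xE6u)
        return false;
    // CPUID.(7,0):EBX[16] = AVX512F; EAX holds the highest valid subleaf.
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) || !(ebx & (1u << 16)) || eax < 1)
        return false;
    // CPUID.(7,1):EAX[5] = AVX512_BF16.
    __get_cpuid_count(7, 1, &eax, &ebx, &ecx, &edx);
    return (eax & (1u << 5)) != 0;
#else
    return false;
#endif
}

// Use the accelerated kernel only if it was compiled in AND the machine can run it.
inline DotFn select_dot_kernel(DotFn accelerated, bool cpu_ok)
{
    return (accelerated != nullptr && cpu_ok) ? accelerated : &dot_bf16_scalar;
}
```

A caller would typically evaluate `select_dot_kernel(kernel_or_null, cpu_supports_avx512_bf16())` once at startup and reuse the returned pointer, so the feature check stays out of the hot loop.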

You can control this with the following CMake options (non-MSVC builds):

- Default (try to enable when supported):
```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DDISKANN_AVX512BF16=ON
cmake --build build -j
```
- Force disable:
```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DDISKANN_AVX512BF16=OFF
cmake --build build -j
```
- Force enable (fail configure if compiler does not support AVX-512 BF16 flags):
```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DDISKANN_FORCE_AVX512BF16=ON
cmake --build build -j
```

### AMX BF16 (optional acceleration)

DiskANN includes an optional AMX BF16-accelerated kernel for `bf16` inner-product computations.

- Compile-time: the AMX BF16 kernel is built only when the compiler supports the required flags. Those flags are applied to a single source file (`src/bf16_amx_kernels.cpp`), so the rest of the project is not forced to use AMX.
- Runtime: `bf16` distance code dispatches to the AMX kernel only when the running CPU and OS support AMX and the current thread is permitted to use AMX tile state (Linux `arch_prctl` request). If AMX is unavailable, it falls back to the AVX-512 BF16 kernel (if enabled) and then to the scalar implementation.
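
The permission request and the fallback order can be sketched like this. It is an illustration only: `request_amx_tile_permission`, `choose_bf16_path` and the `Bf16Path` enum are hypothetical names, and the "usable" flags are assumed to be computed elsewhere (CPUID checks plus the permission request). The two numeric constants are the Linux values of `ARCH_REQ_XCOMP_PERM` and `XFEATURE_XTILEDATA`.

```cpp
#include <cassert>
#if defined(__linux__) && defined(__x86_64__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

enum class Bf16Path { Amx, Avx512Bf16, Scalar };

// Ask the Linux kernel for permission to use AMX tile data.
// Returns false on other platforms, on kernels without AMX support,
// or when the request is rejected.
inline bool request_amx_tile_permission()
{
#if defined(__linux__) && defined(__x86_64__)
    constexpr long kArchReqXcompPerm = 0x1023; // ARCH_REQ_XCOMP_PERM
    constexpr long kXfeatureXtileData = 18;    // XFEATURE_XTILEDATA
    return syscall(SYS_arch_prctl, kArchReqXcompPerm, kXfeatureXtileData) == 0;
#else
    return false;
#endif
}

// Fallback chain described above: AMX, then AVX-512 BF16, then scalar.
// "compiled" = the kernel was built in; "usable" = the machine can run it.
inline Bf16Path choose_bf16_path(bool amx_compiled, bool amx_usable, bool avx512_compiled, bool avx512_usable)
{
    if (amx_compiled && amx_usable)
        return Bf16Path::Amx;
    if (avx512_compiled && avx512_usable)
        return Bf16Path::Avx512Bf16;
    return Bf16Path::Scalar;
}
```

Keeping the decision in a pure function like `choose_bf16_path` makes the fallback order testable on machines that have neither extension.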

You can control this with the following CMake options (non-MSVC builds):

- Default (try to enable when supported):
```bash
cmake -S . -B build-amx -DCMAKE_BUILD_TYPE=Release -DDISKANN_AMXBF16=ON
cmake --build build-amx -j
```
- Force disable:
```bash
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DDISKANN_AMXBF16=OFF
cmake --build build -j
```
- Force enable (fail configure if compiler does not support AMX flags):
```bash
cmake -S . -B build-amx -DCMAKE_BUILD_TYPE=Release -DDISKANN_FORCE_AMXBF16=ON
cmake --build build-amx -j
```

### AVX-512 vs AMX (build one or the other)

If you want to do a strict A/B build where only one ISA path is compiled/used, configure two separate build directories.

- AVX-512 BF16 only (no AMX codegen):
```bash
cmake -S . -B build-avx512 -DCMAKE_BUILD_TYPE=Release \
-DDISKANN_FORCE_AVX512BF16=ON \
-DDISKANN_AMXBF16=OFF
cmake --build build-avx512 -j
```

- AMX BF16 only (no AVX-512 BF16 code path):
```bash
cmake -S . -B build-amx -DCMAKE_BUILD_TYPE=Release \
-DDISKANN_FORCE_AMXBF16=ON \
-DDISKANN_AVX512BF16=OFF
cmake --build build-amx -j
```

Note: some toolchains and build scripts add a global `-march=native`. When AMX is disabled (`-DDISKANN_AMXBF16=OFF`), DiskANN explicitly compiles the AMX translation unit with `-mno-amx-tile`/`-mno-amx-bf16` (when the compiler supports them) so that AMX instructions are not emitted by accident.
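
One way to confirm which ISA extensions a given translation unit was actually compiled with is to inspect the predefined macros that GCC and Clang set for the corresponding `-m` flags. The helper below is a hypothetical sketch (the function name is not part of DiskANN); it reports the flags in effect for whichever file it is compiled in.

```cpp
#include <cassert>
#include <string>

// Lists the BF16-related ISA extensions enabled for this translation unit.
// GCC and Clang define these macros when the matching -m flag
// (or a -march value that implies it) is in effect.
inline std::string enabled_bf16_isa_flags()
{
    std::string out;
#if defined(__AVX512BF16__)
    out += "avx512bf16 ";
#endif
#if defined(__AMX_TILE__)
    out += "amx-tile ";
#endif
#if defined(__AMX_BF16__)
    out += "amx-bf16 ";
#endif
    return out.empty() ? std::string("none") : out;
}
```

In an AVX-512-only A/B build, a file compiled with the AMX flags turned off would be expected to report no `amx-*` entries even under a global `-march=native`.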

### RaBitQ main-search approximate scoring (optional, runtime-gated)

DiskANN also supports using RaBitQ multi-bit codes as the *main traversal approximate scorer* in SSD search (inside `PQFlashIndex::cached_beam_search`).

- Default behavior is unchanged: traversal uses the existing PQ distance lookup.
- When enabled, traversal scoring uses RaBitQ approximate inner product (converted to a “distance” as `-ip`) while keeping the rest of the search logic intact.
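
The `-ip` conversion exists because the search logic ranks candidates by "smaller is better". A minimal sketch, with hypothetical names (`Candidate`, `ip_to_distance`, `rank_by_approx_ip`) that are not taken from the DiskANN sources:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Candidate
{
    std::uint32_t id;
    float distance;
};

// A larger inner product means a better match, so negate it to obtain
// a score where smaller values are better.
inline float ip_to_distance(float approx_ip)
{
    return -approx_ip;
}

// Turns (id, approximate inner product) pairs into candidates sorted
// best-first by the converted distance.
inline std::vector<Candidate> rank_by_approx_ip(const std::vector<std::uint32_t> &ids,
                                                const std::vector<float> &approx_ips)
{
    assert(ids.size() == approx_ips.size());
    std::vector<Candidate> out;
    out.reserve(ids.size());
    for (std::size_t i = 0; i < ids.size(); ++i)
        out.push_back({ids[i], ip_to_distance(approx_ips[i])});
    std::sort(out.begin(), out.end(),
              [](const Candidate &a, const Candidate &b) { return a.distance < b.distance; });
    return out;
}
```

Because only the scoring function changes, any code that compares distances keeps working without modification.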

#### Runtime enable

Set:

```bash
export DISKANN_USE_RABITQ_MAIN_APPROX=1
```

If the environment variable is set but RaBitQ main codes are missing or incompatible, DiskANN prints a one-time message and automatically falls back to PQ.
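
The gating and fallback behavior might look roughly like the sketch below. The function and enum names are hypothetical, the message text is invented for illustration, and treating only the literal value `1` as "enabled" is an assumption; the actual parsing rules are not stated here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <iostream>

enum class ApproxScorer { PQ, RaBitQ };

// Assumption: only the literal value "1" enables the feature.
inline bool rabitq_main_requested()
{
    const char *v = std::getenv("DISKANN_USE_RABITQ_MAIN_APPROX");
    return v != nullptr && std::strcmp(v, "1") == 0;
}

// Picks the traversal scorer. If RaBitQ was requested but the codes are
// missing or incompatible, warn once and fall back to PQ.
inline ApproxScorer choose_scorer(bool requested, bool codes_usable)
{
    if (!requested)
        return ApproxScorer::PQ;
    if (codes_usable)
        return ApproxScorer::RaBitQ;
    static std::atomic<bool> warned{false};
    if (!warned.exchange(true))
        std::cerr << "RaBitQ main codes missing or incompatible; falling back to PQ" << std::endl;
    return ApproxScorer::PQ;
}
```

The atomic flag keeps the message to a single line of output even when many search threads hit the fallback at the same time.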

#### Main code file naming

Preferred sidecar file name:

```text
<index_filepath>_rabitq_main.bin
```

For example, if your SSD index file is `foo_disk.index`, the RaBitQ main code file should be `foo_disk.index_rabitq_main.bin`.

#### Generating main codes during disk index build

You can generate the main-search sidecar automatically as part of disk index build:

```bash
./build/apps/build_disk_index \
... \
--dist_fn mips \
--build_rabitq_main_codes \
--rabitq_nb_bits 4
```

This produces:

```text
<index_path_prefix>_disk.index_rabitq_main.bin
```
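
The build-side output name and the search-side lookup name agree because both are simple suffixes. A small sketch with hypothetical helper names:

```cpp
#include <cassert>
#include <string>

// Search side: the sidecar sits next to the SSD index file.
inline std::string rabitq_main_path_for_index(const std::string &index_filepath)
{
    return index_filepath + "_rabitq_main.bin";
}

// Build side: the disk index is written to <prefix>_disk.index,
// and the sidecar name is derived from that file name.
inline std::string rabitq_main_path_for_prefix(const std::string &index_path_prefix)
{
    return rabitq_main_path_for_index(index_path_prefix + "_disk.index");
}
```

For an index built with prefix `foo`, both helpers resolve to `foo_disk.index_rabitq_main.bin`.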

#### Constraints

- Currently supported only for `dist_fn=mips` / `Metric::INNER_PRODUCT`.
- The RaBitQ code `dim` must match the index `_data_dim` (post any preprocessing/augmentation), otherwise main-search RaBitQ is disabled and the search falls back to PQ.
- Run the `search_disk_index` and `build_disk_index` binaries from a build directory that was compiled with this feature; older binaries ignore the sidecar file.
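
The first two constraints amount to a compatibility check at load time. The sketch below is an assumption-laden illustration: `RaBitQMainInfo` and `rabitq_main_usable` are hypothetical, the `Metric` enum is a stand-in for `diskann::Metric`, and the `nb_bits` range check mirrors the `1..9` range given in the `--rabitq_nb_bits` help text.

```cpp
#include <cassert>
#include <cstdint>

enum class Metric { L2, INNER_PRODUCT }; // stand-in for diskann::Metric

// Hypothetical summary of what a loaded sidecar file declares about itself.
struct RaBitQMainInfo
{
    std::uint64_t dim;
    std::uint32_t nb_bits;
};

// Returns true only if RaBitQ main-search scoring may be used;
// in every other case the caller is expected to fall back to PQ.
inline bool rabitq_main_usable(Metric metric, std::uint64_t index_data_dim, const RaBitQMainInfo *codes)
{
    if (metric != Metric::INNER_PRODUCT)
        return false; // only MIPS / inner product is supported
    if (codes == nullptr)
        return false; // sidecar missing
    if (codes->nb_bits < 1 || codes->nb_bits > 9)
        return false; // outside the documented bit range
    return codes->dim == index_data_dim; // must match the post-preprocessing dimension
}
```

Note that `index_data_dim` is the dimension after any preprocessing or augmentation, so codes generated from the raw input dimension would fail this check.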

## Windows build:

The Windows version has been tested with Enterprise editions of Visual Studio 2022, 2019 and 2017. It should work with the Community and Professional editions as well without any changes.
1 change: 1 addition & 0 deletions apps/CMakeLists.txt
@@ -22,6 +22,7 @@ target_link_libraries(search_disk_index ${PROJECT_NAME} ${DISKANN_ASYNC_LIB} ${D
add_executable(range_search_disk_index range_search_disk_index.cpp)
target_link_libraries(range_search_disk_index ${PROJECT_NAME} ${DISKANN_ASYNC_LIB} ${DISKANN_TOOLS_TCMALLOC_LINK_OPTIONS} Boost::program_options)


add_executable(test_streaming_scenario test_streaming_scenario.cpp)
target_link_libraries(test_streaming_scenario ${PROJECT_NAME} ${DISKANN_TOOLS_TCMALLOC_LINK_OPTIONS} Boost::program_options)

88 changes: 85 additions & 3 deletions apps/build_disk_index.cpp
@@ -10,9 +10,63 @@
#include "index.h"
#include "partition.h"
#include "program_options_utils.hpp"
#include "bfloat16.h"

namespace po = boost::program_options;

static int convert_bf16_bin_to_f32_bin(const std::string &bf16_path, const std::string &f32_path)
{
std::ifstream reader(bf16_path, std::ios::binary);
if (!reader)
{
diskann::cerr << "Error: could not open input file " << bf16_path << std::endl;
return -1;
}
std::ofstream writer(f32_path, std::ios::binary);
if (!writer)
{
diskann::cerr << "Error: could not open output file " << f32_path << std::endl;
return -1;
}

uint32_t npts = 0, dim = 0;
reader.read(reinterpret_cast<char *>(&npts), sizeof(uint32_t));
reader.read(reinterpret_cast<char *>(&dim), sizeof(uint32_t));
if (!reader)
{
diskann::cerr << "Error: failed to read header from " << bf16_path << std::endl;
return -1;
}
writer.write(reinterpret_cast<const char *>(&npts), sizeof(uint32_t));
writer.write(reinterpret_cast<const char *>(&dim), sizeof(uint32_t));

constexpr size_t kBlockElems = 1u << 20; // 1M elements (~2MB bf16, ~4MB float)
std::vector<diskann::bfloat16> in_buf;
std::vector<float> out_buf;
in_buf.resize(kBlockElems);
out_buf.resize(kBlockElems);

const uint64_t total_elems = static_cast<uint64_t>(npts) * static_cast<uint64_t>(dim);
uint64_t done = 0;
while (done < total_elems)
{
const size_t this_block = static_cast<size_t>(std::min<uint64_t>(kBlockElems, total_elems - done));
reader.read(reinterpret_cast<char *>(in_buf.data()), this_block * sizeof(diskann::bfloat16));
if (!reader)
{
diskann::cerr << "Error: failed reading bf16 payload from " << bf16_path << std::endl;
return -1;
}
for (size_t i = 0; i < this_block; i++)
{
out_buf[i] = static_cast<float>(in_buf[i]);
}
writer.write(reinterpret_cast<const char *>(out_buf.data()), this_block * sizeof(float));
done += this_block;
}
return 0;
}

int main(int argc, char **argv)
{
std::string data_type, dist_fn, data_path, index_path_prefix, codebook_prefix, label_file, universal_label,
@@ -21,6 +75,8 @@ int main(int argc, char **argv)
float B, M;
bool append_reorder_data = false;
bool use_opq = false;
bool build_rabitq_main_codes = false;
uint32_t rabitq_nb_bits = 4;

po::options_description desc{
program_options_utils::make_program_description("build_disk_index", "Build a disk-based index.")};
@@ -63,6 +119,14 @@ int main(int argc, char **argv)
optional_configs.add_options()("append_reorder_data", po::bool_switch()->default_value(false),
"Include full precision data in the index. Use only in "
"conjunction with compressed data on SSD.");

optional_configs.add_options()(
"build_rabitq_main_codes", po::bool_switch()->default_value(false),
"Generate RaBitQ main-search codes sidecar file (<index>_disk.index_rabitq_main.bin). "
"Only meaningful for dist_fn=mips.");

optional_configs.add_options()("rabitq_nb_bits", po::value<uint32_t>(&rabitq_nb_bits)->default_value(4),
"Bits per dimension for RaBitQ codes (1..9)");
optional_configs.add_options()("build_PQ_bytes", po::value<uint32_t>(&build_PQ)->default_value(0),
program_options_utils::BUIlD_GRAPH_PQ_BYTES);
optional_configs.add_options()("use_opq", po::bool_switch()->default_value(false),
@@ -94,6 +158,8 @@ int main(int argc, char **argv)
append_reorder_data = true;
if (vm["use_opq"].as<bool>())
use_opq = true;
if (vm["build_rabitq_main_codes"].as<bool>())
build_rabitq_main_codes = true;
}
catch (const std::exception &ex)
{
Expand Down Expand Up @@ -124,10 +190,11 @@ int main(int argc, char **argv)
<< std::endl;
return -1;
}
-if (data_type != std::string("float"))
+if (data_type != std::string("float") && data_type != std::string("bf16") &&
+data_type != std::string("bfloat16"))
{
std::cout << "Error: Appending data for reordering currently only "
-"supported for float data type."
+"supported for float/bf16 data type."
<< std::endl;
return -1;
}
@@ -137,7 +204,9 @@ int main(int argc, char **argv)
std::string(std::to_string(B)) + " " + std::string(std::to_string(M)) + " " +
std::string(std::to_string(num_threads)) + " " + std::string(std::to_string(disk_PQ)) + " " +
std::string(std::to_string(append_reorder_data)) + " " +
-std::string(std::to_string(build_PQ)) + " " + std::string(std::to_string(QD));
+std::string(std::to_string(build_PQ)) + " " + std::string(std::to_string(QD)) + " " +
+std::string(std::to_string(build_rabitq_main_codes)) + " " +
+std::string(std::to_string(rabitq_nb_bits));

try
{
@@ -155,6 +224,12 @@ int main(int argc, char **argv)
return diskann::build_disk_index<float, uint16_t>(
data_path.c_str(), index_path_prefix.c_str(), params.c_str(), metric, use_opq, codebook_prefix,
use_filters, label_file, universal_label, filter_threshold, Lf);
else if (data_type == std::string("bf16") || data_type == std::string("bfloat16"))
{
return diskann::build_disk_index<diskann::bfloat16, uint16_t>(
data_path.c_str(), index_path_prefix.c_str(), params.c_str(), metric, use_opq, codebook_prefix,
use_filters, label_file, universal_label, filter_threshold, Lf);
}
else
{
diskann::cerr << "Error. Unsupported data type" << std::endl;
@@ -175,6 +250,13 @@ int main(int argc, char **argv)
return diskann::build_disk_index<float>(data_path.c_str(), index_path_prefix.c_str(), params.c_str(),
metric, use_opq, codebook_prefix, use_filters, label_file,
universal_label, filter_threshold, Lf);
else if (data_type == std::string("bf16") || data_type == std::string("bfloat16"))
{
return diskann::build_disk_index<diskann::bfloat16>(data_path.c_str(), index_path_prefix.c_str(),
params.c_str(), metric, use_opq, codebook_prefix,
use_filters, label_file, universal_label,
filter_threshold, Lf);
}
else
{
diskann::cerr << "Error. Unsupported data type" << std::endl;
25 changes: 25 additions & 0 deletions apps/build_rabitq_reorder_codes.cpp
@@ -0,0 +1,25 @@
#include <boost/program_options.hpp>

#include <cassert>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

#include "rabitq.h"
#include "utils.h"

namespace po = boost::program_options;

namespace
{
#pragma pack(push, 1)
struct RaBitQReorderHeader
{
char magic[8];
uint32_t version;
uint32_t metric;
uint32_t nb_bits;
#error "build_rabitq_reorder_codes has been removed (RaBitQ reorder prefilter deprecated). Use build_disk_index with --build_rabitq_main_codes instead."
uint64_t num_points;