Home of the LanceDB documentation. Built using Mintlify.
Install the Mintlify CLI to preview documentation changes locally:

```bash
npm i -g mintlify
```

Run the following commands at the root of the documentation (`/docs/` in this repo, where `docs.json` is located):
```bash
cd docs
mint dev
```

Check for broken links (applies to internal links within this docs site only):
```bash
mint broken-links
```

To generate snippets, use `uv` to sync your local Python environment so that you can run the Python script described below:
```bash
uv sync
```

The Python, TypeScript and Rust code snippets used in the documentation are tested prior to use in the docs. These tests live in the `tests/` directory. Run the tests locally for each language when building the docs locally.
MDX snippets are generated by a separate script, `scripts/mdx_snippets_gen.py`, because Mintlify cannot scan the contents of raw code files -- it requires that snippets live in MDX files under the `snippets/` directory.
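As a rough sketch of what such a generator does (the helper below is hypothetical, not the actual `scripts/mdx_snippets_gen.py` API), it wraps a tested source file in a fenced code block inside an `.mdx` file that Mintlify can import:

```python
from pathlib import Path

def code_to_mdx_snippet(src: Path, out_dir: Path, lang: str) -> Path:
    """Hypothetical sketch: wrap a tested code file in a fenced code
    block inside an .mdx file under the snippets directory."""
    out_dir.mkdir(parents=True, exist_ok=True)
    code = src.read_text().rstrip()
    # Fence the raw code so Mintlify can render it when imported
    mdx = f"```{lang}\n{code}\n```\n"
    out = out_dir / f"{src.stem}.mdx"
    out.write_text(mdx)
    return out
```

The real script handles more than this (multiple languages, naming conventions), so treat the above only as a mental model of the raw-code-to-MDX step.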
A Makefile provides convenience targets that run the snippet generation for each language:
```bash
# Generate snippets for each language, one by one
make py
make ts
make rs

# Or, generate them for all languages in one command
make snippets
```

The generated snippets are placed in the appropriate file under the `/docs/snippets/` directory, making them available for importing in the corresponding docs page.
The following sequence of steps is run:

- Run the tests for the `py`, `ts` and `rs` files that contain your new code, and verify that they pass locally.
- Generate MDX snippets via the `make snippets` command.
- Import the MDX snippets in the corresponding MDX docs page.
- Include each MDX snippet as a parameter inside a `<CodeBlock>` JSX component in Mintlify.
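The import-and-include steps above can be sketched as an MDX fragment; everything here (the snippet name, path, and how the component is wired up) is a placeholder, so copy the pattern from an existing docs page rather than from this sketch:

```mdx
{/* Hypothetical sketch: snippet path and component usage are placeholders */}
import ExamplePySnippet from '/snippets/example_py.mdx';

<CodeBlock>
  <ExamplePySnippet />
</CodeBlock>
```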
Creating and using snippets for code blocks in the MDX files ensures that the code we put in front of users has been tested against recent LanceDB releases.
> **Note**
>
> As far as possible, do not add code snippets manually inside triple backticks! Write the tests for the required language in the `tests/*` directory, then generate the snippets programmatically via the Makefile commands.
The Datasets tab is populated from `lance-format/lance-huggingface`, the master repository where each Lance dataset published under the `lance-format` Hugging Face organization has its own directory containing an `HF_DATASET_CARD.md`. That same file is pushed to the Hub as the dataset's `README.md` via the `hf` CLI, so the GitHub repo is the single source of truth for the content of every dataset card.
To avoid maintaining the same content in two places, the per-dataset MDX pages under `docs/datasets/` are generated from those upstream cards via `scripts/sync_hf_datasets.py`. The script:

- Reads `scripts/hf_datasets.yaml`, which lists every dataset to publish and maps the upstream directory name, the URL slug, the HF Hub repo, and the human-readable title.
- Fetches each `HF_DATASET_CARD.md` from `lance-format/lance-huggingface` on GitHub.
- Rewrites the frontmatter for Mintlify (sets `title`, `sidebarTitle`, `description`), strips the upstream H1, injects a "View on Hugging Face" card at the top, and sanitizes known MDX hazards (BibTeX citations outside code fences, literal `<>` in prose).
- Writes `docs/datasets/<slug>.mdx`, regenerates the card grid in `docs/datasets/index.mdx` between the `HF_SYNC:START`/`HF_SYNC:END` markers, and updates the Datasets tab in `docs/docs.json` to keep the sidebar in sync.
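For a concrete feel of the rewrite step, here is a minimal, hypothetical sketch of the frontmatter rewrite and H1 strip; the real logic lives in `scripts/sync_hf_datasets.py` and handles more cases (the card injection, MDX sanitization, and so on):

```python
import re

def rewrite_card(card_text: str, title: str, description: str) -> str:
    """Hypothetical sketch of the frontmatter rewrite: strip the
    upstream H1 and prepend Mintlify-style frontmatter."""
    # Drop the first H1 heading, if present
    body = re.sub(r"^#\s+.*\n", "", card_text, count=1, flags=re.MULTILINE)
    frontmatter = (
        "---\n"
        f'title: "{title}"\n'
        f'sidebarTitle: "{title}"\n'
        f'description: "{description}"\n'
        "---\n"
    )
    return frontmatter + body
```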
Run it from the repo root:

```bash
make hf-sync
```

To add a new dataset:

- Author the new dataset's `HF_DATASET_CARD.md` upstream in `lance-format/lance-huggingface` (and push it to the Hub as usual).
- Add a single line for the dataset under the appropriate category in `scripts/hf_datasets.yaml`. The four fields (`dir`, `slug`, `hf`, `title`) are explicit because the GitHub directory name, the HF Hub repo slug, and the desired URL slug don't follow a derivable convention.
- Run `make hf-sync`. The script will fetch the new card, generate `docs/datasets/<slug>.mdx`, refresh the landing-page card grid, and add the new page to the Datasets tab in `docs/docs.json`.
- Preview locally with `mint dev` and commit the changes (the MDX page, the regenerated `index.mdx`, the updated `docs.json`, and the new yaml entry).
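As an illustration, a new yaml entry might look like the following; the values and surrounding structure here are hypothetical, so follow the layout of existing entries in `scripts/hf_datasets.yaml`:

```yaml
# Hypothetical entry: copy the layout of existing entries
- dir: my-new-dataset               # directory in lance-format/lance-huggingface
  slug: my-new-dataset              # URL slug, becomes docs/datasets/my-new-dataset.mdx
  hf: lance-format/my-new-dataset   # Hugging Face Hub repo
  title: My New Dataset             # human-readable title for the sidebar and card
```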
If you remove a dataset from the yaml, the next `make hf-sync` will delete its MDX file and drop the sidebar entry. The script hard-fails on any fetch error; partial regeneration would be worse than a clear error.