A Vue 3 web application for the clinical interpretation of genetic variants from exome sequencing VCF files. It maps raw genomic variants onto protein structures so clinicians and researchers can visually assess pathogenicity, functional impact, and clinical relevance — through a single, modern clinical-genomics workspace.
Live demo · Published in Bioinformatics (2019) · Open-access full text (PMC) · Original instance (LIIGH-UNAM)
VCF/Plotein is a peer-reviewed clinical genomics tool I developed at the Cancer Genetics & Bioinformatics Lab, LIIGH-UNAM (Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México), Querétaro, Mexico. The accompanying research was published in Bioinformatics (Oxford University Press) in 2019.
Ossio, R., Garcia-Salinas, O.I., Anaya-Mancilla, D.S., Garcia-Sotelo, J.S., Aguilar, L.A., Adams, D.J., Robles-Espinoza, C.D. (2019). VCF/Plotein: visualization and prioritization of genomic variants from human exome sequencing projects. Bioinformatics, 35(22), 4803–4805. Oxford University Press. DOI: 10.1093/bioinformatics/btz458
The original tool was built in 2018–2019 on Nuxt 2 / Vue 2 / Webpack / node-sass / Bootstrap-Vue — a toolchain that no longer installs or builds on a current Node.js. This branch is later, independent work on the codebase: it first brought the project up to a present-day front-end stack, and then carried out a full UI/UX redesign on top of it.
- Nuxt 2 / Vue 2 → Vue 3 + Vite 5, using the Composition API with `<script setup>`; dev server cold-start ~0.2s.
- Vuex → Pinia, Nuxt file-routing → Vue Router 4.
- node-sass → Tailwind CSS v4 — the UI chrome rebuilt with Tailwind; native Sass compilation removed.
- Bootstrap-Vue → hand-rolled Tailwind components (data table, pagination, file input).
- D3 v5 → D3 v7 — including the v6 event-handler API change across the lollipop renderer.
- Expired-certificate proxy for the companion backend — see Architecture highlights.
With the stack current, the interface was redesigned from the dated 2019-era layout into a modern, light clinical-genomics platform:
- Design system. A light theme — white cards on a soft canvas — with clinical semantic colors. Typography pairs Schibsted Grotesk for the UI with IBM Plex Mono for genomic data (positions, coordinates, amino-acid changes).
- Unified information architecture. The two disconnected pages (an upload wizard, then a separate graph page) were merged into one workspace shell: a top bar plus a persistent, collapsible inspector sidebar with accordion sections (dataset, transcript, consequence / protein-domain / clinical-database / sample filters, and bookmarks). The main area shows a gene browser when no gene is selected and the plot workspace once one is.
- Rewritten lollipop plot. The D3 plot now supports zoom and pan along the protein axis, hover tooltips, click-to-select a variant, domain hover, a redesigned database-presence track, and smooth transitions.
- Gene context that stays visible. The cramped readout strip became a gene header card with stat tiles, and variant detail moved into a docked side panel that keeps the gene in view while a variant is inspected.
- Responsive and stateful. The sidebar collapses into an overlay drawer on narrow viewports, and the app has proper empty and loading states throughout.
- Bug fix. A transcript-switching bug — caused by passing an object through a router query parameter — was fixed.
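The class of bug named in the last item can be reproduced outside Vue Router. The sketch below uses the standard `URLSearchParams` API as a stand-in (an assumption, not the app's router code) to show how an object placed in a query string is coerced to the literal text `[object Object]`, and how passing a plain identifier avoids that. The transcript ID and field names are hypothetical placeholders.

```javascript
// Hypothetical transcript object; the ID is a placeholder, not a real transcript.
const transcript = { id: 'ENST00000000000', biotype: 'protein_coding' };

// Buggy pattern: the object is coerced with String(), so its fields are lost.
const broken = new URLSearchParams({ transcript }).toString();

// Safer pattern: pass only a primitive identifier and look the object up again
// on the receiving side.
const fixed = new URLSearchParams({ transcript: transcript.id }).toString();

console.log(broken); // transcript=%5Bobject+Object%5D
console.log(fixed);  // transcript=ENST00000000000
```

Keeping only primitives in the URL also makes the route shareable and reload-safe, since nothing depends on an in-memory object.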
A later pass made the tool hold up under real-world data and tightened the interaction model:
- Streaming, worker-based VCF parsing. The file is read in slices and parsed in a Web Worker. Multi-gigabyte whole-genome VCFs — which previously overran a single JavaScript string and froze or crashed the tab — now parse off the main thread behind a live progress bar. The parsed variant index stays resident in the worker, so selecting a gene is a message round-trip rather than a full re-parse.
- Lighter bundle. The ~14 MB of gene-coordinate JSON moved out of the build into runtime fetches of plain `.json`; only the genome build a dataset needs is downloaded.
- Resilient annotation. Ensembl VEP is queried in bounded-concurrency batches with exponential-backoff retry on rate-limit responses, and companion-database lookups degrade gracefully when the upstream is unavailable.
- Variant navigator. The variant-detail panel became a persistent master-detail column — a click-friendly list of the variants in the plot when none is selected, the full detail when one is. Making the column permanent also removed a layout shift that nudged the plot and caused mis-clicks.
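The "resilient annotation" pattern above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: `withBackoff` and `mapLimited` are hypothetical helper names, the task is assumed to resolve to an object with a `status` field, and HTTP 429 is treated as the rate-limit signal.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry `task` with exponential backoff while it reports a rate limit (429).
// `retries` counts the attempts made after the first one.
async function withBackoff(task, { retries = 4, baseMs = 200, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    const res = await task();
    if (res.status !== 429 || attempt >= retries) return res;
    await sleep(baseMs * 2 ** attempt); // 200, 400, 800, ... ms
  }
}

// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`.
async function mapLimited(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function lane() {
    while (next < items.length) {
      const i = next++; // claim the next index before awaiting
      results[i] = await worker(items[i], i);
    }
  }
  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}
```

In use, each batch of variants would become one item, and the worker would wrap a single annotation request in `withBackoff`. Injecting `sleep` keeps the retry schedule testable without real delays.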
The Variant Call Format (VCF) is the standard file for storing DNA sequence variations—SNPs, insertions, deletions, and structural variants—generated by next-generation sequencing. A single exome can yield tens of thousands of variants. Identifying the handful that are clinically actionable requires integrating genomic coordinates with gene annotations, protein domains, population frequencies, and pathogenicity predictions. This tool automates that pipeline in the browser.
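For orientation, a VCF data line is tab-separated, with eight fixed columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) ahead of any per-sample columns, and header lines start with `#`. The sketch below parses one such line; `parseVcfLine` is a hypothetical helper, the example record is made up, and the code assumes a well-formed line rather than mirroring the app's actual parser.

```javascript
// Parse one tab-separated VCF data line into its eight fixed columns.
// Returns null for header lines (starting with '#') and for empty lines.
function parseVcfLine(line) {
  if (!line || line.startsWith('#')) return null;
  const [chrom, pos, id, ref, alt, qual, filter, info] = line.split('\t');
  return {
    chrom,
    pos: Number(pos),                         // 1-based reference position
    id: id === '.' ? null : id,               // '.' marks a missing value
    ref,
    alt: alt.split(','),                      // multiple ALT alleles are comma-separated
    qual: qual === '.' ? null : Number(qual),
    filter,
    info,                                     // left as the raw key=value string
  };
}

// Made-up record for illustration only.
const example = parseVcfLine('3\t52400000\t.\tG\tA,T\t50\tPASS\tDP=100');
console.log(example.chrom, example.pos, example.alt);
```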
- Upload & parse: Accepts `.vcf`, `.vcf.gz`, or saved `.json` bookmarks directly in the browser. Large files are streamed and parsed in a background worker, so even multi-gigabyte VCFs load without freezing the interface.
- Gene extraction: Uses the reference genome (GRCh37/hg19 or GRCh38) to map variant positions to coding genes, then lists them in a searchable gene browser.
- Annotation: Queries the Ensembl VEP REST API to annotate consequences, amino-acid changes, protein domains, and transcript structures.
- Clinical cross-referencing: Checks variant presence in ClinVar, COSMIC, dbSNP, and gnomAD via a companion API.
- Pathogenicity scoring: Displays SIFT and PolyPhen predictions for missense variants.
- Protein visualization: Renders an interactive D3.js lollipop plot that maps variants onto annotated protein domains, with zoom/pan along the protein axis, hover tooltips, click-to-select variants, and a database-presence track.
- Filtering & exploration: Filter by transcript, consequence type, protein domain, sample, and clinical-database presence from the inspector sidebar; toggle between plot and tabular views.
- Export & bookmarks: Save sessions as JSON bookmarks, export tables as CSV, and download plots as SVG or PNG.
The whole flow lives in a single workspace: a top bar, a persistent inspector sidebar, and a main area that switches between the gene browser and the plot workspace.
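The gene-extraction step boils down to an overlap query: which gene intervals contain a given position? The project describes an interval tree for this. The sketch below deliberately swaps in a simpler structure (an array sorted by start, plus running maxima of the end coordinate) that answers the same query for one chromosome. Function names, gene names, and coordinates are all hypothetical.

```javascript
// Build a lookup index for the genes of ONE chromosome.
// Each gene is { name, start, end } with inclusive coordinates.
function buildGeneIndex(genes) {
  const sorted = [...genes].sort((a, b) => a.start - b.start);
  const maxEnd = []; // maxEnd[i] = largest end among sorted[0..i]
  let running = -Infinity;
  for (const g of sorted) {
    running = Math.max(running, g.end);
    maxEnd.push(running);
  }
  return { sorted, maxEnd };
}

// Return the names of all genes whose interval contains `pos`.
function genesAt(index, pos) {
  const { sorted, maxEnd } = index;
  // Binary search: count the genes whose start is <= pos.
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid].start <= pos) lo = mid + 1;
    else hi = mid;
  }
  // Walk left; stop once no earlier gene can still reach `pos`.
  const hits = [];
  for (let i = lo - 1; i >= 0 && maxEnd[i] >= pos; i--) {
    if (sorted[i].end >= pos) hits.push(sorted[i].name);
  }
  return hits.reverse();
}

const index = buildGeneIndex([
  { name: 'GENE_A', start: 100, end: 200 },
  { name: 'GENE_B', start: 150, end: 400 },
  { name: 'GENE_C', start: 300, end: 350 },
  { name: 'GENE_D', start: 500, end: 600 },
]);
console.log(genesAt(index, 320)); // [ 'GENE_B', 'GENE_C' ]
```

A real mapper would keep one such index per chromosome and per genome build. An interval tree gives tighter worst-case bounds when many long genes overlap.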
- Framework: Vue 3 (Composition API, `<script setup>`)
- Build tool: Vite 5
- State & routing: Pinia, Vue Router 4
- Styling: Tailwind CSS v4 — light clinical design system; Schibsted Grotesk (UI) + IBM Plex Mono (genomic data)
- Visualization: D3.js v7 — interactive lollipop plot with zoom/pan, tooltips, selection, protein domains, and database-presence tracks
- Production server: zero-dependency Node.js server (`server/index.js`) acting as static host and API proxy
- Genomics utilities: `pako` (gzip), `@gmod/bgzf-filehandle`, an interval-tree gene mapper
- Raw genomic data never leaves the browser. VCF and `.vcf.gz` files are decompressed and parsed entirely client-side, in a Web Worker. This is a deliberate privacy choice: exome data is protected health information (PHI), and clinical labs should not have to upload it to a third party. The parser streams the file in slices and decodes incrementally (with bgzf-aware gunzip for `.vcf.gz`), so a multi-gigabyte VCF never has to fit in a single string or array, and the parse stays off the UI thread.
- Companion-backend proxy. The variant-database lookups (ClinVar/COSMIC/dbSNP/gnomAD) are served by a backend at LIIGH-UNAM whose TLS certificate has expired, so browsers refuse to call it directly. The app instead requests a relative `/api/*` path; both the Vite dev server and the production Node server (`server/index.js`) reverse-proxy those calls to the upstream, transparently bypassing the expired certificate. The browser only ever talks to a valid-certificate origin.
- Gene reference data fetched on demand. The GRCh37 and GRCh38 gene-coordinate tables (~14 MB) are served as plain JSON and fetched at runtime. Only the genome build a dataset actually needs is downloaded; it is parsed by the browser as data rather than evaluated as a JavaScript module, and it stays out of the application bundle entirely.
- Interval-tree gene mapping. Variant positions are matched to coding genes via a lazily-built interval tree, with a priority queue assisting coordinate partitioning. This runs inside the parsing worker, so even a whole-genome VCF's millions of positions map without blocking the UI.
- D3 v7 lollipop rendering. Variants are drawn along the protein sequence and overlaid on annotated protein domains in a single coherent drawing pass that keeps zoom, transitions and selection state in sync. The plot supports zoom and pan along the protein axis, hover tooltips, click-to-select, domain hover, keyboard navigation, and SVG/PNG export — the latter resolving the design-system palette to literal colors so a standalone exported SVG renders correctly without a stylesheet.
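The slice-by-slice parsing described in the first highlight depends on two details: bytes must be decoded incrementally, and a line that straddles two slices must be carried over. The sketch below shows only that part, for uncompressed input; gzip/bgzf handling and the Web Worker wiring are omitted. `LineSplitter` and `streamLines` are hypothetical names, and the 1 MiB slice size is an arbitrary assumption.

```javascript
// Incrementally turns byte chunks into complete text lines.
class LineSplitter {
  constructor(onLine) {
    this.onLine = onLine;
    this.decoder = new TextDecoder('utf-8');
    this.carry = ''; // unfinished line left over from the previous chunk
  }

  push(bytes) {
    // stream: true buffers a multi-byte character that is split across chunks.
    const text = this.carry + this.decoder.decode(bytes, { stream: true });
    const lines = text.split('\n');
    this.carry = lines.pop(); // the last piece may be an incomplete line
    for (const line of lines) this.onLine(line);
  }

  end() {
    // Flush the decoder and emit a final line that had no trailing newline.
    const rest = this.carry + this.decoder.decode();
    this.carry = '';
    if (rest) this.onLine(rest);
  }
}

// Read a Blob (or File) slice by slice, so that only one slice is
// held in memory as text at a time.
async function streamLines(blob, onLine, sliceBytes = 1 << 20) {
  const splitter = new LineSplitter(onLine);
  for (let offset = 0; offset < blob.size; offset += sliceBytes) {
    const buf = await blob.slice(offset, offset + sliceBytes).arrayBuffer();
    splitter.push(new Uint8Array(buf));
  }
  splitter.end();
}
```

In a browser, `streamLines(file, handleLine)` could run inside a worker, with `handleLine` feeding a per-line parser and posting progress as `offset / blob.size`.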
    npm install
    npm run dev      # Vite dev server with hot reload at localhost:3000
    npm run build    # production build → dist/
    npm start        # serve dist/ + proxy /api → node server/index.js
    npm run lint     # ESLint

No special toolchain is required: the project builds on current Node.js. In development, the Vite dev server proxies `/api/*` to the LIIGH-UNAM companion backend automatically; the Ensembl REST API is called directly.
Live on Railway at vcfplotein-production.up.railway.app. The production artifact is the dist/ build served by server/index.js — a zero-dependency Node server that serves the SPA and reverse-proxies /api/* to the companion backend (transparently bypassing the upstream's expired TLS certificate). Any Node host works:
- Build command: `npm run build`
- Start command: `npm start`
The server reads PORT from the environment.
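To make the deployment model concrete, here is a rough sketch of how a zero-dependency Node server can read `PORT` and reverse-proxy `/api/*` to an upstream with an expired certificate. It is an illustration under assumptions, not the contents of `server/index.js`: `UPSTREAM_HOST` is a placeholder, the fallback port of 3000 is a guess, the request path is assumed to be forwarded unchanged, and `serveStatic` stands for whatever handler serves `dist/`.

```javascript
const http = require('node:http');
const https = require('node:https');

// Placeholder host (assumption); the real companion backend address is not shown here.
const UPSTREAM_HOST = 'companion.example.org';

// Read the port from the environment, falling back to 3000 (assumed default).
function resolvePort(env) {
  const port = Number.parseInt(env.PORT ?? '', 10);
  return Number.isInteger(port) && port > 0 ? port : 3000;
}

// Only paths under /api are proxied; everything else is static content.
function isApiRequest(url) {
  return url === '/api' || url.startsWith('/api/');
}

function proxyToUpstream(req, res) {
  const upstream = https.request(
    {
      host: UPSTREAM_HOST,
      path: req.url,
      method: req.method,
      headers: { ...req.headers, host: UPSTREAM_HOST },
      // Disables certificate verification for this upstream connection only,
      // which is what lets a request through despite the expired certificate.
      rejectUnauthorized: false,
    },
    (up) => {
      res.writeHead(up.statusCode ?? 502, up.headers);
      up.pipe(res);
    }
  );
  upstream.on('error', () => {
    // Degrade gracefully when the upstream cannot be reached.
    if (!res.headersSent) res.writeHead(502, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ error: 'upstream unavailable' }));
  });
  req.pipe(upstream);
}

// `serveStatic` is a hypothetical (req, res) handler for the built SPA.
function createServer(serveStatic) {
  return http.createServer((req, res) =>
    isApiRequest(req.url) ? proxyToUpstream(req, res) : serveStatic(req, res)
  );
}
```

Starting it would then be `createServer(serveStatic).listen(resolvePort(process.env))`. Because verification is switched off only on the server-to-upstream hop, the browser still sees a normally validated origin.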
public/bap1-sample.vcf is a small BAP1 variant set for trying the upload flow. The built-in Demo button loads a pre-computed BAP1 dataset with no file needed.
The BAP1 demo dataset rendered as an interactive D3.js lollipop plot — variants mapped onto the protein's domains, with ClinVar / COSMIC / dbSNP / gnomAD presence tracks:
Precision medicine depends on turning raw sequencing data into interpretable insights. By visualizing how exome variants map to protein structure and combining them with clinical and pathogenicity databases, this tool reduces the manual burden on molecular biologists and supports faster, more informed clinical decisions.
The VCF/Plotein web application — its interface, data visualization and architecture — was designed and written by Diego Said Anaya Mancilla. The original 2018–2019 implementation and the 2026 ground-up rewrite to a modern Vue 3 / Vite stack (documented above) are both his work.
The application was developed at the Cancer Genetics & Bioinformatics Lab, LIIGH-UNAM (Querétaro, Mexico), where the surrounding research was carried out and published in Bioinformatics (2019) — see Published research. That publication is credited to the full lab team.
Released under the MIT License. Copyright 2018 Carla Daniela Robles Espinoza (LIIGH-UNAM).

