18 commits
fb1871e
feat: foundation - kherud fork (Tier 0) + repo scaffolding (Tier 1)
aksOps May 8, 2026
61afd38
checkpoint: pre-yolo 2026-05-09T00:23:00
aksOps May 9, 2026
2025bdf
checkpoint: pre-yolo 2026-05-09T00:34:13
aksOps May 9, 2026
9ef8088
checkpoint: pre-yolo 2026-05-09T00:41:21
aksOps May 9, 2026
a9e5304
fix(scripts): drop extraneous f prefix on manifest header (ruff F541)
aksOps May 9, 2026
8274ba9
style(inference-sdk-embed): apply google-java-format via spotless
aksOps May 9, 2026
61b023a
fix(java): drop empty inference-sdk-embed-bge-small from reactor
aksOps May 9, 2026
3119e23
fix(ci): point java-ci at java/pom.xml + use full plugin coords
aksOps May 9, 2026
a00e952
refactor: drop kherud-fork; consume published de.kherud:llama:4.2.0 (…
aksOps May 9, 2026
cf1de32
feat(generate,qwen): Tier 4.A generate module + Tier 4.B Qwen JAR shell
aksOps May 9, 2026
058f4be
feat(bundle): Tier 4.C inference-sdk-bundle fat JAR
aksOps May 9, 2026
b1b7d80
feat(embed): add inference-sdk-embed-bge-small JAR shell + bundle dep
aksOps May 9, 2026
2e3d009
feat(tests,examples): Tier 5 - integration tests + quickstart + LIBRA…
aksOps May 9, 2026
d294738
feat(build): hybrid model-distribution - fetch at build, embed in JAR
aksOps May 9, 2026
5b75295
chore(docs,build): hybrid retrofit stragglers - CONTRIBUTING + qwen pom
aksOps May 9, 2026
65eed55
checkpoint: pre-yolo 2026-05-09T01:58:26
aksOps May 9, 2026
81225e3
fix(ci): mark NativeLibLoaderTest fixtures binary; restore corrupted …
aksOps May 9, 2026
a7cd6fe
fix(ci): split PR verify (fast, fetch-skipped) from package-artifacts
aksOps May 9, 2026
21 changes: 21 additions & 0 deletions .editorconfig
@@ -0,0 +1,21 @@
root = true

[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
indent_style = space
indent_size = 2

[*.java]
indent_size = 4

[*.go]
indent_style = tab

[Makefile]
indent_style = tab

[*.md]
trim_trailing_whitespace = false
15 changes: 8 additions & 7 deletions .gitattributes
@@ -33,20 +33,21 @@ CODEOWNERS text eol=lf
 mvnw text eol=lf
 mvnw.cmd text eol=crlf

-# --- Model artifacts (Git LFS) ---
-*.onnx filter=lfs diff=lfs merge=lfs -text
-*.gguf filter=lfs diff=lfs merge=lfs -text
-*.safetensors filter=lfs diff=lfs merge=lfs -text
-*.bin filter=lfs diff=lfs merge=lfs -text
-*.pt filter=lfs diff=lfs merge=lfs -text
-*.pth filter=lfs diff=lfs merge=lfs -text
+# --- Model artifacts ---
+# Model weights are NOT committed to source — fetched at build time by
+# scripts/fetch_models.py and embedded in the published Maven JAR. See
+# docs/ARCHITECTURE.md §3.1 (model JAR layout) and CONTRIBUTING.md
+# (Air-gapped contributors). No Git LFS.

 # --- Native binary libraries (not LFS — small and version-coupled to source) ---
 *.dll binary
 *.so binary
 *.dylib binary
 *.lib binary
 *.a binary
+*.bin binary
+*.onnx binary
+*.gguf binary

 # --- Images / fonts ---
 *.png binary
11 changes: 11 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,11 @@
# inference-sdk — code ownership
# https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners
#
# Default owner — every file requires aksOps review unless overridden below.
* @aksOps

# Forked native binding (Tier 0) — high-blast-radius, security-sensitive
native/kherud-fork/ @aksOps

# CI workflows + GitHub config
.github/ @aksOps
51 changes: 51 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,51 @@
---
name: Bug report
about: Report a defect in inference-sdk (Java)
title: "bug: <short summary>"
labels: ["bug", "triage"]
assignees: []
---

<!-- Before filing: search existing issues. If this is a security report,
STOP and use the private channel in SECURITY.md instead. -->

## Summary

<!-- One line. What broke? -->

## Reproducible example

<!-- Smallest possible snippet that triggers the bug. Include any
model artifact paths, env vars, or builder configuration. -->

```java
// Embedder / Generator setup, inputs, expected vs actual
```

## Expected behavior

<!-- What should have happened? -->

## Actual behavior

<!-- What actually happened? Include stack traces, exception types,
and log lines (slf4j logger name + level). -->

```
<stack trace / error output>
```

## Environment

- inference-sdk version:
- JDK version (`java -version`):
- OS / kernel / arch (`uname -a`):
- glibc version (`ldd --version`):
- Maven version (`./mvnw -version`):
- Container / VM (yes/no, image):
- Model artifact id and SHA-256:

## Additional context

<!-- Anything else: thread dump, JFR file, reproduction frequency,
workarounds tried. -->
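
The Environment section of the bug report template asks for the model artifact's SHA-256. A minimal sketch of how a reporter might compute it with only the JDK follows. The class name `ModelSha256` and its command-line shape are illustrative assumptions and are not part of inference-sdk.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for bug reporters; not an inference-sdk class.
public final class ModelSha256 {
    private ModelSha256() {}

    /** Streams the file through SHA-256 so large model files are never fully loaded into memory. */
    public static String sha256Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        // Render the 32 digest bytes as 64 lowercase hex characters.
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ModelSha256.java <path-to-model-file>");
            return;
        }
        System.out.println(sha256Hex(Paths.get(args[0])));
    }
}
```

On JDK 11 or newer this can be run as a single source file, for example `java ModelSha256.java path/to/model-file`, and the printed digest pasted into the "Model artifact id and SHA-256" field.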
13 changes: 13 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,13 @@
blank_issues_enabled: false
contact_links:
  - name: Security vulnerability
    url: https://github.com/RandomCodeSpace/inference-sdk/security/advisories/new
    about: >-
      Report security issues PRIVATELY via GitHub Security Advisories.
      Do NOT open a public issue. See SECURITY.md for the full policy.
  - name: Discussions / questions
    url: https://github.com/RandomCodeSpace/inference-sdk/discussions
    about: >-
      Open-ended questions, integration help, design discussion. For
      reproducible bugs and concrete feature requests, please use the
      issue templates instead.
42 changes: 42 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,42 @@
---
name: Feature request
about: Suggest a new capability for inference-sdk
title: "feat: <short summary>"
labels: ["enhancement", "triage"]
assignees: []
---

## Problem

<!-- What user need is unmet today? Concrete scenario, not abstract wish. -->

## Proposed solution

<!-- What would the API / behavior look like? Include code sketches. -->

```java
// Builder change, new method, new record, new exception, etc.
```

## Alternatives considered

<!-- What other approaches did you weigh? Why is your proposal better? -->

## Compatibility

- [ ] This change is fully backward compatible
- [ ] This change requires a major version bump (breaks public API)
- [ ] This change affects the wire format (`docs/WIRE_FORMAT.md`)
- [ ] This change affects the model registry (`docs/MODEL_REGISTRY.md`)

## Phase fit

- [ ] Phase 1 (library only)
- [ ] Phase 1.5 (additional native targets / models)
- [ ] Phase 2 (HTTP layer / OpenAI-compatible)
- [ ] Future / unscheduled

## Additional context

<!-- Links, prior art (Linear / Supabase / Raycast / Notion / Apple
analogs), benchmarks, related issues. -->
45 changes: 45 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,45 @@
<!--
Thanks for opening a PR. Please fill out every section. Boxes left
unchecked block merge; the reviewer will not unblock for you.
-->

## Summary

<!-- 1–3 lines: what changed and why. Link the design doc / spec line if applicable. -->

## Design / spec reference

<!-- Link to the relevant section of `java-sdk.md`, `docs/superpowers/specs/...`, or the
plan tier (`Tier N task T<n>.<m>`). If this PR diverges from the spec, explain. -->

## Type of change

- [ ] feat — new user-facing capability
- [ ] fix — bug fix
- [ ] refactor — internal restructuring, no behavior change
- [ ] chore — build / CI / tooling
- [ ] docs — documentation only
- [ ] test — test-only change

## Verification checklist

- [ ] `./mvnw -B verify` passes locally
- [ ] `./mvnw dependency-check:check` clean (no CVSS >= 7)
- [ ] `./mvnw spotless:check spotbugs:check` clean
- [ ] New code has unit tests; integration tests where boundaries are crossed
- [ ] JaCoCo line >= 75% / branch >= 70% on touched modules
- [ ] JavaDoc on every public type / method introduced
- [ ] No runtime network calls added (offline guarantee preserved)
- [ ] No new `*.md` agent-generated artifacts committed (planning, scratch, etc.)
- [ ] No force-pushes to shared branches; no direct commits to `main`

## Forward-compat (Phase 1 -> Phase 2)

- [ ] Reserved fields (`Message.toolCalls/toolCallId/name`,
`GenerateRequest.tools/toolChoice/responseFormat`) still throw
`FeatureNotSupportedException` when non-null
- [ ] Wire format unchanged or `docs/WIRE_FORMAT.md` updated in lockstep

## Notes for reviewer

<!-- Anything tricky, deferred, or worth a closer look. -->
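
The forward-compat item in the PR template requires reserved fields to throw `FeatureNotSupportedException` when non-null. The sketch below shows one way such a guard could look for the three `Message` fields named there. Everything except those field names and the exception name is an assumption: the real constructor shape, field types, package, and whether the exception is checked may all differ in inference-sdk.

```java
import java.util.List;
import java.util.Objects;

// Illustrative stand-ins only; the real Message and FeatureNotSupportedException
// in inference-sdk may be declared differently.
public final class ReservedFieldsSketch {
    private ReservedFieldsSketch() {}

    /** Stand-in for the SDK exception (assumed unchecked in this sketch). */
    public static final class FeatureNotSupportedException extends RuntimeException {
        public FeatureNotSupportedException(String field) {
            super(field + " is reserved for a later phase and must be null");
        }
    }

    /** Stand-in message that accepts the reserved fields but rejects any non-null value. */
    public static final class Message {
        private final String role;
        private final String content;

        public Message(String role, String content,
                       List<String> toolCalls, String toolCallId, String name) {
            // Reserved fields are part of the signature today so that the
            // shape does not change later, but they must stay null for now.
            rejectIfSet("Message.toolCalls", toolCalls);
            rejectIfSet("Message.toolCallId", toolCallId);
            rejectIfSet("Message.name", name);
            this.role = Objects.requireNonNull(role, "role");
            this.content = Objects.requireNonNull(content, "content");
        }

        public String role() {
            return role;
        }

        public String content() {
            return content;
        }
    }

    private static void rejectIfSet(String field, Object value) {
        if (value != null) {
            throw new FeatureNotSupportedException(field);
        }
    }
}
```

Under these assumptions, `new ReservedFieldsSketch.Message("user", "hi", null, null, null)` succeeds, while passing any non-null reserved argument fails fast at construction time rather than being silently ignored.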
74 changes: 74 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,74 @@
# Dependabot — keep CI actions, Maven deps, and Python script deps in sync.
# Spec §0 + §10.
#
# Ecosystems supported: github-actions, maven, pip.
# Standard Maven dependabot at `/java` will surface `de.kherud:llama` version
# bumps naturally — no separate watcher needed.
version: 2

updates:
  # ----- GitHub Actions -----
  - package-ecosystem: github-actions
    directory: "/"
    schedule:
      interval: daily
    open-pull-requests-limit: 5
    labels:
      - dependencies
      - github-actions
    commit-message:
      prefix: "chore(actions)"
      include: scope

  # ----- Maven (Java SDK) -----
  - package-ecosystem: maven
    directory: "/java"
    schedule:
      interval: weekly
      day: monday
    open-pull-requests-limit: 10
    labels:
      - dependencies
      - java
    commit-message:
      prefix: "chore(deps)"
      include: scope
    # Group safe semver bumps to reduce PR noise.
    groups:
      java-test-deps:
        patterns:
          - "org.junit*"
          - "org.assertj*"
          - "org.mockito*"
          - "org.awaitility*"
          - "nl.jqno.equalsverifier*"
          - "net.jqwik*"
          - "ch.qos.logback*"
        update-types:
          - patch
          - minor
      maven-plugins:
        patterns:
          - "org.apache.maven.plugins*"
          - "com.diffplug.spotless*"
          - "com.github.spotbugs*"
          - "org.jacoco*"
          - "org.codehaus.mojo*"
          - "org.owasp*"
        update-types:
          - patch
          - minor

  # ----- Python (scripts) -----
  - package-ecosystem: pip
    directory: "/scripts"
    schedule:
      interval: weekly
      day: monday
    open-pull-requests-limit: 5
    labels:
      - dependencies
      - python
    commit-message:
      prefix: "chore(scripts)"
      include: scope
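
As a rough mental model of the `groups` section in the Dependabot config, the sketch below checks which group a Maven dependency name would fall into. It assumes that `*` matches any run of characters, that dependency names take the form `groupId:artifactId`, and that the first matching group wins. It ignores `update-types` entirely. This is not Dependabot's implementation, and the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration of wildcard group matching; not Dependabot code.
public final class GroupPatternSketch {
    private GroupPatternSketch() {}

    // Group names and patterns copied from the Maven entry of dependabot.yml.
    private static final Map<String, List<String>> GROUPS = new LinkedHashMap<>();

    static {
        GROUPS.put("java-test-deps", Arrays.asList(
                "org.junit*", "org.assertj*", "org.mockito*", "org.awaitility*",
                "nl.jqno.equalsverifier*", "net.jqwik*", "ch.qos.logback*"));
        GROUPS.put("maven-plugins", Arrays.asList(
                "org.apache.maven.plugins*", "com.diffplug.spotless*",
                "com.github.spotbugs*", "org.jacoco*", "org.codehaus.mojo*", "org.owasp*"));
    }

    /** Treats '*' as "any run of characters" and every other character literally. */
    public static boolean matches(String pattern, String dependencyName) {
        String regex = Arrays.stream(pattern.split("\\*", -1))
                .map(Pattern::quote)
                .collect(Collectors.joining(".*"));
        return Pattern.matches(regex, dependencyName);
    }

    /** Returns the first group with a matching pattern, or empty if none matches. */
    public static Optional<String> groupFor(String dependencyName) {
        for (Map.Entry<String, List<String>> group : GROUPS.entrySet()) {
            for (String pattern : group.getValue()) {
                if (matches(pattern, dependencyName)) {
                    return Optional.of(group.getKey());
                }
            }
        }
        return Optional.empty();
    }
}
```

Under these assumptions, a name such as `org.junit.jupiter:junit-jupiter` lands in `java-test-deps`, while `de.kherud:llama` matches no group and would get its own pull request, which is consistent with the header comment saying its version bumps surface without a separate watcher.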