feat(huggingFace): add HuggingFaceModelResource for model browsing and media proxy #5124

Open
PG1204 wants to merge 1 commit into apache:main from ELin2025:hf/01-backend-skeleton

@PG1204
Contributor

@PG1204 PG1204 commented May 17, 2026

Summary

Related to issue #5041

First PR in a stacked series landing the HuggingFace operator end-to-end. This PR adds only the backend REST resource — no operator code yet. The resource is independently useful (the frontend can already integrate with it) and lets reviewers absorb the API surface before the operator class lands.

What changes were proposed in this PR?

Introduces HuggingFaceModelResource, a Jersey resource registered at /api/huggingface/*:

| Endpoint | Purpose |
|---|---|
| GET /api/huggingface/models | Browse / search models per task. Uses an in-process cache for browse mode and forwards to HF Hub for search. |
| GET /api/huggingface/tasks | Fetch HF pipeline tags filtered to tasks with hosted inference. Cached for the lifetime of the process. |
| POST /api/huggingface/upload-audio | Upload audio bytes for HF audio tasks; stores them in /tmp/texera-hf-audio/ and returns the file path. |
| GET /api/huggingface/audio-preview | Stream uploaded audio (path-validated to prevent traversal). |
| GET /api/huggingface/media-proxy | Proxy remote media URLs (HF inference responses) to bypass browser CORS. |

Also a one-line registration in TexeraWebApplication.scala.
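
For readers unfamiliar with the "path-validated" guard mentioned for audio-preview, the following is a minimal sketch of one common approach, not the code in this PR. The object name `AudioPathGuard` and the helper `resolveInside` are hypothetical. The idea is to resolve the requested name against the upload directory, normalize it, and accept it only if the result is still inside that directory.

```scala
import java.nio.file.Path
import scala.util.Try

// Hypothetical helper, not taken from the PR.
object AudioPathGuard {
  // Resolves `requested` against `baseDir` and returns the normalized path
  // only if it stays strictly inside `baseDir`. Absolute paths, "../"
  // sequences that escape the directory, and malformed input yield None.
  def resolveInside(baseDir: Path, requested: String): Option[Path] = {
    val base = baseDir.toAbsolutePath.normalize()
    Try(base.resolve(requested).normalize()).toOption
      .filter(candidate => candidate != base && candidate.startsWith(base))
  }
}
```

This check is purely lexical; it does not follow symbolic links, so a deployment that may contain symlinks inside the upload directory would likely need an additional `toRealPath` comparison.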

Stacked PR plan

This is PR 1 of ~9. Subsequent PRs will add:

  • PR 2: refactored HuggingFaceInferenceOpDesc skeleton + text-generation codegen
  • PRs 3–5: per-task-family codegen (image, audio + media-gen, QA/ranking)
  • PRs 6–8: frontend (task/model selector, property-editor visibility, result-panel media)
  • PR 9: developer docs

Test plan

  • sbt "amber/test" passes locally
  • Hit GET /api/huggingface/tasks and confirm JSON list of supported tasks
  • Hit GET /api/huggingface/models?task=text-generation and confirm paginated model list
  • POST /api/huggingface/upload-audio with a small WAV, then fetch via /api/huggingface/audio-preview and confirm the bytes match
  • GET /api/huggingface/media-proxy?url=… with a known HF inference response URL and confirm the response is streamed

Authored / co-authored using generative AI tooling?

Co-authored with Claude Opus 4.7

…d media proxy

Introduces a new Jersey REST resource exposing endpoints used by the
upcoming HuggingFace operator UI:

- GET  /api/huggingface/models       — browse / search models per task
- GET  /api/huggingface/tasks        — list HF pipeline tags with hosted inference
- POST /api/huggingface/upload-audio — upload audio for HF audio tasks
- GET  /api/huggingface/audio-preview — stream uploaded audio (path-validated)
- GET  /api/huggingface/media-proxy   — proxy remote media URLs to bypass CORS

This is the first PR in a stacked series landing the HF operator end-to-end.
No operator code yet; this resource is independently useful and lets the
frontend integrate with HF before the operator class lands.
@PG1204
Contributor Author

PG1204 commented May 17, 2026

/request-review @Ma77Ball

@codecov-commenter

codecov-commenter commented May 17, 2026

Codecov Report

❌ Patch coverage is 0% with 215 lines in your changes missing coverage. Please review.
✅ Project coverage is 42.95%. Comparing base (bfa79a7) to head (78633de).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...texera/web/resource/HuggingFaceModelResource.scala | 0.00% | 214 Missing ⚠️ |
| ...a/org/apache/texera/web/TexeraWebApplication.scala | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5124      +/-   ##
============================================
- Coverage     43.17%   42.95%   -0.23%     
- Complexity     2209     2212       +3     
============================================
  Files          1045     1046       +1     
  Lines         40254    40469     +215     
  Branches       4250     4288      +38     
============================================
+ Hits          17380    17383       +3     
- Misses        21804    22016     +212     
  Partials       1070     1070              

| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| access-control-service | 39.53% <ø> (ø) | Carriedforward from bfa79a7 |
| agent-service | 33.72% <ø> (ø) | Carriedforward from bfa79a7 |
| amber | 43.28% <0.00%> (-0.58%) ⬇️ | |
| computing-unit-managing-service | 0.00% <ø> (ø) | Carriedforward from bfa79a7 |
| config-service | 0.00% <ø> (ø) | Carriedforward from bfa79a7 |
| file-service | 32.18% <ø> (ø) | Carriedforward from bfa79a7 |
| frontend | 34.05% <ø> (ø) | Carriedforward from bfa79a7 |
| python | 90.43% <ø> (ø) | Carriedforward from bfa79a7 |
| workflow-compiling-service | 56.81% <ø> (ø) | Carriedforward from bfa79a7 |

*This pull request uses carry forward flags.

@Yicong-Huang
Contributor

@PG1204 Thanks for opening this PR! Please do the following:

  1. Please follow our PR template and make the description concise.
  2. Please make sure your code meets the test coverage requirement.
  3. Please use issues to describe future plans such as stacked PRs, because each PR becomes immutable after it is merged. Issues can hold information that outlives a PR's life cycle and can be updated. If you plan to open multiple PRs, I suggest using an umbrella issue that contains multiple sub-issues, one per PR.
  4. You can use /request-review @xxx to request a review from a specific reviewer.

@PG1204
Contributor Author

PG1204 commented May 18, 2026

@Yicong-Huang

Thank you for the suggestions. Will update the PR accordingly.

@Ma77Ball
Contributor

Hi @PG1204, while I begin my review, please address @Yicong-Huang's feedback. Specifically:

  1. Update the PR description to follow this template exactly:
   ### What changes were proposed in this PR?
   ...
   ### Any related issues, documentation, or discussions?
   ...
   ### How was this PR tested?
   ...
   ### Was this PR authored or co-authored using generative AI tooling?
   ...
  2. Add test coverage for as much of the new code as possible. At a minimum, please cover the main features and call paths introduced here.
  3. Relocate the overall PR plan to the parent issue, and keep this PR's description scoped to the code changes it actually contains.
  4. Document any architectural changes. If this PR modifies the architecture, please describe what changed and where, so reviewers can follow the design intent.

Thanks, and looking forward to the updates!

Contributor

@Ma77Ball Ma77Ball left a comment

Please review and resolve the comments and ask any questions as needed.

@QueryParam("search") search: String
): Response = {
try {
val hfToken = Option(System.getenv("HF_TOKEN")).getOrElse("")

How does the user add their token to the system? Is there a future PR to allow the user to specify a token in settings or the operator itself?

): java.util.List[java.util.Map[String, Object]] = {
val allResults = new java.util.ArrayList[java.util.Map[String, Object]]()
var nextUrl: String = null
var pageCount = 0

MAX_PAGES does not exist in this PR. What is pageCount used for, and is it needed in this PR?

Comment on lines +379 to +382
if (hfResponse.getStatus != 200) {
// Stop paginating on error, return what we have so far
return allResults
}

Possibly add a message or log entry indicating that an error occurred, rather than caching and returning an incomplete list.
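
One way to act on this, sketched under assumptions rather than taken from the PR: have the pagination helper report whether it finished, so the caller can log the failure and skip caching a partial list. The names `Pagination`, `Page`, `Paged`, and `fetchAll` are hypothetical, and the `fetch` function stands in for the real HTTP call (returning `Left(status)` on a non-200 response). The `maxPages` parameter also gives a page counter a concrete purpose.

```scala
import scala.annotation.tailrec

// Hypothetical sketch; names are illustrative, not from the PR.
object Pagination {
  // One page of results plus the URL of the next page, if any.
  final case class Page[A](items: Vector[A], next: Option[String])
  // Accumulated results; `complete` is false when we stopped early.
  final case class Paged[A](items: Vector[A], complete: Boolean)

  // Follows `next` links starting at `firstUrl`. Stops early, with
  // complete = false, on the first failed fetch or after `maxPages` pages.
  def fetchAll[A](firstUrl: String, maxPages: Int)(
      fetch: String => Either[Int, Page[A]]
  ): Paged[A] = {
    @tailrec
    def loop(url: Option[String], fetched: Int, acc: Vector[A]): Paged[A] =
      url match {
        case None                          => Paged(acc, complete = true)
        case Some(_) if fetched >= maxPages => Paged(acc, complete = false)
        case Some(u) =>
          fetch(u) match {
            case Left(_)     => Paged(acc, complete = false)
            case Right(page) => loop(page.next, fetched + 1, acc ++ page.items)
          }
      }
    loop(Some(firstUrl), 0, Vector.empty)
  }
}
```

A caller could then write the result to the cache only when `complete` is true, and otherwise log the early stop and return the partial list with an indication that it is truncated.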

@Consumes(Array(MediaType.WILDCARD))
def uploadAudioReference(
@QueryParam("filename") filename: String,
bytes: Array[Byte]

There should be a size limit to prevent users from posting overly large audio files that will use up all the RAM. Please either implement a hard limit or improve how we handle large audio files so we don't store them in memory before writing to disk.
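
A minimal sketch of the hard-limit option, assuming the resource method is changed to accept a `java.io.InputStream` entity instead of `Array[Byte]` (JAX-RS supports both). The names `BoundedCopy`, `TooLargeException`, and `copyWithLimit` are hypothetical. The body is copied in small chunks and the copy aborts as soon as the running total exceeds the limit, so the full upload never sits in memory.

```scala
import java.io.{InputStream, OutputStream}

// Hypothetical sketch; names and the 8 KiB buffer size are illustrative.
object BoundedCopy {
  final class TooLargeException(limit: Long)
      extends RuntimeException(s"Upload exceeds $limit bytes")

  // Copies `in` to `out` in fixed-size chunks and returns the byte count.
  // Throws TooLargeException once more than `maxBytes` have been read;
  // bytes beyond the limit are never written to `out`.
  def copyWithLimit(in: InputStream, out: OutputStream, maxBytes: Long): Long = {
    val buffer = new Array[Byte](8192)
    var total = 0L
    var read = in.read(buffer)
    while (read != -1) {
      total += read
      if (total > maxBytes) throw new TooLargeException(maxBytes)
      out.write(buffer, 0, read)
      read = in.read(buffer)
    }
    total
  }
}
```

In a resource method, the caller would presumably catch `TooLargeException`, delete the partially written file, and respond with HTTP 413.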

if (idx >= 0 && idx < safeFileName.length - 1) safeFileName.substring(idx) else ".bin"
}

val tempDir = Paths.get(System.getProperty("java.io.tmpdir"), "texera-hf-audio")

Please add a way to clean up this folder when no longer needed.
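
One possible shape for such a cleanup, offered as a sketch and not as the PR's design: an age-based sweep that deletes files in the upload directory once they are older than a configurable maximum age. The names `TempDirSweeper` and `sweep` are hypothetical, and the retention period is an assumption to be chosen by the maintainers.

```scala
import java.nio.file.{Files, Path}
import java.time.{Duration, Instant}
import scala.jdk.CollectionConverters._
import scala.util.Try

// Hypothetical sketch; names are illustrative, not from the PR.
object TempDirSweeper {
  // Deletes regular files directly under `dir` whose last-modified time is
  // older than `maxAge` relative to `now`. Returns the number of files
  // deleted. Files that cannot be inspected or deleted are skipped.
  def sweep(dir: Path, maxAge: Duration, now: Instant = Instant.now()): Int =
    if (!Files.isDirectory(dir)) 0
    else {
      val cutoff = now.minus(maxAge)
      val stream = Files.newDirectoryStream(dir)
      try {
        stream.iterator().asScala.count { p =>
          Try {
            Files.isRegularFile(p) &&
            Files.getLastModifiedTime(p).toInstant.isBefore(cutoff) &&
            Files.deleteIfExists(p)
          }.getOrElse(false)
        }
      } finally stream.close()
    }
}
```

Such a sweep could be run periodically (for example from a scheduled executor) or opportunistically at the start of each upload request.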


import com.fasterxml.jackson.core.`type`.TypeReference
import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper}
import kong.unirest.Unirest

This is used throughout the file and should include some configuration settings, as the default settings might not be optimal.

Comment on lines +65 to +73
val cached = modelCache.get(task)
if (cached != null) {
return Response.ok(cached).build()
}

// Not cached — fetch all pages from HF Hub API
val allModels = fetchAllModelsForTask(task, hfToken)
val json = objectMapper.writeValueAsString(allModels)
modelCache.put(task, json)

The model cache needs cleanup logic, eviction policy, and a size limit. The current design only reads and puts the models in cache.
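
To make the request concrete, here is a minimal sketch of a cache with a size limit, least-recently-used eviction, and time-based expiry, built only on `java.util.LinkedHashMap`. It is an illustration under assumptions, not the PR's implementation: the names `BoundedCache`, `LruTtlCache`, and `entryCount` are hypothetical, and a project that already depends on a caching library might reasonably prefer that instead.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

// Hypothetical sketch; names are illustrative, not from the PR.
object BoundedCache {
  // Holds at most `maxEntries` entries (least recently used is evicted
  // first) and treats entries older than `ttlMillis` as missing.
  // `clock` is injectable so expiry can be tested deterministically.
  final class LruTtlCache[K, V](
      maxEntries: Int,
      ttlMillis: Long,
      clock: () => Long = () => System.currentTimeMillis()
  ) {
    private case class Stamped(value: V, storedAt: Long)

    // accessOrder = true makes iteration order follow recency of access.
    private val map = new JLinkedHashMap[K, Stamped](16, 0.75f, true) {
      override def removeEldestEntry(eldest: JMap.Entry[K, Stamped]): Boolean =
        this.size() > maxEntries
    }

    def get(key: K): Option[V] = synchronized {
      Option(map.get(key)).flatMap { e =>
        if (clock() - e.storedAt > ttlMillis) { map.remove(key); None }
        else Some(e.value)
      }
    }

    def put(key: K, value: V): Unit = synchronized {
      map.put(key, Stamped(value, clock()))
      ()
    }

    def entryCount: Int = synchronized(map.size())
  }
}
```

All access is synchronized because an access-ordered `LinkedHashMap` mutates its internal order even on `get`.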

.entity("""{"error":"Media URL is required."}""")
.build()
}
if (!trimmedUrl.startsWith("http://") && !trimmedUrl.startsWith("https://")) {

The endpoint will fetch any URL it is given, allowing an attacker to reach internal services. An allowlist should be implemented to avoid this issue.
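
A minimal sketch of a host allowlist check, under stated assumptions. The names `MediaUrlGuard` and `isAllowed` are hypothetical, and the default allowlist containing only `huggingface.co` is a placeholder; the real set of media hosts would need to be confirmed by the author. Unlike the scheme check quoted above, this sketch also rejects plain `http://` URLs, which is a design choice and not something the PR states.

```scala
import java.net.URI
import java.util.Locale
import scala.util.Try

// Hypothetical sketch; names and the default allowlist are assumptions.
object MediaUrlGuard {
  // Placeholder allowlist of host suffixes, not taken from the PR.
  val DefaultAllowedHosts: Set[String] = Set("huggingface.co")

  // Accepts only https URLs without user-info whose host equals an
  // allowed entry or is a subdomain of one. Anything unparsable is rejected.
  def isAllowed(rawUrl: String, allowed: Set[String] = DefaultAllowedHosts): Boolean =
    Try(new URI(rawUrl.trim)).toOption.exists { uri =>
      val scheme = Option(uri.getScheme).map(_.toLowerCase(Locale.ROOT))
      val host = Option(uri.getHost).map(_.toLowerCase(Locale.ROOT))
      scheme.contains("https") &&
      uri.getUserInfo == null &&
      host.exists(h => allowed.exists(s => h == s || h.endsWith("." + s)))
    }
}
```

A string check like this is only part of the defense: if the HTTP client follows redirects, each redirect target would need the same validation, or redirect following would need to be disabled.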

.status(Response.Status.INTERNAL_SERVER_ERROR)
.entity(s"""{"error":"Failed to fetch models: ${e.getMessage}"}""")
.build()
}

The current design returns the exact exception message, which exposes internal details to users. Suggestions:

  1. Have Jackson handle escaping instead of concatenating the strings
import scala.jdk.CollectionConverters._

private def errorJson(message: String): String = {
  objectMapper.writeValueAsString(Map("error" -> message).asJava)
}
  2. Don't expose e.getMessage to users. Return a generic message instead (do this for all the try/catch blocks).
} catch {
  case e: Exception =>
    logger.error("Model fetch failed", e)
    Response
      .status(Response.Status.INTERNAL_SERVER_ERROR)
      .entity(errorJson("Failed to fetch models."))
      .build()
}

.queryString("pipeline_tag", task)
.queryString("sort", "downloads")
.queryString("direction", "-1")
.queryString("limit", "100")

There should be a pagination loop or a way to let users know that they are viewing a truncated list.
