feat(huggingFace): add HuggingFaceModelResource for model browsing and media proxy#5124
feat(huggingFace): add HuggingFaceModelResource for model browsing and media proxy#5124PG1204 wants to merge 1 commit into
Conversation
…d media proxy Introduces a new Jersey REST resource exposing endpoints used by the upcoming HuggingFace operator UI: - GET /api/huggingface/models — browse / search models per task - GET /api/huggingface/tasks — list HF pipeline tags with hosted inference - POST /api/huggingface/upload-audio — upload audio for HF audio tasks - GET /api/huggingface/audio-preview — stream uploaded audio (path-validated) - GET /api/huggingface/media-proxy — proxy remote media URLs to bypass CORS This is the first PR in a stacked series landing the HF operator end-to-end. No operator code yet; this resource is independently useful and lets the frontend integrate with HF before the operator class lands.
|
/request-review @Ma77Ball |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5124 +/- ##
============================================
- Coverage 43.17% 42.95% -0.23%
- Complexity 2209 2212 +3
============================================
Files 1045 1046 +1
Lines 40254 40469 +215
Branches 4250 4288 +38
============================================
+ Hits 17380 17383 +3
- Misses 21804 22016 +212
Partials 1070 1070
*This pull request uses carry forward flags. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@PG1204 Thanks for opening this PR! Please do the following:
|
|
Thank you for the suggestions. Will update the PR accordingly. |
|
Hi @PG1204, while I begin my review, please address @Yicong-Huang's feedback. Specifically:
Thanks, and looking forward to the updates! |
Ma77Ball
left a comment
There was a problem hiding this comment.
Please review and resolve the comments and ask any questions as needed.
| @QueryParam("search") search: String | ||
| ): Response = { | ||
| try { | ||
| val hfToken = Option(System.getenv("HF_TOKEN")).getOrElse("") |
There was a problem hiding this comment.
How does the user add their token to the system? Is there a future PR to allow the user to specify a token in settings or the operator itself?
| ): java.util.List[java.util.Map[String, Object]] = { | ||
| val allResults = new java.util.ArrayList[java.util.Map[String, Object]]() | ||
| var nextUrl: String = null | ||
| var pageCount = 0 |
There was a problem hiding this comment.
MAX_PAGES does not exist in this PR. So what is pageCount used for, and is it needed in this pr?
| if (hfResponse.getStatus != 200) { | ||
| // Stop paginating on error, return what we have so far | ||
| return allResults | ||
| } |
There was a problem hiding this comment.
Possibly add a message or log entry indicating that an error occurred, rather than caching and returning an incomplete list.
| @Consumes(Array(MediaType.WILDCARD)) | ||
| def uploadAudioReference( | ||
| @QueryParam("filename") filename: String, | ||
| bytes: Array[Byte] |
There was a problem hiding this comment.
There should be a size limit to prevent users from posting overly large audio files that will use up all the RAM. Please either implement a hard limit or improve how we handle large audio files so we don't store them in memory before writing to disk.
| if (idx >= 0 && idx < safeFileName.length - 1) safeFileName.substring(idx) else ".bin" | ||
| } | ||
|
|
||
| val tempDir = Paths.get(System.getProperty("java.io.tmpdir"), "texera-hf-audio") |
There was a problem hiding this comment.
Please add a way to clean up this folder when no longer needed.
|
|
||
| import com.fasterxml.jackson.core.`type`.TypeReference | ||
| import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper} | ||
| import kong.unirest.Unirest |
There was a problem hiding this comment.
This is used throughout the file and should include some configuration settings, as the default settings might not be optimal.
| val cached = modelCache.get(task) | ||
| if (cached != null) { | ||
| return Response.ok(cached).build() | ||
| } | ||
|
|
||
| // Not cached — fetch all pages from HF Hub API | ||
| val allModels = fetchAllModelsForTask(task, hfToken) | ||
| val json = objectMapper.writeValueAsString(allModels) | ||
| modelCache.put(task, json) |
There was a problem hiding this comment.
The model cache needs cleanup logic, eviction policy, and a size limit. The current design only reads and puts the models in cache.
| .entity("""{"error":"Media URL is required."}""") | ||
| .build() | ||
| } | ||
| if (!trimmedUrl.startsWith("http://") && !trimmedUrl.startsWith("https://")) { |
There was a problem hiding this comment.
The endpoint will fetch any URL it is given, allowing an attacker to reach internal services. An allowlist should be implemented to avoid this issue.
| .status(Response.Status.INTERNAL_SERVER_ERROR) | ||
| .entity(s"""{"error":"Failed to fetch models: ${e.getMessage}"}""") | ||
| .build() | ||
| } |
There was a problem hiding this comment.
The current design returns the exact exception message, which exposes internal details to users. Suggestions:
- Have Jackson handle escaping instead of concatenating the strings
import scala.jdk.CollectionConverters._
private def errorJson(message: String): String = {
objectMapper.writeValueAsString(Map("error" -> message).asJava)
}
- Don't expose the e.getMessage to users. return a generic message (do this for all the try catch statements)
} catch {
case e: Exception =>
logger.error("Model fetch failed", e)
Response
.status(Response.Status.INTERNAL_SERVER_ERROR)
.entity(errorJson("Failed to fetch models."))
.build()
}
| .queryString("pipeline_tag", task) | ||
| .queryString("sort", "downloads") | ||
| .queryString("direction", "-1") | ||
| .queryString("limit", "100") |
There was a problem hiding this comment.
There should be a pagination loop or a way to let users know that they are viewing a truncated list.
Summary
Related to issue #5041
First PR in a stacked series landing the HuggingFace operator end-to-end. This PR adds only the backend REST resource — no operator code yet. The resource is independently useful (the frontend can already integrate with it) and lets reviewers absorb the API surface before the operator class lands.
What's changes were proposed in this PR?
Introduces
HuggingFaceModelResource, a Jersey resource registered at/api/huggingface/*:GET /api/huggingface/modelsGET /api/huggingface/tasksPOST /api/huggingface/upload-audio/tmp/texera-hf-audio/and returns the file path.GET /api/huggingface/audio-previewGET /api/huggingface/media-proxyAlso a one-line registration in
TexeraWebApplication.scala.Stacked PR plan
This is PR 1 of ~9. Subsequent PRs will add:
HuggingFaceInferenceOpDescskeleton + text-generation codegenTest plan
sbt "amber/test"passes locallyGET /api/huggingface/tasksand confirm JSON list of supported tasksGET /api/huggingface/models?task=text-generationand confirm paginated model listPOST /api/huggingface/upload-audiowith a small WAV, then fetch via/api/huggingface/audio-previewand confirm the bytes matchGET /api/huggingface/media-proxy?url=…with a known HF inference response URL and confirm the response is streamedAuthored / co-authored using generative AI tooling?
Co-authored with Claude Opus 4.7