Optimization suggestions #60

@sl33pyC01E

Have you considered the Tiny Auto Encoder for Hunyuan, Wan variant? It's a drop-in replacement for the VAE you use, and it needs only a fraction of the memory and latency. I swapped it in myself and it worked well.

and

Did you know there are GGUF quantization options for Wan-based models? I'm testing compatibility now, but I see no reason why a Q4 quant wouldn't run.
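
For a rough sense of what a Q4 quant would save on weights alone, here is a back-of-the-envelope sketch. The parameter count is a placeholder, not this project's actual model size, and the helper name `weight_gib` is made up for illustration. The bits-per-weight figures assume the plain GGUF `Q4_0` and `Q8_0` block layouts (32 weights per block plus one fp16 scale); K-quants such as Q4_K_M come out somewhat different, and activations, the VAE, and the text encoder are not counted.

```python
def weight_gib(n_params: int, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB (weights only, no activations or metadata)."""
    return n_params * bits_per_weight / 8 / 2**30

# Q4_0 stores 32 weights in 18 bytes: 16 bytes of 4-bit values + one fp16 scale.
Q4_0_BITS = 18 * 8 / 32  # 4.5 bits per weight
# Q8_0 stores 32 weights in 34 bytes: 32 int8 values + one fp16 scale.
Q8_0_BITS = 34 * 8 / 32  # 8.5 bits per weight

# Hypothetical parameter count; substitute the real one for the model in question.
N_PARAMS = 14_000_000_000

for name, bits in [("fp16", 16.0), ("Q8_0", Q8_0_BITS), ("Q4_0", Q4_0_BITS)]:
    print(f"{name}: {weight_gib(N_PARAMS, bits):.1f} GiB")
```

Under those assumptions Q4_0 weights take a bit under 30% of the fp16 footprint, which is where the "4090 scale" guess comes from.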

Combining the two would likely bring real-time inference down to 4090 scale and trajectory rollout down to laptop scale.

Just thoughts, I appreciate the project regardless.
