fedora 44 +cpu/gpu with flux 2 error report #2279

Description

@alexdsh

Describe the Issue

I tried running the flux2klein model you posted. It works, but with problems. I'm on a laptop with an Intel GPU, so I have to use Vulkan. The machine has 16 GB of RAM, and Windows 10 makes 8 GB available as shared video memory (UMA).

At 1024x1024, generation fails after 15 steps with the error "connection to browser lost," even though the browser is still open. I tried twice and could not complete a generation either time. Even at 512x512 in img2img mode I still run out of memory, although offload is enabled.

Is it possible to split the pipeline so that the qwen4b language model runs on the CPU and flux2 runs on the GPU? I read about the --autoswap switch. Will it help me avoid running out of video memory? In theory, the language model would not occupy video memory if it ran on the CPU. Would 8 GB be enough for flux2_4B at 1024x1024, or at least for img2img at 768x1024?

Also, from my experience with the latest version: running the full pipeline (llm+mmproj+sd) on the GPU always fails with a complaint about an incorrect mmproj. If I select a purely CPU backend, everything starts fine: it recognizes images, generates a prompt, and draws the image with SD. Because the GPU launch fails every time, I'm asking for an option in the launcher interface to launch the models separately.

Another bug has surfaced: if I save a session to a slot and then restore it, the model loses its vision function. When I add an image, the console reports that the slice is embedded, so mmproj appears to process the image, but the language model does not see it. It would be nice to have this fixed.

A little more about Flux2 speed. On the GPU I get over 150 seconds per step at 1024x1024. On the CPU it is over 230 seconds per step. So the GPU (Intel Iris Xe, 80 EUs) is not much faster than the CPU (Core i5-1240P). Is this a fundamental limitation, or are optimizations possible?
Additional Information:
Core i5-1240P / 16 GB RAM / 512 GB NVMe, latest app version
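
To put the per-step timings reported above in perspective, here is a rough back-of-the-envelope sketch. It assumes a 15-step run (the step count from the failed 1024x1024 attempt) and treats "over 150" and "over 230" seconds as lower bounds. The function name is illustrative only, and model load and image decode time are ignored.

```python
def total_minutes(seconds_per_step: float, steps: int) -> float:
    """Sampling time for one image in minutes (load/decode time not included)."""
    return seconds_per_step * steps / 60


# Reported lower bounds at 1024x1024; the real values are somewhat higher.
GPU_S_PER_STEP = 150  # Intel Iris Xe via Vulkan
CPU_S_PER_STEP = 230  # Core i5-1240P
STEPS = 15            # assumption: step count from the failed run

gpu_min = total_minutes(GPU_S_PER_STEP, STEPS)
cpu_min = total_minutes(CPU_S_PER_STEP, STEPS)
speedup = CPU_S_PER_STEP / GPU_S_PER_STEP

print(f"GPU: at least {gpu_min:.1f} min per image")
print(f"CPU: at least {cpu_min:.1f} min per image")
print(f"GPU speedup over CPU: about {speedup:.2f}x")
```

With these numbers a single image takes more than half an hour on the GPU, and the GPU is only about 1.5 times faster than the CPU.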
