* Unified arm/x86 windows installer
This adjusts the installer payloads to be architecture aware so we can carry
both amd64 and arm64 binaries in the installer, and install only the applicable
architecture at install time.
* Include arm64 in official windows build
* Harden schedule test for slow windows timers
This test seems to be a bit flaky on windows, so give it more time to converge.
The rocm CI step for RCs was incorrectly tagging them as the latest rocm build.
The multiarch manifest was incorrectly tagged twice (with and without the
prefix "v"). Static windows artifacts weren't being carried between build
jobs. This also fixes the latest tagging script.
* Optimize container images for startup
This change adjusts how runner payloads are handled to support
container builds where we keep them extracted in the filesystem.
This makes it easier to optimize the cpu/cuda vs cpu/rocm images for
size, and should result in faster startup times for container images.
* Refactor payload logic and add buildx support for faster builds
* Move payloads around
* Review comments
* Converge to buildx based helper scripts
* Use docker buildx action for release
If there are any pending responses (such as from potential stop
tokens) then we should send them back before ending the sequence.
Otherwise, we can be missing tokens at the end of a response.
Fixes #6707
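
A minimal sketch of the flush-before-finish idea described above. The `sequence` type and its fields are hypothetical stand-ins for illustration and are not the runner's actual types: pieces held back while checking for a possible stop token are delivered before the sequence ends, so trailing tokens are not dropped.

```go
package main

import "strings"

// sequence is a hypothetical stand-in for a generation sequence.
type sequence struct {
	pending []string // pieces held back while a stop token might still match
	out     []string // pieces already delivered to the client
}

// flushPending delivers any held-back pieces and reports whether it sent anything.
func (s *sequence) flushPending() bool {
	if len(s.pending) == 0 {
		return false
	}
	s.out = append(s.out, strings.Join(s.pending, ""))
	s.pending = s.pending[:0]
	return true
}

// finish ends the sequence. It flushes first so that tokens still sitting
// in the pending buffer reach the client.
func (s *sequence) finish() string {
	s.flushPending()
	return strings.Join(s.out, "")
}
```

Without the call to `flushPending` inside `finish`, anything still buffered at the end of generation would be missing from the response.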
This adds back a check, lost many releases back, that verifies /dev/kfd permissions.
When those permissions are lacking, the result is a confusing failure mode such as:
"rocBLAS error: Could not initialize Tensile host: No devices found"
This implementation does not hard fail the serve command but instead will fall back to CPU
with an error log. In the future we can include this in the GPU discovery UX to show
detected but unsupported devices we discovered.
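
One way such a check could look, as a sketch only. The function names are hypothetical, and opening the device read-write is an assumption about how access is probed, not necessarily what this change does. A permission failure is logged as an error and the caller falls back to CPU instead of aborting.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"log/slog"
	"os"
)

// checkDeviceAccess tries to open path read-write and returns a descriptive
// error when the open fails. (Hypothetical helper for illustration.)
func checkDeviceAccess(path string) error {
	f, err := os.OpenFile(path, os.O_RDWR, 0)
	if err != nil {
		if errors.Is(err, fs.ErrPermission) {
			return fmt.Errorf("insufficient permissions on %s: %w", path, err)
		}
		return err
	}
	return f.Close()
}

// useGPU reports whether the device at path is usable. It does not hard
// fail: on any error it logs and returns false so the caller can use CPU.
func useGPU(path string) bool {
	if err := checkDeviceAccess(path); err != nil {
		slog.Error("GPU device not usable, falling back to CPU", "path", path, "error", err)
		return false
	}
	return true
}
```

A caller would invoke `useGPU("/dev/kfd")` during discovery and skip the GPU backend when it returns false.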
Provide a mechanism for users to set aside an amount of VRAM on each GPU
to make room for other applications they want to start after Ollama, or to work around
memory prediction bugs.
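
The effect of such a reserve can be sketched as below. The `gpuInfo` type and `usableMemory` function are hypothetical names for illustration, and how the reserve value is actually configured is not shown here. The reserve is subtracted from each GPU's free memory before any scheduling decision, clamping at zero.

```go
package main

// gpuInfo is a hypothetical, simplified view of one detected GPU.
type gpuInfo struct {
	ID         string
	FreeMemory uint64 // bytes currently free on the device
}

// usableMemory returns a copy of gpus with reserve bytes set aside on each
// device. A GPU with less free memory than the reserve ends up with zero.
func usableMemory(gpus []gpuInfo, reserve uint64) []gpuInfo {
	out := make([]gpuInfo, len(gpus))
	for i, g := range gpus {
		if g.FreeMemory > reserve {
			g.FreeMemory -= reserve
		} else {
			g.FreeMemory = 0
		}
		out[i] = g
	}
	return out
}
```

Because the input slice is copied, the reported free memory stays untouched and only the scheduling view is reduced.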
* Update gpu.md
Seems strange that the laptop versions of 3050 and 3050 Ti would be supported but not the non-notebook, but this is what the page (https://developer.nvidia.com/cuda-gpus) says.
Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>
* Update gpu.md
Remove notebook reference
---------
Signed-off-by: bean5 <2052646+bean5@users.noreply.github.com>