amd_common.go
Request and model concurrency
2024-04-22 19:29:12 -07:00
amd_hip_windows.go
Request and model concurrency
2024-04-22 19:29:12 -07:00
amd_linux.go
AMD gfx patch rev is hex
2024-04-24 09:43:52 -07:00
amd_windows.go
AMD gfx patch rev is hex
2024-04-24 09:43:52 -07:00
assets.go
Centralize server config handling
2024-05-05 16:49:50 -07:00
cpu_common.go
Mechanical switch from log to slog
2024-01-18 14:12:57 -08:00
cuda_common.go
Request and model concurrency
2024-04-22 19:29:12 -07:00
gpu.go
Centralize server config handling
2024-05-05 16:49:50 -07:00
gpu_darwin.go
gpu: add 512MiB to darwin minimum, metal doesn't have partial offloading overhead (#4068)
2024-05-01 11:46:03 -04:00
gpu_info.h
Request and model concurrency
2024-04-22 19:29:12 -07:00
gpu_info_cpu.c
Request and model concurrency
2024-04-22 19:29:12 -07:00
gpu_info_cudart.c
Request and model concurrency
2024-04-22 19:29:12 -07:00
gpu_info_cudart.h
Request and model concurrency
2024-04-22 19:29:12 -07:00
gpu_info_darwin.h
darwin: no partial offloading if required memory greater than system
2024-04-16 11:22:38 -07:00
gpu_info_darwin.m
darwin: no partial offloading if required memory greater than system
2024-04-16 11:22:38 -07:00
gpu_test.go
Request and model concurrency
2024-04-22 19:29:12 -07:00
types.go
Request and model concurrency
2024-04-22 19:29:12 -07:00