The native Windows setup still needs more work, but this gets it building again, and if you set PATH properly you can run the resulting exe on a CUDA system.
Run server.cpp directly inside the Go runtime via cgo while retaining the LLM Go abstractions.