This changes the model for llama.cpp inclusion so we're not applying a patch,
but instead have the C++ code directly in the ollama tree, which should make it
easier to refine and update over time.
By default builds will now produce non-debug and non-verbose binaries.
To enable verbose logs in llama.cpp and debug symbols in the
native code, set `CGO_CFLAGS=-g`
The windows native setup still needs some more work, but this gets it building
again and if you set the PATH properly, you can run the resulting exe on a cuda system.