ollama/examples/python-json-datagenerator
Matt Williams b6817a83d8 Add gif and finish readme
Signed-off-by: Matt Williams <m@technovangelist.com>
2023-11-10 16:41:48 -06:00

JSON Output Example


There are two Python scripts in this example. randomaddresses.py generates random addresses from different countries. predefinedschema.py sets a template for the model to fill in.

Review the Code

Both programs are nearly identical; each uses a different prompt to demonstrate a different idea. To get JSON out of a model, state in the prompt or system prompt that it should respond using JSON, and set format to json in the request body.

prompt = f"generate one realistically believable sample data set of a person's first name, last name, address in {country}, and phone number. Do not use common names. Respond using JSON. Key names should have no backslashes, values should use plain ascii with no special characters."

data = {
    "prompt": prompt,
    "model": model,
    "format": "json",
    "stream": False,
    "options": {"temperature": 2.5, "top_p": 0.99, "top_k": 100},
}

When running randomaddresses.py you will see that the schema changes and adapts to the chosen country.
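The excerpt above leaves country and model undefined. As a rough sketch of how they might be filled in, the helper below builds the request body for a given country. The helper name build_request, the country list, and the model name "llama2" are assumptions for illustration and are not taken from the script.

```python
import json
import random

# Assumed values for illustration; the actual script may use others.
COUNTRIES = ["Canada", "Japan", "Brazil", "Germany"]
MODEL = "llama2"


def build_request(country: str, model: str) -> dict:
    """Return a request body for /api/generate (hypothetical helper)."""
    prompt = (
        "generate one realistically believable sample data set of a person's "
        f"first name, last name, address in {country}, and phone number. "
        "Do not use common names. Respond using JSON."
    )
    return {
        "prompt": prompt,
        "model": model,
        "format": "json",   # ask the server for JSON output
        "stream": False,    # return the whole answer in one reply
        "options": {"temperature": 2.5, "top_p": 0.99, "top_k": 100},
    }


# Pick a country at random, as the example's description suggests.
body = build_request(random.choice(COUNTRIES), MODEL)
print(json.dumps(body, indent=2))
```

Because the prompt names the country but not the field layout, the model is free to choose keys that suit that country's address format.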

In predefinedschema.py, a template has been specified in the prompt as well. It's been defined as JSON and then dumped into the prompt string to make it easier to work with.
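A minimal sketch of that idea, assuming the template is held as a Python dict and serialized with json.dumps. The field names and the wording of the prompt here are illustrative and may differ from what predefinedschema.py actually uses.

```python
import json

# Hypothetical template; the keys are illustrative only.
template = {
    "firstName": "",
    "lastName": "",
    "address": {"street": "", "city": "", "state": "", "zipCode": ""},
    "phoneNumber": "",
}

# Serialize the template and embed it in the prompt text.
prompt = (
    "generate one realistically believable sample data set of a person's "
    "first name, last name, address in the US, and phone number. "
    f"Use the following template: {json.dumps(template)}. "
    "Respond using JSON."
)
print(prompt)
```

Keeping the template as a data structure rather than a hand-written string means it can be edited like any other dict and stays valid JSON once serialized.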

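Both examples turn streaming off so that the completed JSON arrives all at once. The API reply is itself JSON, and its response field holds the model's output as a string, so we parse response.text first and then parse the response field. Parsing that string into an object lets us serialize it again with an indent setting, which makes the output easier to read.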

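# Send the request; the reply body is a JSON envelope
response = requests.post("http://localhost:11434/api/generate", json=data, stream=False)
json_data = json.loads(response.text)

# The model's output is a JSON string inside the "response" field; parse it, then pretty-print it
print(json.dumps(json.loads(json_data["response"]), indent=2))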
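The two-step parse can be tried without a running server. The sketch below feeds a simulated reply through a small helper; the helper name extract_model_json and the sample values are made up for illustration, and a real reply carries more fields than shown.

```python
import json


def extract_model_json(response_text: str) -> dict:
    """Parse the API envelope, then the JSON string in its "response" field."""
    envelope = json.loads(response_text)
    return json.loads(envelope["response"])


# Simulated reply: the model's JSON is nested as a string inside the envelope.
sample = json.dumps(
    {
        "model": "llama2",
        "response": json.dumps({"firstName": "Ingrid"}),
        "done": True,
    }
)

pretty = json.dumps(extract_model_json(sample), indent=2)
print(pretty)
```

If the model returns text that is not valid JSON, the second json.loads raises json.JSONDecodeError, so a real script may want to catch that and retry.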