JSON Output Example

There are two Python scripts in this example. randomaddresses.py generates random addresses from different countries. predefinedschema.py supplies a template for the model to fill in.

Running the Example

  1. Ensure you have the llama2 model installed:

    ollama pull llama2
    
  2. Install the Python requirements:

    pip install -r requirements.txt
    
  3. Run the Random Addresses example:

    python randomaddresses.py
    
  4. Run the Predefined Schema example:

    python predefinedschema.py
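
Before running either script, it can help to confirm that the model from step 1 is actually available. The following is a minimal sketch, not part of the example scripts. It assumes the Ollama server is listening on the default http://localhost:11434 and that its /api/tags endpoint returns a JSON object with a models list whose entries carry a name such as llama2:latest. The helper names has_model and fetch_tags are hypothetical, and the standard library is used here instead of requests only to keep the sketch dependency-free.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default address


def has_model(tags: dict, wanted: str) -> bool:
    """Return True if `wanted` appears in a /api/tags style payload.

    Names are also compared without their tag suffix, so "llama2"
    matches "llama2:latest".
    """
    for entry in tags.get("models", []):
        name = entry.get("name", "")
        if name == wanted or name.split(":", 1)[0] == wanted:
            return True
    return False


def fetch_tags(base_url: str = OLLAMA_URL) -> dict:
    """Fetch the list of locally installed models (needs a running server)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return json.load(resp)
```

With the server running, has_model(fetch_tags(), "llama2") should return True once the pull in step 1 has finished. If it returns False, repeat step 1.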
    

Review the Code

Both programs are nearly identical; each uses a different prompt to demonstrate a different idea. Getting JSON out of a model takes two steps: state in the prompt or system prompt that the model should respond using JSON, and set format to json in the request body.

prompt = f"generate one realistically believable sample data set of a person's first name, last name, address in {country}, and phone number. Do not use common names. Respond using JSON. Key names should have no backslashes, values should use plain ascii with no special characters."

data = {
    "prompt": prompt,
    "model": model,
    "format": "json",
    "stream": False,
    "options": {"temperature": 2.5, "top_p": 0.99, "top_k": 100},
}
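
To make the two requirements concrete, here is a minimal sketch that wraps the request body in a function so it can be built for any country. The function name build_payload and the shortened prompt are illustrative assumptions; the option values are copied from the snippet above.

```python
import json


def build_payload(country: str, model: str = "llama2") -> dict:
    """Build a request body for /api/generate that asks for JSON output."""
    prompt = (
        f"generate one realistically believable sample data set of a person's "
        f"first name, last name, address in {country}, and phone number. "
        "Respond using JSON."
    )
    return {
        "prompt": prompt,   # requirement 1: the prompt asks for JSON
        "model": model,
        "format": "json",   # requirement 2: the request body asks for JSON
        "stream": False,    # return one complete response
        "options": {"temperature": 2.5, "top_p": 0.99, "top_k": 100},
    }


if __name__ == "__main__":
    # Print the body that would be sent; no server is contacted here.
    print(json.dumps(build_payload("Canada"), indent=2))
```

Keeping the payload in one function makes it easy to see that both the prompt text and the format field ask for JSON.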

When running randomaddresses.py you will see that the schema changes and adapts to the chosen country.

In predefinedschema.py, a template has been specified in the prompt as well. It's been defined as JSON and then dumped into the prompt string to make it easier to work with.
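
As an illustration of the template idea, the sketch below defines a template, serializes it into the prompt with json.dumps, and checks a reply against it. The field names in TEMPLATE and the helpers build_template_prompt and missing_keys are hypothetical and may differ from what predefinedschema.py actually uses.

```python
import json

# Hypothetical template: empty strings mark the fields the model should fill in.
TEMPLATE = {
    "firstName": "",
    "lastName": "",
    "address": {"street": "", "city": "", "state": "", "zipCode": ""},
    "phoneNumber": "",
}


def build_template_prompt(template: dict) -> str:
    """Embed the template in the prompt as a compact JSON string."""
    return (
        "generate one realistically believable sample data set of a person's "
        "first name, last name, address in the US, and phone number. "
        f"Use the following template: {json.dumps(template)}. "
        "Respond using JSON."
    )


def missing_keys(template: dict, candidate: dict, prefix: str = "") -> list:
    """List dotted key paths present in the template but absent in candidate."""
    missing = []
    for key, value in template.items():
        path = f"{prefix}{key}"
        if not isinstance(candidate, dict) or key not in candidate:
            missing.append(path)
        elif isinstance(value, dict):
            # Descend into nested objects such as "address".
            missing.extend(missing_keys(value, candidate[key], prefix=f"{path}."))
    return missing
```

Because a model may not follow the template exactly, a check like missing_keys can flag replies that need to be regenerated.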

Both examples turn streaming off so that the completed JSON arrives all at once. The body of the HTTP response is itself JSON, and its response field holds the model's output as a JSON string. We parse both so that we can print the result with json.dumps and set the indent spacing to make the output easy to read.

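# Send the request; "stream": False in the body makes the server reply once.
response = requests.post("http://localhost:11434/api/generate", json=data, stream=False)
# First parse: the API response body.
json_data = json.loads(response.text)

# Second parse: the model's output, a JSON string in the "response" field.
print(json.dumps(json.loads(json_data["response"]), indent=2))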
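
The two-step parse can fail if the server reports a problem or the model returns something that is not valid JSON. The sketch below wraps the parsing with basic error handling. The function name extract_model_json is hypothetical, and the check for an error field assumes the server reports failures as a JSON object with that key. A hand-written string stands in for response.text so the block runs without a server.

```python
import json


def extract_model_json(raw_text: str) -> dict:
    """Decode the API envelope, then the JSON string in its "response" field."""
    envelope = json.loads(raw_text)  # first decode: the API response
    if "error" in envelope:
        raise RuntimeError(f"server reported an error: {envelope['error']}")
    try:
        return json.loads(envelope["response"])  # second decode: model output
    except (KeyError, json.JSONDecodeError) as exc:
        raise ValueError("model output was missing or not valid JSON") from exc


# A hand-written stand-in for response.text, so no server is needed here.
fake_text = json.dumps(
    {"model": "llama2", "response": '{"firstName": "Ilse"}', "done": True}
)
print(json.dumps(extract_model_json(fake_text), indent=2))
```

In the real scripts, extract_model_json(response.text) would take the place of the two json.loads calls.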