diff --git a/README.md b/README.md
index 7813c96..f5cd50e 100644
--- a/README.md
+++ b/README.md
@@ -216,6 +216,70 @@ Note that `chat_format` option must be set for the particular model you are usin
 
 Chat completion is available through the [`create_chat_completion`](https://llama-cpp-python.readthedocs.io/en/latest/api-reference/#llama_cpp.Llama.create_chat_completion) method of the [`Llama`](https://llama-cpp-python.readthedocs.io/en/latest/api-reference/#llama_cpp.Llama) class.
 
+### JSON and JSON Schema Mode
+
+If you want to constrain chat responses to only valid JSON or a specific JSON Schema, you can use the `response_format` argument to the `create_chat_completion` method.
+
+#### JSON Mode
+
+The following example will constrain the response to valid JSON.
+
+```python
+>>> from llama_cpp import Llama
+>>> llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
+>>> llm.create_chat_completion(
+    messages=[
+        {
+            "role": "system",
+            "content": "You are a helpful assistant that outputs in JSON.",
+        },
+        {"role": "user", "content": "Who won the World Series in 2020?"},
+    ],
+    response_format={
+        "type": "json_object",
+    },
+    temperature=0.7,
+)
+```
+
+#### JSON Schema Mode
+
+To constrain the response to a specific JSON Schema, you can use the `schema` property of the `response_format` argument.
+
+```python
+>>> from llama_cpp import Llama
+>>> llm = Llama(model_path="path/to/model.gguf", chat_format="chatml")
+>>> llm.create_chat_completion(
+    messages=[
+        {
+            "role": "system",
+            "content": "You are a helpful assistant that outputs in JSON.",
+        },
+        {"role": "user", "content": "Who won the World Series in 2020?"},
+    ],
+    response_format={
+        "type": "json_object",
+        "schema": {
+            "type": "object",
+            "properties": {"team_name": {"type": "string"}},
+            "required": ["team_name"],
+        },
+    },
+    temperature=0.7,
+)
+```
+
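+The completion is returned as a plain `dict` in the OpenAI chat-completion format, so the constrained output can be read from the first choice's message content. The snippet below is a minimal sketch of parsing that output with the standard library (it reuses the `llm` instance from the examples above):
+
+```python
+>>> import json
+>>> response = llm.create_chat_completion(
+    messages=[{"role": "user", "content": "Who won the World Series in 2020?"}],
+    response_format={"type": "json_object"},
+)
+>>> json.loads(response["choices"][0]["message"]["content"])
+```
+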
 ### Function Calling
 
 The high-level API also provides a simple interface for function calling.