Prerequisites

pip install cohere
  • Cohere api_key (for non-cached examples)

Setting up the Cohere Client

The constructor initializes the base class LM to support prompting requests to Cohere models. This requires the following parameters:


  • model (str): Name of the Cohere pretrained model to use. Defaults to command-xlarge-nightly.
  • api_key (Optional[str], optional): API provider authentication token, used internally to initialize cohere.Client. Defaults to None.
  • stop_sequences (List[str], optional): List of stop sequences that end generation.
  • max_num_generations (set internally): Maximum number of completions the Cohere client generates in a single request. Defaults to 5.

Example of the Cohere constructor:

class Cohere(LM):
    def __init__(
        self,
        model: str = "command-xlarge-nightly",
        api_key: Optional[str] = None,
        stop_sequences: List[str] = [],
    ):
        ...  # constructor body omitted

Under the Hood

__call__(self, prompt: str, only_completed: bool = True, return_sorted: bool = False, **kwargs) -> List[Dict[str, Any]]

Parameters:

  • prompt (str): Prompt to send to Cohere.
  • only_completed (bool, optional): Flag to return only completed responses and ignore completions truncated due to length. Defaults to True.
  • return_sorted (bool, optional): Flag to sort the completion choices by their returned average log-probabilities. Defaults to False.
  • **kwargs: Additional keyword arguments for the completion request.

Returns:

  • List[str]: List of generated completions.

Internally, the method handles the specifics of preparing the request prompt and corresponding payload to obtain the response.

The method calculates how many requests are needed to generate the requested number of completions n, given that a single request to the Cohere model can produce at most self.max_num_generations completions. On each iteration, it updates the num_generations argument in the request payload and calls request with the updated arguments.

This process iteratively constructs a choices list from which the generated completions are retrieved.
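The batching described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's actual code: plan_batches, collect_choices, and request_fn are hypothetical names, and request_fn stands in for whatever call sends one request with a given num_generations.

```python
import math
from typing import Any, Callable, List


def plan_batches(n: int, max_num_generations: int = 5) -> List[int]:
    """Split a request for n completions into per-request batch sizes."""
    if n < 1 or max_num_generations < 1:
        raise ValueError("n and max_num_generations must be positive")
    num_requests = math.ceil(n / max_num_generations)
    batches: List[int] = []
    remaining = n
    for _ in range(num_requests):
        # Every request asks for the maximum except possibly the last one.
        size = min(max_num_generations, remaining)
        batches.append(size)
        remaining -= size
    return batches


def collect_choices(
    n: int,
    request_fn: Callable[..., List[Any]],  # hypothetical single-request function
    max_num_generations: int = 5,
) -> List[Any]:
    """Call request_fn once per batch and accumulate the results."""
    choices: List[Any] = []
    for size in plan_batches(n, max_num_generations):
        choices.extend(request_fn(num_generations=size))
    return choices
```

For instance, with the default limit of 5, a request for 12 completions would be split into three requests of sizes 5, 5, and 2.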

If return_sorted is set and more than one generation is requested, the completions are sorted by their likelihood scores in descending order and returned as a list with the most likely completion appearing first.
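As a rough sketch of the sorting step, the snippet below orders completions by a likelihood score, highest first. The dictionary keys "text" and "likelihood" and the function name sort_by_likelihood are assumptions made for illustration; the actual fields in the Cohere response may differ.

```python
from typing import Any, Dict, List


def sort_by_likelihood(choices: List[Dict[str, Any]]) -> List[str]:
    """Return completion texts, most likely first.

    Assumes each choice is a dict with a "text" string and a numeric
    "likelihood" (for example, an average log-probability).
    """
    if len(choices) > 1:
        # Higher (less negative) log-probability means more likely.
        choices = sorted(choices, key=lambda c: c["likelihood"], reverse=True)
    return [c["text"] for c in choices]


example = [
    {"text": "Lyon", "likelihood": -2.3},
    {"text": "Paris", "likelihood": -0.1},
    {"text": "Nice", "likelihood": -1.7},
]
print(sort_by_likelihood(example))
```

Because sorted is stable, completions with equal scores keep their original relative order.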

Using the Cohere client

import dspy

cohere = dspy.Cohere(model='command-xlarge-nightly')

Sending Requests via Cohere Client

  1. Recommended: configure the default LM using dspy.configure.

This allows you to define programs in DSPy and simply call modules on your input fields, having DSPy internally call the prompt on the configured LM.


dspy.configure(lm=cohere)

# Example DSPy CoT QA program
qa = dspy.ChainOfThought('question -> answer')

response = qa(question="What is the capital of France?")  # Prompted to Cohere

  2. Generate responses using the client directly.

response = cohere(prompt='What is the capital of France?')

Written By: Arnav Singhvi