
Typed Predictors

In DSPy Signatures, we have InputField and OutputField, which define the nature of a signature's inputs and outputs. However, these fields are always str-typed, which requires string processing on both inputs and outputs.

Pydantic's BaseModel is a great way to enforce type constraints on fields, but it is not directly compatible with dspy.Signature. Typed Predictors bridge this gap by enforcing type constraints on the inputs and outputs of the fields in a dspy.Signature.
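For example, with an ordinary string-based signature, a numeric score comes back as text that you must parse and validate yourself. A minimal sketch (field names are illustrative):

import dspy

# With a plain string signature, every output arrives as a string.
plain = dspy.Predict("context, query -> answer, confidence")

# After calling the predictor, the confidence would need manual parsing:
# confidence = float(result.confidence)  # may raise ValueError on bad output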

Executing Typed Predictors

Using Typed Predictors is not much different from using any other module: you add type hints to the signature attributes and swap in a special predictor module for dspy.Predict. Let's take a look at a simple example.

Defining Input and Output Models

Let's take a simple task as an example: given a context and a query, the LLM should return an answer and a confidence score. We define our input and output models via Pydantic.

from pydantic import BaseModel, Field

class Input(BaseModel):
    context: str = Field(description="The context for the question")
    query: str = Field(description="The question to be answered")

class Output(BaseModel):
    answer: str = Field(description="The answer for the question")
    confidence: float = Field(ge=0, le=1, description="The confidence score for the answer")

As you can see, the Field descriptions document each attribute, and the ge/le constraints pin confidence to the range [0, 1].
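Because confidence carries ge=0 and le=1 constraints, Pydantic rejects out-of-range values at construction time. A quick sanity check:

from pydantic import ValidationError

try:
    Output(answer="a dog", confidence=1.5)  # violates le=1
except ValidationError as e:
    print(e)  # reports that confidence must be <= 1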

Creating a Typed Predictor

A Typed Predictor needs a Typed Signature, which extends a dspy.Signature by also specifying the type of each field:

class QASignature(dspy.Signature):
    """Answer the question based on the context and query provided, and on a scale of 0-1 tell how confident you are about the answer."""

    input: Input = dspy.InputField()
    output: Output = dspy.OutputField()

Now that we have the QASignature, let's define a Typed Predictor that executes this Signature while conforming to the type constraints.

predictor = dspy.TypedPredictor(QASignature)

Similar to other modules, we pass the QASignature to dspy.TypedPredictor, which enforces the type constraints.

And as with dspy.Predict, we can also use a "string signature", with the types written inline:

predictor = dspy.TypedPredictor("input:Input -> output:Output")

I/O in Typed Predictors

Now let's test the Typed Predictor by passing it some sample input and verifying the output type. We can create an Input instance and pass it to the predictor to get a Prediction back.

doc_query_pair = Input(
    context="The quick brown fox jumps over the lazy dog",
    query="What does the fox jump over?",
)

prediction = predictor(input=doc_query_pair)

Let's see the output and its type.

answer = prediction.output.answer
confidence_score = prediction.output.confidence

print(f"Prediction: {prediction}\n\n")
print(f"Answer: {answer}, Answer Type: {type(answer)}")
print(f"Confidence Score: {confidence_score}, Confidence Score Type: {type(confidence_score)}")

Typed Chain of Thoughts with dspy.TypedChainOfThought

Just as dspy.TypedPredictor is the typed counterpart of dspy.Predict, dspy.TypedChainOfThought is the typed counterpart of dspy.ChainOfThought:

cot_predictor = dspy.TypedChainOfThought(QASignature)

doc_query_pair = Input(
    context="The quick brown fox jumps over the lazy dog",
    query="What does the fox jump over?",
)

prediction = cot_predictor(input=doc_query_pair)
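In the DSPy version this guide targets, TypedChainOfThought works by prepending an extra reasoning output field to the signature, so the chain of thought is available alongside the typed output:

# `reasoning` is the intermediate field TypedChainOfThought adds
# (field name per this DSPy version).
print(prediction.reasoning)
print(prediction.output.answer)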

Typed Predictors as Decorators

While dspy.TypedPredictor and dspy.TypedChainOfThought provide a convenient way to use typed predictors, you can also apply them as decorators that enforce type constraints on a function's inputs and outputs. The Signature is built internally from the function's arguments, return annotation, and docstring.

@dspy.predictor
def answer(doc_query_pair: Input) -> Output:
    """Answer the question based on the context and query provided, and on a scale of 0-1 tell how confident you are about the answer."""
    pass

@dspy.cot
def answer(doc_query_pair: Input) -> Output:
    """Answer the question based on the context and query provided, and on a scale of 0-1 tell how confident you are about the answer."""
    pass

prediction = answer(doc_query_pair=doc_query_pair)

Composing Functional Typed Predictors in dspy.Module

If you're creating DSPy pipelines via dspy.Module, you can use functional typed predictors by defining class methods and decorating them. Here is an example that uses functional typed predictors to build a SimplifiedBaleen pipeline:

from dsp.utils import deduplicate
from dspy.functional import FunctionalModule, cot

class SimplifiedBaleen(FunctionalModule):
    def __init__(self, passages_per_hop=3, max_hops=1):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        self.max_hops = max_hops

    @cot
    def generate_query(self, context: list[str], question) -> str:
        """Write a simple search query that will help answer a complex question."""
        pass

    @cot
    def generate_answer(self, context: list[str], question) -> str:
        """Answer questions with short factoid answers."""
        pass

    def forward(self, question):
        context = []

        for _ in range(self.max_hops):
            query = self.generate_query(context=context, question=question)
            passages = self.retrieve(query).passages
            context = deduplicate(context + passages)

        answer = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=answer)
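A usage sketch, assuming an LM and a retrieval model have already been configured via dspy.settings (the question is illustrative):

baleen = SimplifiedBaleen(passages_per_hop=3, max_hops=2)
pred = baleen(question="How many storeys are in the castle that David Gregory inherited?")
print(pred.answer)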

Optimizing Typed Predictors

Typed predictors can be optimized on their Signature instructions through the optimize_signature optimizer. Here is an example of this optimization applied to the QASignature:

import dspy
from dspy.evaluate import Evaluate
from dspy.evaluate.metrics import answer_exact_match
from dspy.teleprompt.signature_opt_typed import optimize_signature

turbo = dspy.OpenAI(model='gpt-3.5-turbo', max_tokens=4000)
gpt4 = dspy.OpenAI(model='gpt-4', max_tokens=4000)
dspy.settings.configure(lm=turbo)

# `devset` is your development set: a list of dspy.Example objects.
evaluator = Evaluate(devset=devset, metric=answer_exact_match, num_threads=10, display_progress=True)

result = optimize_signature(
    student=dspy.TypedPredictor(QASignature),
    evaluator=evaluator,
    initial_prompts=6,
    n_iterations=100,
    max_examples=30,
    verbose=True,
    prompt_model=gpt4,
)
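The optimizer iterates over candidate instructions, scoring each with the evaluator. In the DSPy version that ships this optimizer, the returned object exposes the best program found (attribute name assumed below), which you can evaluate or use directly:

# `result.program` holds the optimized predictor (an assumption about
# this DSPy version's result object).
best_predictor = result.program
evaluator(best_predictor)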