Weaviate Retrieval Model
Weaviate is an open-source vector database that can be used to retrieve relevant passages before passing them to the language model. Weaviate supports a variety of embedding models from OpenAI, Cohere, Google, and more. Before building your DSPy program, you will need a running Weaviate cluster that already contains data. You can follow this notebook as an example.
Configuring the Weaviate Client
Weaviate is available as a hosted service (Weaviate Cloud, WCD) or as a self-managed instance. You can learn about the different installation methods here.
weaviate_collection_name (str): The name of the Weaviate collection
weaviate_client (WeaviateClient): An instance of the Weaviate client
k (int, optional): The number of top passages to retrieve. The default is 3
An example of the WeaviateRM constructor:
WeaviateRM(
    weaviate_collection_name: str,
    weaviate_client: WeaviateClient,
    k: int = 3
)
Under the Hood
forward(self, query_or_queries: Union[str, List[str]], k: Optional[int] = None, **kwargs) -> dspy.Prediction
Parameters
query_or_queries (Union[str, List[str]]): The query or queries to search for
k (Optional[int]): The number of top passages to retrieve. It defaults to self.k
**kwargs: Additional keyword arguments, for example rerank
Returns
dspy.Prediction: An object containing the retrieved passages
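To make the argument handling concrete, here is a minimal sketch. It is not the WeaviateRM source: SketchRetriever and its in-memory corpus are hypothetical stand-ins for a Weaviate collection, and a toy word-overlap score replaces vector search. It only illustrates how a single query or a list of queries might be normalized, and how k could fall back to self.k when it is not passed.

```python
from typing import List, Optional, Union


class SketchRetriever:
    """Hypothetical stand-in for a retriever; not the WeaviateRM implementation."""

    def __init__(self, corpus: List[str], k: int = 3):
        self.corpus = corpus
        self.k = k  # default number of passages returned per query

    def forward(
        self,
        query_or_queries: Union[str, List[str]],
        k: Optional[int] = None,
    ) -> List[str]:
        # Use the per-call k when given, otherwise the constructor default.
        k = k if k is not None else self.k
        # Accept either a single string or a list of strings.
        queries = (
            [query_or_queries]
            if isinstance(query_or_queries, str)
            else list(query_or_queries)
        )
        passages: List[str] = []
        for query in queries:
            terms = set(query.lower().split())
            # Toy relevance score: number of lowercase words shared with the query.
            ranked = sorted(
                self.corpus,
                key=lambda doc: len(terms & set(doc.lower().split())),
                reverse=True,
            )
            passages.extend(ranked[:k])
        return passages


retriever = SketchRetriever(
    corpus=[
        "quantum computing uses qubits",
        "vector databases store embeddings",
        "bread needs flour and water",
    ],
    k=2,
)
default_k = retriever.forward("quantum computing")        # falls back to self.k
override_k = retriever.forward("quantum computing", k=1)  # per-call override
multi = retriever.forward(["quantum computing", "vector databases"], k=1)
```

In this sketch a list of queries returns the top k passages for each query, concatenated in query order. The real retriever may merge or deduplicate results differently.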
Sending Retrieval Requests via the WeaviateRM Client
Here is an example that constructs WeaviateRM with an embedded Weaviate instance and sends a query:
import weaviate
import dspy
from dspy.retrieve.weaviate_rm import WeaviateRM
weaviate_client = weaviate.connect_to_embedded() # you can also use local or WCD
retriever_model = WeaviateRM(
    weaviate_collection_name="<WEAVIATE_COLLECTION>",
    weaviate_client=weaviate_client
)
results = retriever_model("Explore the significance of quantum computing", k=5)
for result in results:
    print("Document:", result.long_text, "\n")
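Because the retrieved passages are usually handed on to the language model, a small sketch of that step may be useful. The format_context helper below is hypothetical and is not part of DSPy or Weaviate. SimpleNamespace objects stand in for retrieved results and are assumed to expose a long_text attribute, as in the loop above.

```python
from types import SimpleNamespace
from typing import Iterable


def format_context(results: Iterable, max_passages: int = 3) -> str:
    """Join retrieved passages into a numbered context string (hypothetical helper)."""
    lines = []
    for i, result in enumerate(results, start=1):
        if i > max_passages:
            break
        # Collapse runs of whitespace so each passage stays on one line.
        text = " ".join(result.long_text.split())
        lines.append(f"[{i}] {text}")
    return "\n".join(lines)


# Stand-in results; real ones would come from the retriever call.
fake_results = [
    SimpleNamespace(long_text="Qubits can be in  superposition."),
    SimpleNamespace(long_text="Shor's algorithm factors integers."),
]
context = format_context(fake_results)
```

Numbering the passages makes it easier for a prompt to ask the model to cite which passage supports an answer. The cap on max_passages is one simple way to keep the prompt short.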
You can follow along with more DSPy and Weaviate examples here!