# 📓 🦜️🔗 LangChain Integration
TruLens provides TruChain, a deep integration with LangChain that allows you to inspect and evaluate the internals of applications built using LangChain. This is done through instrumentation of key LangChain classes. For a list of the classes instrumented, see Appendix: Instrumented _LangChain_ Classes and Methods.
In addition to the default instrumentation, TruChain exposes the select_context method for evaluations that require access to retrieved context. Exposing select_context bypasses the need to know the JSON structure of your app ahead of time, and makes your evaluations reusable across different apps.
## Example Usage
To demonstrate usage, we'll create a standard RAG defined with LCEL.
First, this requires loading data into a vector store.
```python
import bs4
from langchain.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the blog post, keeping only the article content.
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

# Split the documents and index them in a FAISS vector store.
embeddings = OpenAIEmbeddings()
text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
vectorstore = FAISS.from_documents(documents, embeddings)
```
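Conceptually, the splitter produces overlapping chunks so that neighboring chunks share context. Below is a minimal sketch of overlap-based splitting in plain Python; the character-window approach and the `chunk_size`/`overlap` values are illustrative only, not RecursiveCharacterTextSplitter's actual algorithm or defaults:

```python
def split_text(text: str, chunk_size: int = 20, overlap: int = 5) -> list[str]:
    # Slide a window of chunk_size characters, stepping by chunk_size - overlap,
    # so consecutive chunks share `overlap` characters of context.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghij" * 5)  # 50 characters -> 4 overlapping chunks
```

Each chunk's last 5 characters reappear at the start of the next chunk, which helps the retriever return self-contained passages.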
Then we can define the RAG chain using LCEL.
```python
from langchain import hub
from langchain.schema import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

retriever = vectorstore.as_retriever()
prompt = hub.pull("rlm/rag-prompt")
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
```
To instrument an LLM chain, all that's required is to wrap it using TruChain.
```python
from trulens_eval import TruChain

# Instrument the chain with TruChain.
tru_recorder = TruChain(rag_chain)
```
To properly evaluate LLM apps we often need to point our evaluation at an internal step of our application, such as the retrieved context. Doing so allows us to evaluate metrics including context relevance and groundedness.
For LangChain applications where the BaseRetriever is used, select_context can be used to access the retrieved text for evaluation.
```python
import numpy as np

from trulens_eval.feedback import Feedback
from trulens_eval.feedback.provider import OpenAI

provider = OpenAI()
context = TruChain.select_context(rag_chain)

f_context_relevance = (
    Feedback(provider.context_relevance)
    .on_input()
    .on(context)
    .aggregate(np.mean)
)
```
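Since several chunks are typically retrieved per query, the feedback function produces one score per chunk, and aggregate(np.mean) collapses them into a single record-level score. A toy illustration of that aggregation step, with made-up scores and the standard library's `fmean` standing in for `np.mean`:

```python
from statistics import fmean

# Hypothetical per-chunk relevance scores (made up for illustration).
per_chunk_scores = [0.9, 0.6, 0.3]

# The aggregation step collapses per-chunk scores into one record-level score.
aggregate_score = fmean(per_chunk_scores)
```

Other aggregators (e.g. `min` for a worst-case view of retrieval quality) can be swapped in the same way.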
For added flexibility, the select_context method is also made available through trulens_eval.app.App. This allows you to switch between frameworks without changing your context selector:
```python
from trulens_eval.app import App

context = App.select_context(rag_chain)
```
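To see why this helps: a context selector is conceptually just a path into the JSON-like record of an app invocation, and select_context works that path out for you. A toy sketch of the idea, using a hypothetical record layout (not TruLens's actual record schema):

```python
def select(record: dict, path: str):
    # Walk a dotted path like "app.retriever.output" through nested dicts.
    node = record
    for key in path.split("."):
        node = node[key]
    return node

# Hypothetical record of one app invocation.
record = {"app": {"retriever": {"output": ["chunk one", "chunk two"]}}}
contexts = select(record, "app.retriever.output")
```

Because the selector is resolved per framework, the same feedback definition can be reused when the underlying app structure changes.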
You can find the full quickstart here: LangChain Quickstart
## Async Support
TruChain also provides async support for LangChain through async methods such as acall and ainvoke. This allows you to track and evaluate async and streaming LangChain applications. As an example, below is an LLM chain set up with an async callback.
```python
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

from trulens_eval import TruChain

# Set up an async callback.
callback = AsyncIteratorCallbackHandler()

# Set up a simple question/answer chain with streaming ChatOpenAI.
prompt = PromptTemplate.from_template(
    "Honestly answer this question: {question}."
)
llm = ChatOpenAI(
    temperature=0.0,
    streaming=True,  # important
    callbacks=[callback],
)

async_chain = LLMChain(llm=llm, prompt=prompt)
```
Once you have created the async LLM chain you can instrument it just as before.
```python
async_tc_recorder = TruChain(async_chain)

with async_tc_recorder as recording:
    await async_chain.ainvoke(
        input=dict(question="What is 1+2? Explain your answer.")
    )
```
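The role the async iterator callback plays can be sketched with a plain asyncio queue: a producer stands in for the streaming LLM pushing tokens, and a consumer iterates them as they arrive. This is a simplified stand-in, not the actual AsyncIteratorCallbackHandler implementation:

```python
import asyncio

async def produce(queue: asyncio.Queue) -> None:
    # Stand-in for the streaming LLM emitting tokens one at a time.
    for token in ["1", "+", "2", "=", "3"]:
        await queue.put(token)
    await queue.put(None)  # sentinel: stream finished

async def consume(queue: asyncio.Queue) -> list:
    # Stand-in for the callback's iterator: collect tokens until the sentinel.
    tokens = []
    while (token := await queue.get()) is not None:
        tokens.append(token)
    return tokens

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    producer = asyncio.create_task(produce(queue))
    tokens = await consume(queue)
    await producer
    return tokens

tokens = asyncio.run(main())
```

The consumer sees partial output as soon as it is produced, which is what lets TruChain record streaming responses without waiting for the full completion.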
For more usage examples, check out the LangChain examples directory.
## Appendix: Instrumented LangChain Classes and Methods
The modules, classes, and methods that TruLens instruments can be retrieved from the appropriate Instrument subclass.

```python
from trulens_eval.tru_chain import LangChainInstrument

LangChainInstrument().print_instrumentation()
```
### Instrumenting other classes/methods
Additional classes and methods can be instrumented by use of the trulens_eval.instruments.Instrument methods and decorators. Examples of such usage can be found in the custom app used in the custom_example.ipynb notebook, which is defined in trulens_eval/examples/expositional/end2end_apps/custom_app/custom_app.py. More information about these decorators can be found in the docs/trulens_eval/tracking/instrumentation/index.ipynb notebook.
### Inspecting instrumentation
The specific objects (of the above classes) and methods instrumented for a particular app can be inspected using App.print_instrumented, as exemplified in the next cell. Unlike Instrument.print_instrumentation, this function shows only what was actually instrumented in an app.

```python
async_tc_recorder.print_instrumented()
```