LangChain vs. Transformers

Create a step-by-step guide for what is being discussed
Title: “🤗 Hugging Face Transformers Agent | LangChain comparisons”
Transcript: “Hugging Face just released the Transformers Agent this week, basically to use large language models to connect and execute Transformer models hosted on Hugging Face. If you're familiar with LangChain agents, Transformers Agent looks really familiar; it has a lot of similarities with LangChain agents. So in this video we're going to take a look at how Transformers Agent works and how it compares with LangChain agents. This figure describes how Transformers Agent works: you have a set of tools, like an image generator model, an image captioner model, a text-to-speech model. You can see all the model details here; for example, text-to-speech is using the SpeechT5 model that's hosted on Hugging Face. Now we give the Transformers Agent an instruction, 'read out loud the content of the image,' which is fed into a prompt template: your job is to come up with a series of Python commands that will perform the task, and you can use the following tools. The agent, which is a large language model, will try to understand the prompt and determine: okay, now I will use the image captioner to caption the image and use the text-to-speech to read it out loud. The output of this large language model is Python syntax, because in the prompt we asked the language model to come up with a series of commands in Python; that's why the output is Python code. This Python code gets fed into a Python interpreter to execute and generate the output. Let's take a look at how exactly it works in code. Let's just open up this Colab. First up, we need to install the needed packages, which is huggingface_hub, because a lot of the models are hosted on Hugging Face Hub, and then we need to log in using our Hugging Face Hub API key, which is free by the way, but you will need to set up an API key. We have three options for agents: you can choose StarCoder, OpenAssistant, or OpenAI. You can see the models here; the OpenAssistant option is using the Pythia model right now, Pythia 12B. For the OpenAI models you will need an OpenAI API key, and you will need to set up billing; by the way, this is not free. If you want to choose the other two options, those are completely free, but OpenAI will give you better results, so we're gonna go with OpenAI. There are two ways to use the agent: one is agent.run and one is agent.chat. agent.run doesn't keep any of the chat history; agent.chat keeps chat history. So here's an example where we ask the agent to generate an image of a boat in the water, and here you can see the steps. The first step is to understand the ask and decide: okay, I will use the following tool, image generator, to generate an image according to the prompt. The next step is to generate the code. As you can see here, we're downloading model files to actually execute this model, and as a result we can see we have a boat generated. The next example: can you caption the `boat` image? Here the boat image is the image we just generated, stored in the variable `boat`, because previously we defined this image as `boat`. Now our language model decided to use another tool, the image captioner, to generate a caption for the image; the Python code is this, and the answer is 'a boat floating in the water.' Another example: can you generate an image of a boat, and please read out loud the content of the image afterwards? This is like chaining different models together. For this one we use the image generator to generate the image, then the image captioner to make a caption for the image, and finally the text reader to generate the audio based on the caption of the image. Even though, as you can see here, the explanation from the agent only mentioned the image generator and the text reader, not the image captioner, the code it actually generated did include the image captioner. That's kind of interesting: it was able to generate the code even without the reasoning about needing this step. 'A wooden boat floating in the water,' 'a wooden boat floating on the water,' which is pretty good. The next example is to read out loud the summary of the Hugging Face page. We can see it's using the text downloader to download the text of the web page, then the summarizer function to create a summary of the text, and then the text reader to generate the audio of the text: 'Hugging Face is an AI community building the future.' As I mentioned earlier, the first way to use the agent is agent.run; the second way is agent.chat, which keeps memory across different runs. So here's an example: we get an image of a capybara using the image generator function, and then we can ask the agent to transform the image so that it snows. Now it's using the image transformer tool to transform the image, which is the image transformer function here, and you can see the image was changed. We can ask another question: show me a mask of the snowy capybaras. It was able to figure out that it needed the image segmenter to create a segmentation mask for the image, and it's using the image segmenter function. If you want to clear all the history and start from scratch, just call agent.prepare_for_new_chat and it will clear out everything. So, as a reminder, Transformers Agent is able to use all of the tools listed here; for example, when we create a caption of the image we're using the BLIP model, and when we create the audio from the text we're using the SpeechT5 model, and so on. Those are the tools that are currently supported. As you can see, a lot of the models here are designed for specific use cases, which makes the model more accurate and makes the process easier. For example, you can do text classification using large language models with certain prompt engineering, but with this approach we can simply use BART to do the text classification, and it's easier: you don't need to do any prompt engineering for specific tasks anymore, you just call the model that does the specific task. Also, those tools give the large language model capabilities it doesn't have on its own; for example, SpeechT5 allows us to convert text to speech, and large language models can't do that. The last section of this document is about creating new tools, which is pretty straightforward. I think they are using the Cat as a Service API, which I didn't know about before, to get random cats. We create a new class, CatImageFetcher, which inherits from the superclass Tool, and in this class we just need to implement the dunder call method to open the image from the API. When we call the tool, it actually gives us a random cat image. To use the tool in an agent, we pass the tool in the list given to the additional_tools parameter and then run the agent. The agent is able to use the cat fetcher tool to fetch the image and then use the image captioner to caption it, so as you can see here we get the cat from the cat image fetcher and the image captioner gives us the caption, and the result is 'a cat sitting on the top of the table.' In the next section I want to talk about how this is different from or similar to LangChain agents. First of all, let's talk about tools. Here are the tools available for Transformers Agent: you can see the majority of them are models, and there are some community-based tools for specific use cases. If we look at LangChain tools, however, most of them are external APIs, not models. So if you want to do Google Search, you can use large language models to connect with your Google Search API and do the search. There are some other options; you can also execute Python with the Python wrapper tool. Oh yeah, I want to talk about this one: LangChain already has Hugging Face tools, which means it can also load Hugging Face models, so I think LangChain can already do a lot of the things that Transformers Agent can do. Take a look at the code: this is the directory of tools for the Transformers Agent. Each tool is a standalone Python file, and if we take a look at what's inside, it's just a class for the tool. Oh, the speech-to-text is actually using the OpenAI Whisper model, that's interesting. You can easily contribute to this directory and add a tool. Similarly, in LangChain the tools are in this utilities directory; for example, this is the Python wrapper tool, and you can see there are a few lines of code for you to execute Python code. I think it should be fairly straightforward for either of them to use the other's tools. So that's tools. Secondly, I want to talk about the agent, or maybe the prompt to be more accurate: how does the agent choose which model to use for a specific task? Here is the prompt that Transformers Agent is using: your job is to come up with a series of commands in Python that will perform the task; to help you, I will give you access to a set of tools that you can use; each tool is a Python function and has a description explaining the task it performs, the input it expects, and the output it returns; you should first explain which tool you will use to perform the task and for what reason, and then write the code in Python. Then the prompt gives it the set of tools that we looked at, and then it provides few-shot learning: it gives a few examples for the large language model to learn from. So we have a task, the explanation of which tool to use (the translator tool), and the answer is to use the translator function and also the image QA function, and so on. It gives a few of these answers, and finally the user-defined prompt is added at the end. This is pretty straightforward: it just gives the model a list of tools and some examples to follow. For LangChain, however, it's a little bit different. There are four different agent types, each using a different kind of framework and tools. The default is to use the ReAct framework to determine which tool to use, and this is based on the ReAct paper. ReAct is short for reasoning and acting, meaning that it does not only give an action for what to do, it also gives some thoughts; I guess that's similar in essence to the explanation that Transformers Agent is using. Finally, one of the biggest differences between Transformers agents and LangChain agents is that the purpose is different. Transformers Agent seems to have a narrower scope right now, because it's only for coming up with Python commands and executing Python code, whereas LangChain is not just limited to executing Python: of course it can execute Python, but it can also do a lot of other things. LangChain has a much bigger scope, for doing pretty much everything. So again, Transformers Agent is still experimental, and I am looking forward to seeing what happens next. Thank you for watching, bye.”

Step-by-Step Guide: Hugging Face Transformers Agent | LangChain Comparisons

Here is a step-by-step guide summarizing the discussion on Hugging Face Transformers Agent and comparing it to LangChain agents:

1. Introduction:
– Hugging Face recently released the Transformers agent, which connects and executes Transformer models hosted on Hugging Face.
– The Transformers agent shares similarities with LangChain agents.
– The guide aims to explain how the Transformers agent works and how it compares to LangChain agents.

2. Understanding the Transformers Agent:
– The Transformers agent utilizes a set of tools, such as image generator models, image captioner models, and text-to-speech models.
– Each tool performs a specific task, and the agent determines which tools to use based on the given prompt.
– The output of the agent is Python code, which is generated based on the prompt’s instructions.
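The flow above can be illustrated with a toy sketch: tool descriptions go into a prompt, an LLM (faked here with a canned response) returns Python code, and that code is executed with the tools in scope. All tool names, strings, and the `fake_llm` stand-in below are made up for illustration; they are not the real Transformers Agent implementation.

```python
# Toy sketch of the agent loop: prompt -> LLM -> Python code -> interpreter.
TOOLS = {
    "image_captioner": lambda image: f"a caption of {image}",
    "text_reader": lambda text: f"<audio of: {text}>",
}

def build_prompt(task):
    tool_list = "\n".join(f"- {name}" for name in TOOLS)
    return ("Your job is to come up with a series of Python commands that "
            "will perform the task. You can use the following tools:\n"
            f"{tool_list}\nTask: {task}")

def fake_llm(prompt):
    # A real LLM would generate this code from the prompt; hard-coded here.
    return "caption = image_captioner(image)\naudio = text_reader(caption)"

def run_agent(task, image):
    code = fake_llm(build_prompt(task))
    scope = dict(TOOLS, image=image)   # tools + inputs visible to the code
    exec(code, scope)                  # the "Python interpreter" step
    return scope["audio"]
```

Running `run_agent("Read out loud the content of the image", "boat.png")` chains the captioner into the reader, mirroring the figure described in the video.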

3. Setting Up the Environment:
– Install the required packages, primarily “huggingface_hub,” to access the models hosted on Hugging Face.
– Log in using the Hugging Face Hub API key. Note that the API key is free, but it requires setup.

4. Choosing the Agent Type:
– The Transformers agent offers three backend options: StarCoder, OpenAssistant, and OpenAI.
– StarCoder and OpenAssistant are free to use; OpenAI requires a paid API key but generally gives better results.
– Select the desired agent type based on your requirements.
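As a sketch, initializing each backend might look like the helper below (assuming the experimental tools API shipped in transformers v4.29+; the inference-endpoint URLs and model names are illustrative and may have changed since the release):

```python
# Hypothetical helper mapping the three agent options to backends.
AGENT_OPTIONS = {
    "starcoder": "https://api-inference.huggingface.co/models/bigcode/starcoder",
    "openassistant": ("https://api-inference.huggingface.co/models/"
                      "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5"),
    "openai": "text-davinci-003",
}

def make_agent(choice, openai_api_key=None):
    """Build an agent for the chosen backend. Imports are deferred so this
    module only needs `transformers` when an agent is actually created."""
    if choice == "openai":
        from transformers import OpenAiAgent  # paid; needs billing set up
        return OpenAiAgent(model=AGENT_OPTIONS["openai"], api_key=openai_api_key)
    from transformers import HfAgent  # StarCoder / OpenAssistant are free
    return HfAgent(AGENT_OPTIONS[choice])
```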

5. Using the Agent:
– The Transformers agent provides two main ways to interact: “agent.run” and “agent.chat.”
– “agent.run” does not keep chat histories, while “agent.chat” retains memory across different runs.
– Examples are provided to demonstrate the usage of the agent with different prompts and instructions.

6. Clearing Chat History:
– If you want to clear the chat history and start anew, use the “agent.prepare_for_new_chat” function.
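The interaction styles from steps 5 and 6 can be sketched as follows; `agent` stands for any agent object exposing `run`, `chat`, and `prepare_for_new_chat` (such as the agents from the transformers tools API), and the prompts follow the video's examples:

```python
def demo(agent):
    # agent.run: stateless; each call is independent. Keyword arguments
    # (here `boat`) pass variables into the generated code's scope.
    boat = agent.run("Generate an image of a boat in the water")
    agent.run("Can you caption the `boat` image?", boat=boat)

    # agent.chat: keeps memory across turns, so follow-ups can refer
    # back to earlier results.
    agent.chat("Generate an image of a capybara")
    agent.chat("Transform the image so that it snows")

    # Clear all history and start from scratch.
    agent.prepare_for_new_chat()
```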

7. Supported Tools:
– The Transformers agent supports various tools, including image generators, image captioners, text-to-speech models, etc.
– Each tool is designed for specific use cases, enhancing accuracy and simplifying the process.
– Prompt engineering is not required for specific tasks; instead, models dedicated to those tasks can be called directly.
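As a sketch of that idea, a dedicated checkpoint can be called directly through the transformers `pipeline` API instead of prompt-engineering a general LLM. This assumes transformers is installed; `facebook/bart-large-mnli` is a commonly used zero-shot classification checkpoint, matching the BART example from the video:

```python
def classify(text, candidate_labels):
    """Classify `text` with a task-specific BART model; no prompt
    engineering needed. The import is deferred because the model
    weights are downloaded on first use."""
    from transformers import pipeline
    clf = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
    result = clf(text, candidate_labels=candidate_labels)
    return result["labels"][0]  # highest-scoring label
```

For example, `classify("The boat drifts on calm water", ["travel", "finance"])` would return whichever label the model scores highest.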

8. Creating New Tools:
– The guide introduces creating new tools using the “Cat as a Service” API as an example.
– Define a new class for the tool, inherit from the superclass “Tool,” and implement the required methods.
– Include the new tool in the list of additional tools for the agent to utilize it.
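A minimal sketch of such a tool is below, mirroring the attributes the transformers `Tool` interface uses (`name`, `description`, `__call__`). In real use the class would subclass `transformers.Tool`; `cataas.com` is the "Cat as a Service" endpoint mentioned in the video, and the class name here is illustrative:

```python
import urllib.request

class CatImageFetcher:
    """Hypothetical custom tool: fetches a random cat picture."""
    name = "cat_fetcher"
    description = ("This tool fetches an image of a random cat from the "
                   "internet and returns it as raw image bytes.")

    def __call__(self):
        # Network call: download one random cat image from the API.
        with urllib.request.urlopen("https://cataas.com/cat") as resp:
            return resp.read()
```

The tool would then be handed to the agent via something like `additional_tools=[CatImageFetcher()]` at construction time, so a prompt such as "fetch a cat and caption it" can pick it up alongside the built-in tools.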

9. Comparison with LangChain Agents:
– Discusses the differences and similarities between Transformers agents and LangChain agents.
– Transformers agents primarily use models as tools, while LangChain agents integrate external APIs for various tasks.
– Both frameworks could potentially use each other’s tools, since each tool is ultimately a small amount of Python code.

10. Purpose and Scope:
– Transformers agents have a more specific scope, focusing on generating Python commands and executing Python code.
– LangChain agents have a broader scope and can perform various tasks beyond Python execution.

11. Conclusion:
– Transformers agents are still experimental, and future developments are awaited.
– LangChain agents offer more versatility and capabilities beyond Python execution.

This step-by-step guide provides an overview of the discussed topics related to Hugging Face Transformers Agent and LangChain agents, allowing readers to understand the key points covered in the transcript.
Extract only the installation steps and create an install file where you can
Installation Steps for Hugging Face Transformers Agent:

1. Install the required packages:
– Make sure you have the “huggingface_hub” package installed.
– Use the package manager of your choice (e.g., pip) to install it:
```
pip install huggingface_hub
```

2. Set up Hugging Face Hub API:
– To access the models hosted on Hugging Face, you need to log in using the Hugging Face Hub API key.
– Obtain your API key by signing up on the Hugging Face website.
– Once you have the API key, set it up by following the instructions provided by Hugging Face.
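A sketch of the login step is below (it assumes `huggingface_hub` is installed and that the token is supplied via a hypothetical `HF_TOKEN` environment variable; `login` is the authentication helper from `huggingface_hub`):

```python
import os

def hub_login():
    """Authenticate with the Hugging Face Hub using a token from the
    environment; fails early if no token is configured."""
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("Set HF_TOKEN to your Hugging Face API token")
    from huggingface_hub import login  # deferred import
    login(token=token)
```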

3. Choose the Agent Type:
– Hugging Face Transformers Agent offers three options: StarCoder, OpenAssistant, and OpenAI.
– Each agent type utilizes different models and frameworks.
– Consider your requirements and choose the appropriate agent type.

4. Additional Notes:
– Keep in mind that while the Hugging Face Hub API key is free, you may need to set up billing for certain options, such as OpenAI models.
– Make sure to consult the official Hugging Face documentation for detailed information on using the Transformers Agent.

Please note that these installation steps assume you have a Python environment set up.