Hugging Face x LangChain: A new partner package




We’re thrilled to announce the launch of langchain_huggingface, a partner package in LangChain jointly maintained by Hugging Face and LangChain. This new Python package is designed to bring the power of the latest developments at Hugging Face into LangChain and keep it up to date.

All Hugging Face-related classes in LangChain were coded by the community, and while we thrived on this, over time some of them became deprecated because of the lack of an insider’s perspective.

By becoming a partner package, we aim to reduce the time it takes to bring new features available in the Hugging Face ecosystem to LangChain’s users.

langchain-huggingface integrates seamlessly with LangChain, providing an efficient and effective way to use Hugging Face models within the LangChain ecosystem. This partnership is not just about sharing technology but also about a joint commitment to maintain and continually improve this integration.



Getting Started

Getting started with langchain-huggingface is straightforward. Here’s how you can install and begin using the package:

pip install langchain-huggingface

Now that the package is installed, let’s have a tour of what’s inside!



The LLMs



HuggingFacePipeline

Among transformers, the Pipeline is the most versatile tool in the Hugging Face toolbox. Since LangChain is designed primarily to address RAG and Agent use cases, the scope of the pipeline here is reduced to the following text-centric tasks: "text-generation", "text2text-generation", "summarization", and "translation".

Models can be loaded directly with the from_model_id method:

from langchain_huggingface import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="microsoft/Phi-3-mini-4k-instruct",
    task="text-generation",
    pipeline_kwargs={
        "max_new_tokens": 100,
        "top_k": 50,
        "temperature": 0.1,
    },
)
llm.invoke("Hugging Face is")

Or you can also define the pipeline yourself before passing it to the class:

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_4bit=True,  # load the weights with 4-bit quantization (requires bitsandbytes)
)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=100, top_k=50, temperature=0.1)
llm = HuggingFacePipeline(pipeline=pipe)
llm.invoke("Hugging Face is")

When using this class, the model is loaded in cache and runs on your computer’s hardware; thus, you may be limited by the resources available on your machine.
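If you have a GPU available, you can also ask from_model_id to place the model on it. A minimal sketch, assuming a recent langchain-huggingface release where from_model_id accepts a device argument:

from langchain_huggingface import HuggingFacePipeline

# Sketch: run the pipeline on the first CUDA device (device=0); use -1 for CPU.
# device_map="auto" (with the accelerate library) is an alternative for larger models.
gpu_llm = HuggingFacePipeline.from_model_id(
    model_id="microsoft/Phi-3-mini-4k-instruct",
    task="text-generation",
    device=0,
    pipeline_kwargs={"max_new_tokens": 100},
)
gpu_llm.invoke("Hugging Face is")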



HuggingFaceEndpoint

There are also two ways to use this class. You can specify the model with the repo_id parameter. These endpoints use the serverless API, which is particularly beneficial to people using pro accounts or the Enterprise Hub. Still, regular users already have access to a fair number of requests by connecting with their HF token in the environment where they are executing the code.

from langchain_huggingface import HuggingFaceEndpoint

# Serverless API: specify the model by its repo_id
llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
    task="text-generation",
    max_new_tokens=100,
    do_sample=False,
)
llm.invoke("Hugging Face is")
# Alternatively, point the class at your own dedicated Inference Endpoint or TGI deployment:
llm = HuggingFaceEndpoint(
    endpoint_url="",
    task="text-generation",
    max_new_tokens=1024,
    do_sample=False,
)
llm.invoke("Hugging Face is")

Under the hood, this class uses the InferenceClient to be able to serve a wide variety of use cases, from the serverless API to deployed TGI instances.
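To make the “under the hood” part concrete, here is a rough sketch of the client the class delegates to. The InferenceClient from huggingface_hub can target either the serverless API (by repo id) or a deployed TGI instance (by URL); the exact parameters HuggingFaceEndpoint forwards are simplified here.

from huggingface_hub import InferenceClient

# Serverless API: address the model by repo id (uses your saved HF token, if any).
client = InferenceClient(model="meta-llama/Meta-Llama-3-8B-Instruct")
print(client.text_generation("Hugging Face is", max_new_tokens=100))

# Deployed TGI instance: pass its URL instead (placeholder URL).
# client = InferenceClient(model="https://your-tgi-endpoint.example")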



ChatHuggingFace

Every model has its own special tokens with which it works best. Without those tokens added to your prompt, your model will greatly underperform.

When going from a list of messages to a completion prompt, there is an attribute that exists in most LLM tokenizers called chat_template.

To learn more about chat_template in the different models, visit this space I made!

This class is a wrapper around the other LLMs. It takes a list of messages as input and then creates the correct completion prompt using the tokenizer.apply_chat_template method.

from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    endpoint_url="",
    task="text-generation",
    max_new_tokens=1024,
    do_sample=False,
)
llm_engine_hf = ChatHuggingFace(llm=llm)
llm_engine_hf.invoke("Hugging Face is")
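Because ChatHuggingFace is a chat model, you can also pass it a list of LangChain messages instead of a bare string; a short sketch:

from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Tell me about Hugging Face."),
]
# The chat template of the underlying model is applied to these messages
# before the completion prompt is sent to the endpoint.
llm_engine_hf.invoke(messages)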

The code above is equivalent to:


llm.invoke("[INST] Hugging Face is [/INST]")


llm.invoke("""<|begin_of_text|><|start_header_id|>user<|end_header_id|>Hugging Face is<|eot_id|><|start_header_id|>assistant<|end_header_id|>""")



The Embeddings

Hugging Face is full of very powerful embedding models that you can directly leverage in your pipeline.

First, choose your model. One good resource for choosing an embedding model is the MTEB leaderboard.



HuggingFaceEmbeddings

This class uses sentence-transformers embeddings. It computes the embeddings locally, hence using your computer’s resources.

from langchain_huggingface.embeddings import HuggingFaceEmbeddings

model_name = "mixedbread-ai/mxbai-embed-large-v1"
hf_embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
)
texts = ["Hello, world!", "How are you?"]
hf_embeddings.embed_documents(texts)
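For a single query, for instance at retrieval time, the standard embed_query method returns one vector:

query_embedding = hf_embeddings.embed_query("What is Hugging Face?")
print(len(query_embedding))  # dimensionality of the embedding vector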



HuggingFaceEndpointEmbeddings

HuggingFaceEndpointEmbeddings is very similar to what HuggingFaceEndpoint does for LLMs, in the sense that it also uses the InferenceClient under the hood to compute the embeddings.
It can be used with models on the Hub and with TEI instances, whether they are deployed locally or online.

from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings

hf_embeddings = HuggingFaceEndpointEmbeddings(
    model= "mixedbread-ai/mxbai-embed-large-v1",
    task="feature-extraction",
    huggingfacehub_api_token="",
)
texts = ["Hello, world!", "How are you?"]
hf_embeddings.embed_documents(texts)
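To target a locally deployed TEI instance rather than a model on the Hub, one option is to pass the endpoint URL as the model. A sketch, assuming a TEI container is serving on the placeholder URL below:

hf_embeddings_tei = HuggingFaceEndpointEmbeddings(
    model="http://localhost:8080",  # placeholder URL of a running TEI instance
    task="feature-extraction",
)
hf_embeddings_tei.embed_documents(["Hello, world!"])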



Conclusion

We’re committed to making langchain-huggingface better by the day. We will be actively monitoring feedback and issues and working to address them as quickly as possible. We will also be adding new features and functionality, expanding the package to support an even wider range of the community’s use cases. We strongly encourage you to try this package and give us your opinion, as it will pave the way for the package’s future.


