This is a guest blog post by the XLSCOUT team.
XLSCOUT, a Toronto-based leader in the use of AI in intellectual property (IP), has developed a powerful proprietary embedding model called ParaEmbed 2.0, stemming from an ambitious collaboration with Hugging Face’s Expert Support Program. The collaboration focuses on applying state-of-the-art AI technologies and open-source models to improve the understanding and evaluation of complex patent documents, including patent-specific terminology, context, and relationships. This enables XLSCOUT’s products to offer the best performance for drafting patent applications, running patent invalidation searches, and ensuring ideas are novel compared with previously available patents and literature.
By fine-tuning on high-quality, multi-domain patent data curated by human experts, ParaEmbed 2.0 boasts a remarkable 23% increase in accuracy compared with its predecessor, ParaEmbed 1.0, which was released in October 2023. With this advancement, ParaEmbed 2.0 can now accurately capture context and map patents against prior art, ideas, products, or standards with even greater precision.
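Mapping a patent against prior art with an embedding model usually comes down to comparing vectors. The following is a minimal sketch, not XLSCOUT's implementation: the three-dimensional vectors and document names are made up for illustration, and cosine similarity is an assumed (though common) choice of metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_prior_art(query_vec, corpus):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in corpus.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; a real model produces hundreds of dimensions.
corpus = {
    "prior_art_A": [0.9, 0.1, 0.0],
    "prior_art_B": [0.1, 0.9, 0.2],
    "prior_art_C": [0.7, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]

ranking = rank_prior_art(query, corpus)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In this toy setup the documents whose vectors point in nearly the same direction as the query rank first. A more accurate embedding model improves the ranking by placing genuinely related documents closer together in that vector space.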
The journey towards enhanced patent evaluation
Initially, XLSCOUT explored proprietary AI models for patent evaluation, but found that these closed-source models, such as GPT-4 and text-embedding-ada-002, struggled to capture the nuanced context required for technical and specialized patent claims.
By integrating open-source models like BGE-base-v1.5, Llama 2 70B, Falcon 40B, and Mixtral 8x7B, and fine-tuning on proprietary patent data with guidance from Hugging Face, XLSCOUT achieved more tailored and performant solutions. This shift allowed for a more accurate understanding of intricate technical concepts and terminologies, revolutionizing the evaluation and understanding of technical documents and patents.
Collaborating with Hugging Face via the Expert Support Program
The collaboration with Hugging Face has been instrumental in enhancing the quality and performance of XLSCOUT’s solutions. Here’s a detailed overview of how this partnership has evolved and its impact:
- Initial development and testing: XLSCOUT initially built and tested a custom TorchServe inference server on Google Cloud Platform (GCP) with Distributed Data Parallel (DDP) for serving multiple replicas. By integrating ONNX optimizations, they achieved a throughput of roughly 300 embeddings per second.
- Enhanced model performance via fine-tuning: Fine-tuning of an embedding model was performed using data curated by patent experts. This workflow not only enabled more precise and contextually relevant embeddings, but also significantly improved the performance metrics, ensuring higher accuracy in detecting relevant prior art.
- High throughput serving: By leveraging Hugging Face’s Inference Endpoints with built-in load balancing, XLSCOUT now serves embedding models with Text Embedding Inference (TEI) for a high-throughput use case running successfully in production. The solution now achieves impressive performance, delivering ~2700 embeddings per second!
- LLM prompting and inference: The collaboration has included efforts around LLM prompt engineering and inference, which enhanced the model’s ability to generate accurate and context-specific patent drafts. Prompt engineering was employed for patent drafting use cases, ensuring that the prompts resulted in coherent, comprehensive, and legally sound patent documents.
- Fine-tuning LLMs with instruction data: Instruction data formatting and fine-tuning were implemented using models from Meta and Mistral. This fine-tuning allowed for much more precise and detailed generation for some parts of the patent drafting process, further improving the quality of the generated output.
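The throughput figures in the list above can be put in perspective with a little arithmetic, and the same kind of measurement is easy to reproduce for any embedding backend. The sketch below is illustrative only: `fake_embed` is a hypothetical stand-in for a real call to an embedding server, and the batch size of 32 is an arbitrary assumption, not a value from XLSCOUT's deployment.

```python
import time


def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def measure_throughput(texts, embed_fn, batch_size=32):
    """Embed `texts` in batches and return (embeddings, embeddings per second)."""
    embeddings = []
    started = time.perf_counter()
    for batch in batched(texts, batch_size):
        embeddings.extend(embed_fn(batch))
    elapsed = time.perf_counter() - started
    rate = len(embeddings) / elapsed if elapsed > 0 else float("inf")
    return embeddings, rate


def fake_embed(batch):
    """Hypothetical stand-in for a real embedding call; returns dummy vectors."""
    return [[float(len(text))] for text in batch]


texts = [f"claim {i}" for i in range(100)]
vectors, rate = measure_throughput(texts, fake_embed)
print(f"embedded {len(vectors)} texts at {rate:.0f} embeddings/s")

# Speed-up implied by the figures quoted above (~300 -> ~2700 embeddings/s).
speedup = 2700 / 300
print(f"speed-up: {speedup:.0f}x")
```

Swapping `fake_embed` for a function that sends each batch to a real server would give a comparable embeddings-per-second number. Batching matters because per-request overhead is paid once per batch rather than once per text.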
The partnership with Hugging Face has been a game-changer for XLSCOUT, significantly improving the processing speed, accuracy, and overall quality of their LLM-driven solutions. This collaboration ensures that universities, law firms, and other clients benefit from cutting-edge AI technologies, driving efficiency and innovation within the patent landscape.
XLSCOUT’s AI-based IP Solutions
XLSCOUT provides state-of-the-art AI-driven solutions that significantly enhance the efficiency and accuracy of patent-related processes. Their solutions are widely leveraged by corporations, universities, and law firms to streamline various facets of IP workflows, from novelty searches and invalidation studies to patent drafting.

- Novelty Checker LLM: Leverages cutting-edge LLMs and Generative AI to swiftly navigate through patent and non-patent literature to validate your ideas. It delivers a comprehensive list of ranked prior art references alongside a key feature evaluation report. This tool enables inventors, researchers, and patent professionals to be certain that inventions are novel by comparing them against the extensive corpus of existing literature and patents.
- Invalidator LLM: Utilizes advanced LLMs and Generative AI to conduct patent invalidation searches with exceptional speed and accuracy. It provides a detailed list of ranked prior art references and a key feature evaluation report. This service is crucial for law firms and corporations to efficiently challenge and assess the validity of patents.
- Drafting LLM: An automated patent application drafting platform that harnesses the power of LLMs and Generative AI. It generates precise, high-quality preliminary patent drafts, encompassing comprehensive claims, abstracts, drawings, backgrounds, and descriptions within a few minutes. This solution helps patent practitioners significantly reduce the effort and time required to produce detailed and precise patent applications.
Corporations and universities benefit by ensuring that novel research outputs are appropriately protected, encouraging innovation, and filing high-quality patents. Law firms use XLSCOUT’s solutions to deliver superior service to their clients, improving the quality of their patent prosecution and litigation efforts.
A partnership for innovation
“We’re thrilled to collaborate with Hugging Face”, said Mr. Sandeep Agarwal, CEO of XLSCOUT. “This partnership combines the unparalleled capabilities of Hugging Face’s open-source models, tools, and team with our deep expertise in patents. By fine-tuning these models with our proprietary data, we’re poised to revolutionize how patents are drafted, analyzed, and licensed.”
The joint efforts of XLSCOUT and Hugging Face involve training open-source models on XLSCOUT’s extensive patent data collection. This synergy harnesses the specialized knowledge of XLSCOUT and the advanced AI capabilities of Hugging Face, leading to models uniquely optimized for patent research. Users will benefit from more informed decisions and invaluable insights derived from complex patent documents.
Commitment to innovation and future plans
As pioneers in applying AI to intellectual property, XLSCOUT is dedicated to exploring new frontiers in AI-driven innovation. This collaboration marks a significant step towards bridging the gap between cutting-edge AI and real-world applications in IP evaluation.
Together, XLSCOUT and Hugging Face are setting new standards in patent evaluation, driving innovation, and shaping the future of intellectual property. We’re excited to continue this journey together!
To learn more about Hugging Face’s Expert Support Program for your company, please get in touch with us here. Our team will contact you to discuss your requirements!
