Hugging Face has released a JavaScript machine learning (ML) library for the web that runs Transformer models directly in the browser without a server, enabling developers to build sophisticated AI applications that run entirely client-side.
Hugging Face announced that 'Transformers.js' v3, which can efficiently load and run Transformer architecture-based models in the browser, has been released as open source.
The new version of Transformers.js introduces a new quantization format that makes resource-intensive Transformer-based models usable in the browser. Quantization is a compression technique that reduces model size and improves processing speed on resource-constrained platforms such as web browsers.
It supports 120 model architectures across natural language processing, computer vision, audio, and multimodal tasks, including 'Phi-3', 'Gemma 2', and 'LLaVA', and inference speed is improved by up to 100 times compared to the previous version. In addition, 25 new example projects and templates are included, and more than 1,200 pre-converted custom models are available.
Separately, Hugging Face, in collaboration with Amazon Web Services (AWS), announced 'HUGS (Hugging Face Generative AI Services)', a service to help companies deploy and run large language models (LLMs) on a wide range of hardware.
The service overlaps with Nvidia's Inference Microservices (NIM), which Nvidia has been promoting, and the two are drawing attention as competitors.
Like Nvidia NIM, HUGS is a containerized model image that packages everything users need to deploy AI models.
Instead of manually optimizing LLMs at scale with vLLM or TensorRT-LLM, users can pull pre-configured container images on container platforms such as Docker or Kubernetes and connect to them through standard open APIs.
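A deployment in this style might look like the following docker-compose sketch. This is a hypothetical illustration of the pattern the article describes, not an official HUGS configuration: the image name, port, and service name are placeholders.

```yaml
# Hypothetical compose file for a containerized LLM image (placeholder names).
services:
  llm:
    image: example-registry/hugs-model:latest   # placeholder, not a real HUGS image path
    ports:
      - "8080:80"   # expose the server's API port on the host
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia   # request one GPU for inference
              count: 1
              capabilities: [gpu]
```

Once the container is running, applications talk to it over HTTP using the standard API the image exposes, rather than linking against an inference engine directly.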
HUGS was developed on top of the open-source Text Generation Inference (TGI) framework and the Transformers library, so it can be deployed on a variety of hardware platforms, including Nvidia and AMD GPUs.
Support will also be expanded to specialized AI accelerators such as Amazon's Inferentia inference chips and Google's TPUs.
Hugging Face offers HUGS for $1 per hour on Amazon's and Google's cloud computing services, as well as on DigitalOcean, a specialized cloud computing company. Companies can also download HUGS and run it in their own data centers.
Reporter Park Chan cpark@aitimes.com