Crowdworks “Self-constructed high-quality data has definite market value”


(Photo = Crowdworks)

Scale AI is currently gaining momentum in the US. A month ago, the company attracted $1 billion in investment at a valuation of $13.8 billion (roughly 19 trillion won) from big tech firms such as Amazon, Meta, Cisco, Intel, AMD, and NVIDIA. Within five years of its founding, it has grown into a ‘decacorn’, a company valued at over $10 billion.

Scale AI focuses on data preprocessing for artificial intelligence (AI) training; that is, it is a frontrunner in data labeling. With AI adoption at companies in full swing, data labeling is one of the most popular fields.

The importance of data is self-evident. In particular, the world’s best-performing model has recently been replaced almost every month, and open-source models are closely following closed ones such as ‘GPT-4’, ‘Claude’, and ‘Gemini’. In other words, as models converge at a high level of performance, it is now said that AI performance depends on data.

In Korea, Crowdworks (CEO Kim Woo-seung) is in a similar position. Established in 2017, the company has led the domestic AI data field for about seven years and recorded sales of 23.9 billion won, its highest ever, after successfully listing on the KOSDAQ last year.

The domestic industry has also seen a sharp increase in AI adoption this year, and inquiries about building datasets have been rising rapidly. A Crowdworks official said, “Recently, the domestic industry has also been unanimously emphasizing the importance of data.”

At the end of April, the company released WorksOne, a small language model (sLM) specialized for domestic corporate work, and announced that it had built its own data. Since then, inquiries about purchasing the dataset have been pouring in.

In particular, field-specific small language models (sLMs) are dominant in Korea. As a result, the preference for high-quality data is more pronounced. “I felt that self-constructed high-quality data definitely had market value,” the official said.

Crowdworks’ dataset is a high-quality Korean dataset built ‘directly’ by skilled staff, not a simple machine translation. It has 10,000 datasets based on language frequently used in corporate business environments.

This is because expert data professionals were deployed to analyze the data characteristics required for each sector, such as finance, distribution, and public institutions, and to build a dataset that reflects frequently used business terms and expressions.

Moreover, the company has announced that it will expand its scope to a full-stack service, including data collection and processing for AI model training, data construction for fine-tuning, customized model development, model evaluation, and verification.

(Photo = Crowdworks)

The company is also expanding its target beyond the domestic market to overseas. “In addition to European exhibitions, such as our participation in ‘Vivatech’ in Paris last month, we will actively expand our overseas business by participating in a US AI exhibition scheduled to be held in August,” the official said. At Vivatech, the company reportedly confirmed the level of interest in the European market, with Crown Prince Guillaume of Luxembourg visiting the Korean company’s booth.

He continued, “It is difficult to say that the European market is our primary target, especially because of strict regulations, but we believe that we will be able to sufficiently meet the demand with our solid technological capabilities,” expressing confidence.

Meanwhile, Crowdworks recently signed an MOU with Lenovo to accelerate the development of enterprise-tailored LLMs using high-performance computing infrastructure.

Reporter Jang Se-min semim99@aitimes.com
