The number of artificial intelligence (AI) training datasets that the AI Industry Convergence Project Group (head Kim Jun-ha) provides to companies and research institutes through its ‘Data Distribution Portal’ has risen sharply, from 3 last year to 66 this year.
The project group's data distribution portal ran as a pilot service until last February and officially launched in mid-March.
The project group announced on the 2nd that the portal, which held only 3 datasets in May of last year, grew to 66 datasets in just one year following the service preparation process. The number of data providers has also increased to nine, including Amitek, Korea Photonics Technology Institute, and Brainnet, along with the project group itself.
The types of datasets have also diversified into nine fields: automobiles; energy; healthcare; cultural content; science and technology; agriculture, livestock and fisheries; disaster safety; national environment; and others.
Of the 66 datasets provided through the data distribution portal, 62 are free. There are no restrictions on using these data for public and research purposes. Commercial use, however, requires consultation with the data provider.
Among them, the ▲CODD dataset ▲MSD dataset ▲photovoltaic power generation data have been popular. The CODD dataset is a synthetic dataset containing lidar data from multiple vehicles. The MSD dataset is a dataset for identifying general-purpose algorithms for medical image classification. The portal also offers a variety of other data, such as AI training data related to violent situations and theoretical data on material properties.
Reporter Hojeong Na hojeong9983@aitimes.com


