Zero initialized active learning with spectral clustering using Hungarian method

Supervised machine learning tasks often require a large amount of labeled training data to build a model, and prediction, for example classification, is then carried out based on this model. Nowadays a tremendous amount of data is available on the web or in data warehouses, although only a port...

Bibliographic details
Author:	Papp Dávid
Corporate author:	Conference of PhD Students in Computer Science (12.) (2020) (Szeged)
Document type:	Article
Published:	University of Szeged, Institute of Informatics, Szeged, 2021
Series:	Acta cybernetica 25 No. 2
Keywords:	Algorithm, Programming
Subjects:	Natural sciences; Computer and information sciences
doi:	10.14232/actacyb.288006
Online Access:	http://acta.bibl.u-szeged.hu/75616


LEADER	03015nab a2200241 i 4500
001	acta75616
005	20220512153417.0
008	220512s2021 hu o 0\|\| eng d
022			\|a 0324-721X
024	7		\|a 10.14232/actacyb.288006 \|2 doi
040			\|a SZTE Egyetemi Kiadványok Repozitórium \|b hun
041			\|a eng
100	1		\|a Papp Dávid
245	1	0	\|a Zero initialized active learning with spectral clustering using Hungarian method \|h [electronic document] / \|c Papp Dávid
260			\|a University of Szeged, Institute of Informatics \|b Szeged \|c 2021
300			\|a 401-419
490	0		\|a Acta cybernetica \|v 25 No. 2
520	3		\|a Supervised machine learning tasks often require a large amount of labeled training data to build a model, and prediction, for example classification, is then carried out based on this model. Nowadays a tremendous amount of data is available on the web or in data warehouses, although only a portion of those data is annotated, and the labeling process can be tedious, expensive and time-consuming. Active learning tries to overcome this problem by reducing the labeling cost, allowing the learning system to iteratively select the data from which it learns. In a special case of active learning, the process starts from a zero initialized scenario, where the labeled training dataset is empty and therefore only unsupervised methods can be applied. In this paper a novel query strategy framework is presented for this problem, called Clustering Based Balanced Sampling Framework (CBBSF), which not only selects the initial labeled training dataset but also selects the items uniformly among the categories to obtain a balanced labeled training dataset. The framework includes an assignment technique to implicitly determine the class membership probabilities. The assignment solution is updated during the CBBSF iterations, hence it simulates supervised machine learning more accurately as the process progresses. The proposed Spectral Clustering Based Sampling (SCBS) query strategy realizes the CBBSF framework, and therefore it is applicable in the special zero initialized situation. This selection approach uses ClusterGAN (Clustering using Generative Adversarial Networks) integrated into the spectral clustering algorithm, and then selects an unlabeled instance depending on the class membership probabilities. Global and local versions of SCBS were developed; furthermore, most confident and minimal entropy measures were calculated, so four different SCBS variants were examined in total. Experimental evaluation was conducted on the MNIST dataset, and the results showed that SCBS outperforms the state-of-the-art zero initialized active learning query strategies.
650		4	\|a Natural sciences
650		4	\|a Computer and information sciences
695			\|a Algorithm, Programming
710			\|a Conference of PhD Students in Computer Science (12.) (2020) (Szeged)
856	4	0	\|u http://acta.bibl.u-szeged.hu/75616/1/cybernetica_025_numb_002_401-419.pdf \|z Document access
