Makine öğrenmesi algoritmalarının karmaşıklık ve doygunluk analizinin bir veri kümesi üzerinde gerçekleştirilmesi

Demirhan, Tolga

dc.authorid	TR15636	en_US
dc.contributor.advisor	Uçar, Özlem
dc.contributor.author	Demirhan, Tolga
dc.date.accessioned	2017-04-05T11:34:15Z
dc.date.available	2017-04-05T11:34:15Z
dc.date.issued	2015
dc.department	Enstitüler, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Ana Bilim Dalı	en_US
dc.description	Doktora Tezi	tr
dc.description.abstract	Bu çalışmada makine öğrenmesi algoritmaları kullanılarak eğitim alanında bir veri kümesi üzerinde doygunluk ve karmaşıklık analizi gerçekleştirilmiştir. Toplanan veriler arasında anlamsız ve eksik bulunan veriler temizlenerek 570 örneğe sahip bir veri kümesi elde edilmiştir. Sontest özniteliğinde yapılan veri dönüştürme işlemi ile 21 sınıflı, 5 sınıflı ve 2 sınıflı veri kümeleri elde edilmiştir. Weka'nın sahip olduğu LWL, J48, JRIP, Part, LMT, Bagging, Random Forest, IBK, MultiLayer Perceptron, Voted Perceptron, SMO, Naïve Bayes sınıflandırma algoritmaları veri kümeleri üzerinde çalıştırılmıştır. 21 ve 5 sınıflı sontest özniteliğine sahip veri kümelerinden elde edilen başarının rastlantısal olduğu ve veri kümelerinin dengesiz olduğu sonucuna ulaşılmıştır. 2 sınıflı sontest özniteliğine sahip veri kümesinde algoritmalar çalıştırılmış sadece Naïve Bayes ve Voted Perceptron algoritmalarında verinin örnekleme yoğunluğunun doygunluk seviyesine ulaştığı sonucu çıkarılmıştır. Veri kümelerinin karmaşıklığını belirlemek üzere IBK, SMO, Voted Perceptron, J48 ve Naïve Bayes algoritmaları 2 sınıflı sontest özniteliğine sahip veri kümesine uygulanmıştır. Karmaşıklık analizinde verinin lineer olduğu durumlarda başarılı sonuçlar veren bir algoritma olan Voted Perceptron algoritması en iyi sonuçları vermiştir. Yapılan karmaşıklık deneylerinde farklı üs değerleri için algoritmanın lineerliği değiştirilmiş, üs değeri arttıkça doğru sınıflandırma oranının düşmesi kullanılan veri kümesinin lineer olduğunu göstermiştir. Sınıflandırma gücü yüksek IBK algoritması ve destek karar makineleri (SVM) ile yapılan deneylerde eğitim verisi ile aşırı uyum (overfitting) durumu ortaya çıkmıştır.	en_US
dc.description.abstract	In this study, a complexity and saturation analysis was performed on a data set from the field of education using machine learning algorithms. Meaningless and incomplete records were removed from the collected data, yielding a data set of 570 instances. Transforming the 'sontest' (post-test) attribute produced data sets with 21, 5 and 2 classes. The Weka classification algorithms LWL, J48, JRIP, Part, LMT, Bagging, Random Forest, IBK, MultiLayer Perceptron, Voted Perceptron, SMO and Naïve Bayes were run on these data sets. For the data sets with a 21-class and a 5-class 'sontest' attribute, it was concluded that the accuracy achieved was due to chance and that the data sets were imbalanced. When the algorithms were run on the data set with a 2-class 'sontest' attribute, the sampling density of the data was found to reach saturation only with the Naïve Bayes and Voted Perceptron algorithms. To determine the complexity of the data, the IBK, SMO, Voted Perceptron, J48 and Naïve Bayes algorithms were applied to the 2-class data set. In the complexity analysis, the best results were obtained with the Voted Perceptron algorithm, which performs well when the data are linear. In the complexity experiments, the linearity of the algorithm was varied by using different exponent values; the correct classification rate fell as the exponent increased, which indicated that the data set is linear. Overfitting to the training data was observed in the experiments with IBK, which has high classification power, and with support vector machines (SVM).	en_US
dc.identifier.uri	https://hdl.handle.net/20.500.14551/1830
dc.identifier.yoktezid	414126	en_US
dc.language.iso	tr	en_US
dc.publisher	Trakya Üniversitesi Fen Bilimleri Enstitüsü	en_US
dc.relation.publicationcategory	Tez	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Makina Öğrenmesi	en_US
dc.subject	Karmaşıklık ve Doygunluk Analizi	en_US
dc.subject	Sınıflandırma Algoritmaları	en_US
dc.subject	Machine Learning	en_US
dc.subject	Complexity and Saturation Analysis	en_US
dc.subject	Classification Algorithms	en_US
dc.title	Makine öğrenmesi algoritmalarının karmaşıklık ve doygunluk analizinin bir veri kümesi üzerinde gerçekleştirilmesi	en_US
dc.title.alternative	Performing a complexity and saturation analysis of machine learning algorithms on a data cluster	en_US
dc.type	Doctoral Thesis	en_US

Files

Original bundle

Name: Untitled
Size: 3.14 MB
Format: Adobe Portable Document Format
Description: Full Text

Collection

Fen Bilimleri Enstitüsü Tez Koleksiyonu