A. Document Frequency (DF)
Document frequency is the number of documents in a dataset in
which a term occurs. It is the simplest criterion for term
selection and scales easily to large datasets, since its
computational complexity is linear. The basic assumption of this
method is that terms appearing in only a few documents are either
unimportant or will not influence clustering efficiency. It is a
simple but effective feature selection method for text
categorization [9].
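
As a rough illustration of DF-based selection, the sketch below counts document frequencies in a single pass and keeps terms that reach a threshold. The function names, the `min_df` parameter, and the toy corpus are assumptions made for this example; the text above does not prescribe a particular threshold.

```python
from collections import Counter


def document_frequency(docs):
    """Return, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        # set() ensures a term is counted once per document,
        # however many times it occurs inside that document.
        df.update(set(doc))
    return df


def select_by_df(docs, min_df):
    """Keep terms whose document frequency is at least min_df (hypothetical cutoff)."""
    df = document_frequency(docs)
    return {term for term, count in df.items() if count >= min_df}


# Toy corpus: each document is a list of already-tokenized terms.
docs = [
    ["apple", "banana", "apple"],
    ["banana", "cherry"],
    ["banana", "apple", "durian"],
]

df = document_frequency(docs)
kept = select_by_df(docs, min_df=2)
```

Each document is visited once, which matches the linear complexity mentioned above. With `min_df=2`, the terms that occur in a single document ("cherry" and "durian") are dropped.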
B. Term Contribution (TC)
Because a simple method like DF assumes that each term is equally
important in every document, it is easily biased by common terms
that have a high document frequency but a uniform distribution
over the different classes. TC was proposed to deal with this
problem [10].
We first introduce TF.IDF (Term Frequency Inverse Document
Frequency) [11]. TF.IDF combines the frequency of a term within a
document with the document frequency of the term. The idea is that
a term appearing in too many documents is too common to be
important for clustering, which is why the inverse document
frequency is taken into account. That is, a term is important if
its frequency in a document is high and it appears in few
documents. A common form of TF.IDF is

\[ f(t, d) = TF(t, d) \times \log \frac{N}{DF(t)} \]

where TF(t, d) is the frequency of term t in document d, DF(t) is
the document frequency of t, and N is the total number of
documents.
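
A minimal sketch of TF.IDF weighting follows, assuming the widely used form TF × log(N / DF) with the natural logarithm and raw counts as TF. The function name and the toy corpus are illustrative only, and other TF.IDF variants (smoothed or normalized) would give different numbers.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: weight} dict per document, using TF * log(N / DF)."""
    n_docs = len(docs)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts within this document
        weights.append(
            {term: count * math.log(n_docs / df[term]) for term, count in tf.items()}
        )
    return weights


docs = [
    ["apple", "banana", "apple"],
    ["banana", "cherry"],
    ["banana", "apple", "durian"],
]

weights = tfidf_weights(docs)
```

In this toy corpus "banana" occurs in every document, so log(N / DF) is zero and its weight vanishes, which is the "too common" effect described above.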



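The result of text clustering depends heavily on document
similarity, so the contribution of a term can be viewed as its
contribution to the similarity between documents. The similarity
between documents D_i and D_j is computed by the dot product:

\[ sim(D_i, D_j) = \sum_{t} f(t, D_i) \times f(t, D_j) \]

where f(t, D) is the TF.IDF weight of term t in document D.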
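
One way to make the idea of a term's contribution concrete, assumed here for illustration, is to sum the products of a term's weights over all ordered pairs of distinct documents, TC(t) = Σ_{i≠j} f(t, D_i) × f(t, D_j). The sketch below computes this in one pass using the identity Σ_{i≠j} a_i a_j = (Σ a_i)² − Σ a_i². The function names and the weight values are hypothetical.

```python
from collections import defaultdict


def similarity(doc_a, doc_b):
    """Dot product of two sparse {term: weight} vectors."""
    return sum(w * doc_b[t] for t, w in doc_a.items() if t in doc_b)


def term_contribution(weights):
    """Sum of w_i * w_j over ordered pairs of distinct documents, per term."""
    total = defaultdict(float)     # sum of weights per term
    total_sq = defaultdict(float)  # sum of squared weights per term
    for doc in weights:
        for term, w in doc.items():
            total[term] += w
            total_sq[term] += w * w
    # (sum)^2 - sum of squares removes the i == j terms.
    return {term: total[term] ** 2 - total_sq[term] for term in total}


# Hypothetical term weights for three documents.
weights = [
    {"apple": 2.0, "banana": 1.0},
    {"banana": 1.0, "cherry": 3.0},
    {"apple": 1.0, "banana": 1.0},
]

tc = term_contribution(weights)

# Total similarity over ordered pairs of distinct documents.
pair_total = sum(
    similarity(weights[i], weights[j])
    for i in range(len(weights))
    for j in range(len(weights))
    if i != j
)
```

Under this definition the contributions of all terms add up to the total pairwise similarity, and a term that occurs in only one document ("cherry" here) contributes nothing.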




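C. Term Variance Quality
The term variance quality method was introduced by Inderjit
Dhillon, Jacob Kogan and Charles Nicholas [12]. It follows the
ideas of Salton and McGill [13]. The quality of the term t_i is
measured as follows:

\[ q(t_i) = \sum_{j=1}^{n} f_{ij}^2 - \frac{1}{n} \left[ \sum_{j=1}^{n} f_{ij} \right]^2 \]

where n is the number of documents in which t_i occurs at least
once, and f_{ij} \geq 1, j = 1, ..., n, is the frequency of t_i in
the j-th of those documents.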
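
The sketch below illustrates a term variance quality score, assuming the form q(t) = Σ f_j² − (1/n)(Σ f_j)², where the sums run over the n documents that contain the term and f_j is the term's raw count in each of them. The function name and the toy corpus are assumptions made for this example.

```python
from collections import Counter, defaultdict


def term_variance_quality(docs):
    """Return q(t) = sum(f^2) - (sum(f))^2 / n for every term in the corpus."""
    # For each term, collect its count in every document that contains it,
    # so each collected frequency is at least 1.
    freqs = defaultdict(list)
    for doc in docs:
        for term, count in Counter(doc).items():
            freqs[term].append(count)

    quality = {}
    for term, f in freqs.items():
        n = len(f)  # number of documents containing the term
        quality[term] = sum(x * x for x in f) - sum(f) ** 2 / n
    return quality


docs = [
    ["apple", "banana", "apple"],
    ["banana", "cherry"],
    ["banana", "apple", "durian"],
]

quality = term_variance_quality(docs)
```

A term whose count is the same in every document that contains it ("banana" here) scores zero, while a term whose count varies across documents ("apple") scores higher.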