Dissimilarities Detections in Arabic and English Texts

- Using n-grams, Histograms and Self Organizing Maps

Book
  • Format
  • Book, paperback
  • English
  • 128 pages

Description

The main goal of our research is to apply mathematical methods to uncover anomalies and discrepancies in texts. English and Arabic texts were analyzed with respect to many statistical characteristics. We found some basic statistical differences between the lengths of the words used in the two languages, and the results were applied in heuristics for measuring the dissimilarity of text parts. In the research we prepared three methods for the analysis of texts:
(1) Element n-gram profiles method: The method is based on the similarity or dissimilarity of n-gram occurrences in text parts compared with the full text.
(2) Histogram method: Histograms of text sequences are analyzed from a clustering point of view. If the cluster dispersion is not large, the text was probably written by a single author. If the cluster dispersion is large, the text is considered critical: it is split into two or more parts, and the same analysis is repeated on each part.
(3) Neural networks (systems of Self-Organizing Maps): The systems were trained on input sequences, and after training they identify text parts with anomalies using a cumulative error and some more complex analysis.
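To make the n-gram profile idea more concrete, here is a minimal Python sketch that compares the character n-gram profile of each sliding window with the profile of the full text. The description does not say which dissimilarity measure the book uses. The normalized squared-difference formula below, the function names, and the window, step and n values are all assumptions chosen for illustration.

```python
from collections import Counter


def ngram_profile(text: str, n: int = 3) -> dict[str, float]:
    """Return relative frequencies of the character n-grams in `text`."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).items()}


def dissimilarity(part: dict[str, float], full: dict[str, float]) -> float:
    """Compare a part profile with the full-text profile.

    Assumed measure (not taken from the book): the mean over the part's
    n-grams of ((2 * (fp - ff)) / (fp + ff)) ** 2, divided by 4 so that
    the result lies in [0, 1]. 0 means identical frequencies; 1 means
    none of the part's n-grams occur in the full profile.
    """
    if not part:
        return 0.0
    total = 0.0
    for gram, fp in part.items():
        ff = full.get(gram, 0.0)
        total += (2.0 * (fp - ff) / (fp + ff)) ** 2
    return total / (4.0 * len(part))


def window_scores(text: str, window: int = 200, step: int = 100,
                  n: int = 3) -> list[tuple[int, float]]:
    """Score each sliding window of `text` against the whole text."""
    full = ngram_profile(text, n)
    scores = []
    for start in range(0, max(len(text) - window, 0) + 1, step):
        chunk = text[start:start + window]
        scores.append((start, dissimilarity(ngram_profile(chunk, n), full)))
    return scores
```

Windows whose score is well above the typical score for the document would be candidates for closer inspection. How a threshold is chosen, and whether characters or words are the better "elements" for Arabic versus English, is left open here.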

Details
  • Language: English
  • Pages: 128
  • Publication date: 5 January 2018
  • ISBN-13: 9786202302715
  • Publisher: Scholars Press
  • Format: Paperback
Size and weight
  • Weight: 209 g
  • Depth: 0.8 cm
  • Dimensions: 15 cm × 22 cm
