
Хорижий филология (Foreign Philology), No. 3, 2018


                     MORPHOLOGICAL ANALYSIS BY FINITE STATE TRANSDUCER FOR
                                   UZBEK-ENGLISH MACHINE TRANSLATION

                                               Abdurakhmonova Nilufar,
                   Tashkent State University of Uzbek Language and Literature named after Alisher Navoi
                                                    Tuliyev Ulugbek,
                               National University of Uzbekistan named after Mirzo Ulugbek

Key words: morphological rules, morphophonological rules, automatic morphological analyser, machine translation.

I.      Introduction

Machine translation is a process of interaction between human and computer. It depends not only on computational technology but also on the interdisciplinary sciences involved in understanding text. Therefore, for translation between English and Uzbek, whose structures and peculiarities differ, morphological aspects must be studied before the translation stage.

Over the last 30 years, numerous studies have been carried out to create technologies for computational morphology. Work on morphological analyzers for Turkic languages began in the early 1960s [1]. A morphological analyzer is necessary in machine translation to split words into their components and to identify the grammatical paradigms of the target language. Uzbek is an agglutinative language and English is an inflectional one. Therefore, languages like these have a great many morphemes. A morpheme is the smallest meaningful unit of a lexeme. There are two kinds of morpheme: stems and affixes. The stem gives the main meaning of the lexeme, and the affix adds grammatical or semantic meaning to the word. There are many ways to combine morphemes to create words. Four of these methods are common and play important roles in speech and language processing: inflection, derivation, compounding, and cliticization [2]. In Uzbek the number of possible inflectional affixes is considerably larger than in non-Turkic languages, because nearly all parts of speech can appear in inflected form in context: noun: bola+jon+lar+im+dagi+lar+niki+mas+mi+kan+a; simple verb: o‘qi+t+tir+ma+yot+gan+lig+i+ni; compound verb: mashq qil+dir+ish+ayot+gan+lar; verbal compound: ber+dir+tir+ib yubor+ma+yot+gan+dan+mi+kan+a; and so on.

I.I.  Morphotactic opportunity in Uzbek language.

Morphotactics also plays a central role in morphological parsing. After morphological parsing, the components of the text are analyzed semantically. Consequently, all legal and illegal positions of morphemes have to be considered. In Uzbek, the morphemes of a word are ordered as follows: (1) prefix + (2) root + (3) derivational affix + (4) lexical affix + (5) grammatical affix, e.g. (1)ham(2)qishloq(3)lik(4)lar(5)imiz(5)dan. In English the order is (1) prefix + (2) root + (3) lexical suffix + (4) grammatical suffix, e.g. (1)co(2)work(3)er(4)s. Although the two models are alike, Uzbek grammatical affixes correspond to prepositions and adverbs in English.

Morphological recognition has emerged as a major sub-problem of machine translation for Turkic languages, because a morphological dictionary is a database in which linguistic information can be stored.

Sometimes identifying the morphotactic model of a word is somewhat problematic when morphemes can be compounded, as in yog‘ingarchilik, zargarchilik and paxtachilik. The first word cannot be broken into parts, because there is no yog‘in+garchilik, whereas zar+gar, as the name of an occupation, is used separately from +chilik, and paxtachilik divides as paxta+chi+lik. As a result, there are three forms of morpheme: garchilik, gar+chilik, chi+lik. Therefore we consider the length of the string as a morpheme in Uzbek. We assume that there are nine letters of


