"After speech recognition, we want to get the computer to read Kabyle aloud by itself"


In this interview, we spoke with Mesṭafa Kamal about his final-year project, "Deep Learning-based speech recognition for the Kabyle language": how he came to choose it, the tools he used, and other matters.

Aɣmis n yimaziɣen: Who is Mesṭafa Kamal?

Mesṭafa Kamal: I am an engineer from the National Higher School of Computer Science (ESI) in Algiers. This June I completed my final-year project, titled "Deep Learning-based speech recognition for the Kabyle language". Deep Learning can be rendered in Kabyle as Almad Alqayan. It is a recent technique used today in the field of artificial intelligence (tigzi n tmacint).

How did the idea for this project come to you?

I have long taken part in all kinds of localization (asideg) and digitization (asemḍen) projects for the Kabyle language: Mozilla Foundation projects such as Common Voice, Tatoeba, and the localization of digital platforms. As a computer science student, I tried to contribute what I could whenever I had free time. That is how I came to understand the value that computing tools can bring to establishing and developing a language, especially our own Kabyle language. The era when writing was the primary tool for developing a language has passed. Today, people do not read as much as their minds are turned toward the phone or the Internet. For that reason, bringing Kabyle into the digital world stands out among all the means by which we can establish our language. When the time came to choose my final-year project, I wanted to work on something connected to these localization projects already under way. I wrote to Mr. Muḥend Belqasem, an engineer who has worked extensively in the field of localization. We met and talked on several occasions. We looked at which project could realistically be carried out now, and we chose this one: "Deep Learning-based speech recognition for the Kabyle language".

Could you explain this work of yours (in some depth) so that readers can understand it?

In my work, I built a speech recognition system, or in other words, an automatic speech recognition system. This has long existed for many languages, such as English or French. This time, we want it to exist for Kabyle too. It is a tool that enables a computer or a phone, when you speak to it in Kabyle, to write out what you said, as sentences or as text, in keeping with the spelling rules. In detail, speech recognition goes through two main stages. The first is feature extraction, in which we extract the sounds from speech. For this I used a method called MFCC, well known to those who work in sound analysis. This method can recognize sounds in much the same way the human ear perceives them. The second stage is decoding those sounds: that is, we have to link the sounds to the letters of Kabyle. That is precisely what Deep Learning does, or in Kabyle, Tigzi talqayant. It is a method that uses a neural network to establish the relations between sounds and the letters of the language. In the end, we get the sentence that corresponds to what we said to the computer. However, to do Deep Learning, we need a very large dataset so that the computer learns properly. I used more than 25,000 sentences in Kabyle, about 260 hours of speech, all of it taken from the Common Voice platform.
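To make the first stage more concrete, here is a minimal sketch of MFCC feature extraction in Python with NumPy. It is an illustration only, not the code used in the project: the function names `mel_filterbank` and `mfcc` are hypothetical, and the parameters (16 kHz audio, 25 ms frames, 10 ms hop, 26 mel filters, 13 coefficients) are common defaults assumed here, not values taken from the interview.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale (hypothetical helper)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an array of shape (n_frames, n_coeffs).

    Assumes `signal` is a 1-D float array with at least `frame_len` samples.
    """
    # Pre-emphasis boosts high frequencies.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Cut the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Mel filterbank energies, then log compression (roughly ear-like scaling).
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # DCT-II decorrelates the log energies; keep the first n_coeffs.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T

# Usage: one second of a synthetic 440 Hz tone stands in for recorded speech.
tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
features = mfcc(tone)
print(features.shape)  # (98, 13)
```

Each row of `features` describes one short slice of audio. In a system like the one described above, rows of this kind would be the input that the neural network learns to map to letters.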

What is the main goal of all this?

For the ordinary user, speech recognition can be used in many areas. We can mention:

  • Voice search, on the Internet or on the computer;
  • Voice dictation: you can write a text without using the keyboard, that is, by speech alone;
  • Voice control: we can build this system into everyday objects, such as a car or a cooling appliance, so that when we want it to do something, we simply give the command, without touching any buttons.

Who lent you a helping hand with this project?

This project is not something one person can easily complete alone. Those who helped me in everything I did are Mr. Ɛebdelkrim Aries, my teacher at the school; Mr. Muḥend Belqasem, whom I mentioned earlier, an engineer who works extensively on localizing software projects into Kabyle; and Mr. Alexandre Lissy, a Mozilla Foundation engineer who showed me a great deal about how to carry my project through to completion.

What difficulties did you encounter in the field?

In the field, I encountered no difficulties other than computing hardware. For that, it was the Mozilla Foundation that provided me with the computing resources with which to run training on the Common Voice data.

What do you expect from Kabyles in general?

For my project to exist and reach completion, as I said, I used an audio dataset taken from Common Voice. Common Voice is a Mozilla Foundation platform where people, wherever they are, can record sentences in their own voice in order to build a dataset. Kabyle joined Common Voice several years ago. When I started the project, there were 260 hours of speech. We have now passed 500 hours. It should be understood that the larger the dataset we have, the more words the speech recognition system we produce will recognize. That is why I call on Kabyles, wherever they are, to take part in the Kabyle Common Voice project, so that we reach 1,200 hours of speech by next year. At that point, our system will recognize more than 90% of what you say to it. I hope my call reaches every Kabyle, so that they go to Common Voice and record their voice. Those who know the spelling rules of Kabyle should go to Sentence Collector to contribute sentences that will then be offered for recording in Common Voice.

What other projects do you have planned for the future?

Now that we have finished speech recognition, we need to turn it into a website so that people can use it. Or else release it as a mobile application so that it can be used on phones. Looking ahead, we hope to start another project, called TTS (Text to Speech). It is a project through which we will enable the computer to read Kabyle aloud by itself.

Closing words:

Thank you for giving me the opportunity to talk about what I have done. I thank all the Common Voice volunteers, because without them this project would not exist. I thank Muḥend Belqasem and Alexandre Lissy for their help and encouragement. I hope to continue in this way with all these projects connected to Kabyle. I likewise hope that computer science students and engineers will join us, so that our language moves forward.

Interview by Hocine Moula
