Parallel text

The Rosetta Stone, a stele engraved with the same decree in both of the Ancient Egyptian scripts as well as Ancient Greek. Its discovery was key to deciphering the Ancient Egyptian language.

A parallel text is a text placed alongside its translation or translations.[1][2] Parallel text alignment is the identification of the corresponding sentences in both halves of the parallel text. The Loeb Classical Library and the Clay Sanskrit Library are two examples of dual-language series of texts. Reference Bibles may contain the original languages and a translation, or several translations by themselves, for ease of comparison and study; Origen's Hexapla (Greek for "sixfold") placed six versions of the Old Testament side by side. A famous example is the Rosetta Stone, whose discovery made it possible to begin deciphering the Ancient Egyptian language.

Large collections of parallel texts are called parallel corpora (see text corpus). Alignment of parallel corpora at the sentence level is a prerequisite for many areas of linguistic research. Alignment is a non-trivial task because, during translation, the translator may split, merge, delete, insert or reorder sentences.
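
The splits, merges, deletions and insertions described above can be illustrated with a minimal length-based aligner. This is only a sketch under stated assumptions: the bead types, the penalty values and the cost function are hand-picked for illustration and are not taken from any published aligner, and reordering is not handled at all.

```python
from typing import List, Tuple

# Allowed alignment "beads": (source sentences consumed, target sentences consumed).
# The penalties are arbitrary illustrative values, not tuned on any data.
BEADS = {
    (1, 1): 0.0,  # one source sentence matches one target sentence
    (1, 0): 3.0,  # source sentence deleted by the translator
    (0, 1): 3.0,  # target sentence inserted by the translator
    (2, 1): 1.0,  # two source sentences merged into one
    (1, 2): 1.0,  # one source sentence split into two
}

Bead = Tuple[List[str], List[str]]


def bead_cost(src: List[str], tgt: List[str], penalty: float) -> float:
    """Penalty plus the relative difference in character length."""
    s_len = sum(len(s) for s in src)
    t_len = sum(len(t) for t in tgt)
    return penalty + abs(s_len - t_len) / max(s_len, t_len, 1)


def align(source: List[str], target: List[str]) -> List[Bead]:
    """Return the cheapest sequence of beads covering both texts in order."""
    n, m = len(source), len(target)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    # Dynamic programming over prefixes of both texts.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + bead_cost(source[i:ni], target[j:nj], penalty)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the beads.
    beads: List[Bead] = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((source[pi:i], target[pj:j]))
        i, j = pi, pj
    beads.reverse()
    return beads
```

Calling `align(source_sentences, target_sentences)` returns a list of bead pairs, each holding zero, one or two sentences per side. Practical aligners typically rely on richer evidence, such as probabilistic length models or bilingual lexicons; character length alone is a weak signal.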

Parallel texts may be used in language education.[3]

Types of parallel corpora

Parallel corpora can be classified into four main categories:[citation needed]

  • A parallel corpus contains translations of the same document in two or more languages, aligned at least at the sentence level. These tend to be rarer than less-comparable corpora.[citation needed]
  • A noisy parallel corpus contains bilingual sentences that are not perfectly aligned or have poor quality translations. Nevertheless, most of its contents are bilingual translations of a specific document.
  • A comparable corpus is built from non-sentence-aligned and untranslated bilingual documents, but the documents are topic-aligned.
  • A quasi-comparable corpus includes very heterogeneous and non-parallel bilingual documents that may or may not be topic-aligned.

Noise in corpora

Large corpora used as training sets for machine translation algorithms are usually extracted from large bodies of similar sources, such as databases of news articles written in the first and second languages describing similar events.

However, extracted fragments may be noisy, with extra elements inserted in each corpus. Extraction techniques can differentiate between bilingual elements represented in both corpora and monolingual elements represented in only one corpus in order to extract cleaner parallel fragments of bilingual elements. Comparable corpora are used to directly obtain knowledge for translation purposes. High-quality parallel data is difficult to obtain, however, especially for under-resourced languages.[4]
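
One crude way to separate likely bilingual pairs from noise is to score each candidate pair against a small seed lexicon. The sketch below is an assumed heuristic for illustration only, not the filtering method of the cited work; the names `overlap_score` and `filter_pairs`, the lexicon format and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def overlap_score(src: str, tgt: str, lexicon: Dict[str, Set[str]]) -> float:
    """Fraction of known source words that have a listed translation in tgt."""
    src_words = src.lower().split()
    tgt_words = set(tgt.lower().split())
    known = [w for w in src_words if w in lexicon]
    if not known:
        return 0.0  # no evidence either way; treated as noise in this sketch
    hits = sum(1 for w in known if lexicon[w] & tgt_words)
    return hits / len(known)


def filter_pairs(
    pairs: Iterable[Tuple[str, str]],
    lexicon: Dict[str, Set[str]],
    threshold: float = 0.5,  # assumed cut-off, not an established value
) -> List[Tuple[str, str]]:
    """Keep only the candidate pairs whose score reaches the threshold."""
    return [(s, t) for s, t in pairs if overlap_score(s, t, lexicon) >= threshold]
```

Whitespace tokenization and exact word matching are deliberate simplifications; they will miss inflected forms and do not suit languages written without spaces.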

Bitext

In the field of translation studies, a bitext is a merged document composed of both the source-language and target-language versions of a given text.

Bitexts are generated by a piece of software called an alignment tool, or a bitext tool, which automatically aligns the original and translated versions of the same text. The tool generally matches these two texts sentence by sentence. A collection of bitexts is called a bitext database or a bilingual corpus, and can be consulted with a search tool.
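
A bitext and a search tool over it can be modelled very simply. The sketch below assumes the two texts are already aligned one-to-one at sentence level; `Segment`, `build_bitext` and `search` are hypothetical names chosen for illustration, not the interface of any real bitext tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    position: int  # index of the sentence pair in the original document
    source: str
    target: str


def build_bitext(source_sents: List[str], target_sents: List[str]) -> List[Segment]:
    """Pair up already-aligned sentences, keeping the original order."""
    if len(source_sents) != len(target_sents):
        raise ValueError("this sketch assumes a one-to-one sentence alignment")
    return [
        Segment(i, s, t)
        for i, (s, t) in enumerate(zip(source_sents, target_sents))
    ]


def search(bitext: List[Segment], query: str) -> List[Segment]:
    """Return every segment whose source or target contains the query."""
    q = query.casefold()
    return [
        seg for seg in bitext
        if q in seg.source.casefold() or q in seg.target.casefold()
    ]
```

Because each hit keeps its `position`, a translator consulting the result can go back to the neighbouring segments and read the match in context.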

Bitexts and translation memories

Bitexts have some similarities with translation memories. The most salient difference is that a translation memory loses the original context, while a bitext retains the original sentence order. That said, some implementations of translation memory, such as Translation Memory eXchange (TMX), a standard XML format for exchanging translation memories between computer-assisted translation (CAT) programs, allow preserving the original order of sentences.
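
As an illustration, sentence pairs can be written to a TMX-style file in document order with Python's standard library. This is a minimal sketch: the element and attribute names follow the TMX 1.4 layout as commonly documented and should be checked against the specification, the header values such as `example-script` are placeholders, and whether a given CAT program keeps the order of the translation units on import is up to that program.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

# ElementTree writes this qualified name as the attribute xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def to_tmx(
    pairs: Iterable[Tuple[str, str]],
    src_lang: str = "en",
    tgt_lang: str = "fr",
) -> str:
    """Serialize aligned sentence pairs as a TMX-style XML string."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",  # placeholder value
        "creationtoolversion": "0.1",      # placeholder value
        "segtype": "sentence",
        "o-tmf": "plain-text",             # placeholder value
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    # One translation unit per sentence pair, emitted in document order.
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

Reading the string back with `ET.fromstring` and iterating over the `tu` elements yields the pairs in the order in which they were written.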

Bitexts are designed to be consulted by a human translator, not by a machine. As such, small alignment errors or minor discrepancies that would cause a translation memory to fail are of no importance.

In the 1988 article that introduced the concept, Harris also posited that a bitext represents how translators hold their source and target texts together in their mental working memories as they progress. However, this hypothesis has not been followed up.[5]

Online bitexts and translation memories may also be called online bilingual concordances. Several are available on the public Web, including Linguee, Reverso, and TradooIT.[6][7][8]

References

  1. ^ Chan, Sin-Wai (2015). Routledge Encyclopedia of Translation Technology. London: Routledge. ISBN 978-1-315-74912-9.
  2. ^ Williams, Philip; Sennrich, Rico; Post, Matt; Koehn, Philipp (2016). Syntax-based Statistical Machine Translation. Morgan & Claypool. ISBN 978-1-62705-502-4.
  3. ^ Abdallah, A. (2021). Impact of using parallel text strategy on teaching reading to intermediate II level students. International Journal on Social and Education Sciences (IJonSES), 3(1), 95-108. https://doi.org/10.46328/ijonses.48
  4. ^ Wołk, Krzysztof (2015). "Noisy-Parallel and Comparable Corpora Filtering Methodology for the Extraction of Bi-Lingual Equivalent Data at Sentence Level". Computer Science. 16 (2): 169–184. arXiv:1510.04500. Bibcode:2015arXiv151004500W. doi:10.7494/csci.2015.16.2.169. S2CID 12860633.
  5. ^ Harris, B. (March 1988). "Bi-Text, A New Concept in Translation Theory" (PDF). Language Monthly. 54: 8–10. Archived from the original (PDF) on 2018-03-02.
  6. ^ Genette, Marie (2016). How Reliable Are Online Bilingual Concordancers? An investigation of Linguee, TradooIT, WeBiText and ReversoContext and Their Reliability Through a Contrastive Analysis of Complex Prepositions from French to English (M.A. thesis). Université catholique de Louvain & Universitetet i Oslo. hdl:10852/51577.
  7. ^ "TradooIT – Concordancier bilingue".
  8. ^ Désilets, Alain; Farley, Benoît; Stojanović, Marta; Patenaude, Geneviève (2008). WeBiText: Building Large Heterogeneous Translation Memories from Parallel Web Content. Proceedings of Translating and the Computer. Vol. 30. pp. 27–28. S2CID 14586900.
