Romanian - English news corpus (Processed)

The Romanian – English news corpus was created for the European Language Resources Coordination Action (ELRC) (http://lr-coordination.eu/) by Tufis Dan, Institutul de Cercetari pentru Inteligenta Artificiala "Mihai Draganescu", Academia Romana (www.racai.ro/), with primary data copyrighted by SouthEast European Times. It is licensed under "CC-BY 4.0" (https://creativecommons.org/licenses/by/4.0/).

This is a bilingual Romanian – English news corpus built from SouthEast European Times (2008 dump). The texts are positionally aligned, i.e. the sentence on line i in the English text is aligned with the sentence on line i in the Romanian text. The alignment was manually validated.
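
Because the alignment is purely positional, sentence pairs can be recovered by reading the two texts line by line in parallel. The following is a minimal sketch in Python, not part of the corpus distribution. It assumes the English and Romanian texts are stored as two separate UTF-8 plain-text files with one sentence per line; the function name read_aligned_pairs and any file names passed to it are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def read_aligned_pairs(en_path: Path, ro_path: Path) -> List[Tuple[str, str]]:
    """Return (english, romanian) sentence pairs from two line-aligned files.

    Assumption: both files are UTF-8 text with one sentence per line,
    so line i of one file corresponds to line i of the other.
    """
    with open(en_path, encoding="utf-8") as en_file:
        en_lines = [line.rstrip("\n") for line in en_file]
    with open(ro_path, encoding="utf-8") as ro_file:
        ro_lines = [line.rstrip("\n") for line in ro_file]

    # Positional alignment only makes sense if both sides have the same length.
    if len(en_lines) != len(ro_lines):
        raise ValueError(
            f"Line count mismatch: {len(en_lines)} English vs "
            f"{len(ro_lines)} Romanian lines"
        )

    return list(zip(en_lines, ro_lines))
```

Checking the line counts before pairing is a deliberate choice: a single missing or extra line would silently shift every later pair, so it seems safer to fail early than to return misaligned sentences.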