hrenWaC 2.0 Croatian-English Parallel Corpus

hrenWaC 2.0 Croatian-English Parallel Corpus by Nikola Ljubešić, available for use by DGT for eTranslation development with permission from the corpus author.

The hrenWaC 2.0 Croatian-English Parallel Corpus contains general-domain documents totaling 1,554,912 sentence pairs. The texts were crawled from the .hr top-level domain for Croatia. The corpus was built with Spidextor (https://github.com/abumatran/spidextor) with the accuracy of ...