Centers for Disease Control and Prevention COVID-19 translations
Language Pair and Version | Capture Date | Source Words | Source Words (Deduplicated) | Sample File |
---|---|---|---|---|
English-Spanish v1 | 2020-06-24 | 538,842 | 248,780 | TMX Sample |
English-Vietnamese v1 | 2020-06-24 | 550,066 | 249,006 | TMX Sample |
English-Korean v1 | 2020-06-24 | 537,204 | 262,393 | TMX Sample |
English-Chinese v1 | 2020-06-24 | 508,297 | 254,876 | TMX Sample |
These translation memories are made available under the Open Data Commons Attribution License. Individual contents of the database are in the public domain.
Source: CDC. Reference to specific commercial products, manufacturers, companies, or trademarks does not constitute its endorsement or recommendation by the U.S. Government, Department of Health and Human Services, or Centers for Disease Control and Prevention. The material is available on the agency website, https://www.cdc.gov/, for no charge.
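The sample files listed in these tables are in TMX, an XML-based translation memory exchange format in which each `tu` (translation unit) holds one `tuv` element per language, and each `tuv` wraps its text in a `seg` element. The following is a minimal sketch of reading sentence pairs from such a file with the Python standard library. The function name `read_tmx_pairs`, the default language codes, and the inline sample data are assumptions for illustration; the actual language codes used inside the distributed files may differ.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_lang="en", tgt_lang="es"):
    """Yield (source_text, target_text) pairs from a TMX path or file object.

    Hypothetical helper: language codes are matched on their primary subtag,
    so "en-US" is treated as "en".
    """
    tree = ET.parse(source)
    for tu in tree.getroot().iter("tu"):
        segments = {}
        for tuv in tu.iter("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                segments[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if src_lang in segments and tgt_lang in segments:
            yield segments[src_lang], segments[tgt_lang]


# Inline sample data (invented for this sketch, not taken from the data sets).
sample = (
    b'<tmx version="1.4"><header/><body>'
    b'<tu><tuv xml:lang="en-US"><seg>Hello.</seg></tuv>'
    b'<tuv xml:lang="es"><seg>Hola.</seg></tuv></tu>'
    b'</body></tmx>'
)
pairs = list(read_tmx_pairs(io.BytesIO(sample)))
print(pairs)
```

To read a downloaded file instead, pass its path in place of the `io.BytesIO` object and set `tgt_lang` to the target language of the pair.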
World Bank Open Knowledge Repository translations
Language Pair and Version | Capture Date | Source Words (Deduplicated) | Attribution |
---|---|---|---|
English-German v1 | 2020-11-23 | 20,804 | Attribution Page |
English-Spanish v1 | 2020-12-20 | 1,116,942 | Attribution Page |
English-French v1 | 2020-12-27 | 1,200,938 | Attribution Page |
English-Italian v1 | 2020-12-16 | 10,941 | Attribution Page |
English(United States)-Portuguese(Brazil) v1 | 2020-12-11 | 335,783 | Attribution Page |
English-Romanian v1 | 2020-12-16 | 6,347 | Attribution Page |
English-Indonesian v1 | 2020-11-23 | 69,488 | Attribution Page |
English-Chinese v1 | 2020-12-13 | 93,922 | Attribution Page |
English-Turkish v1 | 2020-12-13 | 37,148 | Attribution Page |
English-Ukrainian v1 | 2020-12-13 | 3,616 | Attribution Page |
English-Vietnamese v1 | 2020-12-14 | 43,431 | Attribution Page |
Web-crawled translations of documents published in the World Bank Open Knowledge Repository. This data set was compiled by and is offered by Polyglot Technology LLC. Individual contents of the database are licensed CC BY 3.0 IGO or CC BY 4.0 by the World Bank.
Healthcare.gov translations
Language Pair and Version | Capture Date | Source Words | Source Words (Deduplicated) | Sample File |
---|---|---|---|---|
English-Spanish v1 | 2019-04-16 | 112,268 | | TMX Sample |
This data set was compiled by and is offered by Polyglot Technology LLC. Individual contents of the data set are in the public domain.
U.S. Department of State News Releases
Language Pair and Version | Capture Date | Source Words | Source Words (Deduplicated) | Sample File |
---|---|---|---|---|
English-Arabic v2 | 2020-09-03 | 1,915,865 | | |
English-Spanish v2 | 2020-09-03 | 1,045,304 | | |
English-French v2 | 2020-09-03 | 1,099,530 | | |
English-Hindi v2 | 2020-09-03 | 693,624 | | |
English-Russian v2 | 2020-09-03 | 1,345,858 | | |
English-Urdu v2 | 2020-09-03 | 775,675 | | |
English-Chinese v2 | 2020-09-03 | 86,071 | | |
English(United States)-Portuguese(Brazil) v2 | 2020-09-03 | 806,593 | | |
English-Vietnamese v2 | 2020-09-03 | 8,918 | | |
English-Indonesian v2 | 2020-09-03 | 28,011 | | |
English-Persian v2 | 2020-09-03 | 15,979 | | |
Web-crawled translations of press releases of the U.S. Department of State (February 2017 - October 2020). These data sets were compiled by and are offered by Polyglot Technology LLC. Individual contents of the data sets are in the public domain.
Source: U.S. Department of State
Custom Crawling
We also offer customized parallel data crawling for the languages and domains you need. Please contact us at info@polyglot.technology.