Article

GATECloud.net: a platform for large-scale, open-source text processing on the cloud

Publisher

Royal Society
DOI: 10.1098/rsta.2012.0071

Keywords

text mining; cloud computing; big data

Funding

  1. EPSRC/JISC grant [EP/I034092/1]
  2. Career Acceleration Fellowship from the Engineering and Physical Sciences Research Council [EP/I004327/1]
  3. Engineering and Physical Sciences Research Council [EP/I034092/1] Funding Source: researchfish
  4. EPSRC [EP/I034092/1, EP/I004327/1] Funding Source: UKRI

Cloud computing is increasingly being regarded as a key enabler of the 'democratization of science', because on-demand, highly scalable cloud computing facilities enable researchers anywhere to carry out data-intensive experiments. In the context of natural language processing (NLP), algorithms tend to be complex, which makes their parallelization and deployment on cloud platforms a non-trivial task. This study presents a new, unique, cloud-based platform for large-scale NLP research: GATECloud.net. It enables researchers to carry out data-intensive NLP experiments by harnessing the vast, on-demand compute power of the Amazon cloud. Important infrastructural issues are dealt with by the platform, completely transparently for the researcher: load balancing, efficient data upload and storage, deployment on the virtual machines, security and fault tolerance. We also include a cost-benefit analysis and usage evaluation.
