Package translater. R topics documented: February 20, 2015. Type Package



Similar documents
Package retrosheet. April 13, 2015

Package pdfetch. R topics documented: July 19, 2015

Package TSfame. February 15, 2013

Package sjdbc. R topics documented: February 20, 2015

Package uptimerobot. October 22, 2015

Package R4CDISC. September 5, 2015

Package searchconsoler

Package sendmailr. February 20, 2015

Package OECD. R topics documented: January 17, Type Package Title Search and Extract Data from the OECD Version 0.2.

Package empiricalfdr.deseq2

Package cgdsr. August 27, 2015

Package acrm. R topics documented: February 19, 2015

Package RCassandra. R topics documented: February 19, Version Title R/Cassandra interface

Package pinnacle.api

Package tagcloud. R topics documented: July 3, 2015

Package EasyHTMLReport

Package fimport. February 19, 2015

Package whoapi. R topics documented: June 26, Type Package Title A 'Whoapi' API Client Version Date Author Oliver Keyes

Package erp.easy. September 26, 2015

Package MBA. February 19, Index 7. Canopy LIDAR data

Package plan. R topics documented: February 20, 2015

Package ChannelAttribution

Package urltools. October 11, 2015

Package hoarder. June 30, 2015

Package syuzhet. February 22, 2015

Package AzureML. August 15, 2015

Package dunn.test. January 6, 2016

Package RIGHT. March 30, 2015

Package httprequest. R topics documented: February 20, 2015

Package benford.analysis

Package xtal. December 29, 2015

Package GEOquery. August 18, 2015

Package pxr. February 20, 2015

The Enron Corpus: A New Dataset for Classification Research

Package missforest. February 20, 2015

Package MixGHD. June 26, 2015

Package bizdays. March 13, 2015

Package dsstatsclient

Package hive. January 10, 2011

Package treemap. February 15, 2013

Package HadoopStreaming

Package packrat. R topics documented: March 28, Type Package

Package WikipediR. January 13, 2016

Package GSA. R topics documented: February 19, 2015

Making SAP Information Steward a Key Part of Your Data Governance Strategy

Package survpresmooth

Package hier.part. February 20, Index 11. Goodness of Fit Measures for a Regression Hierarchy

Package neuralnet. February 20, 2015

Package hive. July 3, 2015

Package dsmodellingclient

Package bigrf. February 19, 2015

Package MDM. February 19, 2015

Package TRADER. February 10, 2016

Package PRISMA. February 15, 2013

Package brewdata. R topics documented: February 19, Type Package

Package optirum. December 31, 2015

Package hazus. February 20, 2015

Package LexisPlotR. R topics documented: April 4, Type Package

Package mpa. R topics documented: February 15, Type Package. Title CoWords Method. Version Date

Package RedditExtractoR

Package franc. R topics documented: November 12, Title Detect the Language of Text Version 1.1.1

Package HHG. July 14, 2015

Package klausur. February 20, 2015

Qlik REST Connector Installation and User Guide

Talend Component tgoogleanalyticsmanagement

Package PACVB. R topics documented: July 10, Type Package

Oracle Marketing Encyclopedia System

Package DSsim. September 25, 2015

Package DCG. R topics documented: June 8, Type Package

Package Rdsm. February 19, 2015

Package wordnet. January 6, 2016

Querying Combined Cloud-Based and Relational Databases

Package CRM. R topics documented: February 19, 2015

API Endpoint Methods NAME DESCRIPTION GET /api/v4/analytics/dashboard/ POST /api/v4/analytics/dashboard/ GET /api/v4/analytics/dashboard/{id}/ PUT

Package CIFsmry. July 10, Index 6

Cloud Computing for Education Workshop

Package bigdata. R topics documented: February 19, 2015

Package ATE. R topics documented: February 19, Type Package Title Inference for Average Treatment Effects using Covariate. balancing.

Package globe. August 29, 2016

There are various ways to find data using the Hennepin County GIS Open Data site:

Simba Apache Cassandra ODBC Driver

Package sudoku. February 20, 2015

Package instar. November 10, 2015

Package decompr. August 17, 2016

Stored Documents and the FileCabinet

Package png. February 20, 2015

RTD Documentation. =RTD( progid, server, [Field1], [Field2],...)

Developing Microsoft SharePoint Server 2013 Advanced Solutions

Homework 4 Statistics W4240: Data Mining Columbia University Due Tuesday, October 29 in Class

Transcription:

Type Package Package translater February 20, 2015 Title Bindings for the Google and Microsoft Translation APIs Version 1.0 Author Christopher Lucas and Dustin Tingley Maintainer Christopher Lucas <clucas@fas.harvard.edu> translater provides easy access to the Google and Microsoft APIs. The package is easy to use with the related R package ``stm'' for the estimation of multilingual topic models. Imports RJSONIO,RCurl,textcat,parallel,httr License GPL-3 NeedsCompilation no Repository CRAN Date/Publication 2014-07-09 08:21:30 R topics documented: enron............................................ 1 getgooglelanguages.................................... 3 getmicrosoftlanguages................................... 3 translate........................................... 4 Index 7 enron Small subset of Enron email corpus This data set was constructed from a very small subset of the Enron email corpus (Klimt & Yang, 2004). A large set of email messages was made public during the legal investigation concerning the Enron corporation. The full corpus contained 619,446 emails from 158 users. This data set contains only ten emails and includes the body of the email, the email s subject line, and the date. 1

2 enron data(enron) Format Source A data frame with 10 observations on the following 3 variables. email A character vector of the email s body. date The email s timestamp as a Date type. subject A character vector containing the email s subject line. Klimt, Bryan, and Yiming Yang. "The enron corpus: A new dataset for email classification research." In Machine learning: ECML 2004, pp. 217-226. Springer Berlin Heidelberg, 2004. ## Not run: # Load example data. Three columns, the text # content ( email ) and two metadata # fields (date and subject) data(enron) # Google, translate column in dataset # Google, translate vector # Microsoft, translate column in dataset # Microsoft, translate vector

getgooglelanguages 3 ## End(Not run) getgooglelanguages Print Google Language Codes This function prints the valid language Google language codes for use with the translate() function. getgooglelanguages() # print valid language codes getgooglelanguages() getmicrosoftlanguages Print Microsoft Language Codes This function prints the valid language Microsoft language codes for use with the translate() function. getmicrosoftlanguages() # print valid language codes getmicrosoftlanguages()

4 translate translate Translate text with the Google or the Microsoft translation APIs. This function provides easy access to the Google and Microsoft Translation APIs via R. It can translate any language supported by the APIs (to see a list of the available languages, see the getgooglelanguages() and getmicrosoftlanguages() functions). Text can be provided as either a column in a dataframe or as a single vector of text, where the elements are the documents to be translated. Translated text is returned in the format it was provided. If text is provided as a single vector, translate() returns a single vector of translated text. If a dataframe is provided, the user must specify which column contains the text that is to be translated. Translated text is then bound to the dataframe in a new column named "translatedcontent" and the entire dataframe is returned. The user must provide either a dataset and the content.field (column name) that contains the text to be translated, or a contect.vec (a character vector) where the elements are the text to be translated. translate(dataset = NULL, content.field = NULL, content.vec = NULL, google.api.key = NULL, microsoft.client.id = NULL, NULL, source.lang = NULL, target.lang = NULL) Arguments dataset content.field content.vec A dataframe with a column containing the text to be translated. If a dataframe is passed to "dataset", the name of the column containing the text must be passed to "content.field". A character vector of text. This is an alternative to "dataset"/"content.field". google.api.key To use the Google API, an API key must be provided. For more information on getting your key, see https://developers.google.com/translate/v2/getting_started. microsoft.client.id To use the Microsoft API, a client id and a client secret value must be provided. For more information on getting these, see http://msdn.microsoft.com/enus/library/hh454950.aspx. NOTE: you do not need to obtain an access token. translater will retrieve a token internally. microsoft.client.secret To use the Microsoft API, a client id and a client secret value must be provided. For more information on getting these, see http://msdn.microsoft.com/enus/library/hh454950.aspx. The client secret value is a unique identifying string obtained when registering with Microsoft (see the link for more information). NOTE: you do not need to obtain an access token. translater will retrieve a token internally.

translate 5 source.lang target.lang The language code that corresponds with the language in which the source text is written. The translators use different language codes, so select accordingly. To see a list of language codes, enter getgooglelanguages() or getmicrosoft- Languages() for Google or Microsoft, respectively. The language code that corresponds with the language into which the source text is to be translated. The translators use different language codes, so select accordingly. To see a list of language codes, enter getgooglelanguages() or getmicrosoftlanguages() for Google or Microsoft, respectively. Value translate returns a dataframe if it is passed a dataframe and a character vector if it is passed a character vector. ## Not run: # Load example data. Three columns, the text # content ( email ) and two metadata # fields (date and subject) data(enron) # Google, translate column in dataset # Google, translate vector # Microsoft, translate column in dataset # Microsoft, translate vector

6 translate ## End(Not run)

Index Topic datasets enron, 1 enron, 1 getgooglelanguages, 3 getmicrosoftlanguages, 3 translate, 4 7