jSonarR - Using Gateway data from within R

jSonarR is an R package that pulls data through the Gateway into R so that you can analyze it within R and RStudio. You can invoke any API published through the Gateway, including queries and aggregation pipelines, and you can bind variable values for the query or pipeline to use. Results from the Gateway are returned as R data frames, so any package that works with data frames can then be used.

The steps to take when using jSonarR are:

  1. Build your query or pipeline using JSON Studio and test it within the Studio.
  2. Save and/or publish your API for use within the Gateway.
  3. Test your API using the Gateway and/or using the Gateway form.
  4. Load the jSonarR package.
  5. Create a connection object to the Gateway.
  6. Invoke the jSonarR functions to create a data frame within R.

Steps 1-3 are normal Gateway usage and do not differ when invoking them from R.

To load the jSonarR package:

install.packages("jSonarR")
library('jSonarR')

Now create a connection object using new.SonarConnection as follows:

con <- jSonarR::new.SonarConnection('https://<gateway host>:8443', '<database host>',
'companies', port=47017)

You can pass in all connection properties exposed by the Gateway such as username, password, whether to map credentials etc.

jSonarR exposes two functions you can call - jSonarR::sonarFind and jSonarR::sonarAgg - depending on whether you want to get data from a find query or an aggregation pipeline. Both have similar parameters and both create a data frame. For example, to call an aggregation pipeline by the name of ipo_funded_companies, call:

ipo <- jSonarR::sonarAgg(con, 'ipo_funded_companies', 'companies', idCol='ipo.stock_symbol')

This runs the pipeline through the Gateway and creates a new data frame from the resulting data.

The idCol parameter tells jSonarR which returned column to use as the row identifier of the data frame. If you do not specify it, the rows are named using an ordinal. For example, the data frame above looks like:

> ipo
                         X_id.permalink     raised rounds empnum founded_year founded_month founded_day
NYSE:TXTR                       textura   14250000      2     NA           NA            NA          NA
SODA                         sodastream    9300000      1     NA         1903            NA          NA
NASDAQ:PETX        aratana-therapeutics   76750000      4     NA           NA            NA          NA
NYSE:CRM                     salesforce   99916337      4   3500         1999            NA          NA
OMER                             omeros   63000000      1     NA           NA            NA          NA
NASDAQ:GAME               shanda-games   40000000      1     NA           NA            NA          NA
NASDAQ:TSRO                      tesaro  252000000      3     NA         2010             3          NA
NASDAQ:MERU               meru-networks  163600000      6     NA         2002            NA          NA
NASDAQ:JIVE               jive-software   69426794      5     50         2001             2           7
...

whereas the data frame produced using ipo2 <- jSonarR::sonarAgg(con, 'ipo_funded_companies', 'companies') looks like:

> ipo2
               X_id.permalink     raised rounds empnum founded_year founded_month founded_day
1                     textura   14250000      2     NA           NA            NA          NA
2                  sodastream    9300000      1     NA         1903            NA          NA
3        aratana-therapeutics   76750000      4     NA           NA            NA          NA
4                  salesforce   99916337      4   3500         1999            NA          NA
5                      omeros   63000000      1     NA           NA            NA          NA
6                shanda-games   40000000      1     NA           NA            NA          NA
7                      tesaro  252000000      3     NA         2010             3          NA
8               meru-networks  163600000      6     NA         2002            NA          NA
9               jive-software   69426794      5     50         2001             2           7
...
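
The effect of idCol on row names can be reproduced locally with base R. The following is a minimal sketch, not jSonarR's implementation: it builds a small stand-in data frame from three of the rows shown above and promotes one column to row names. The helper name use_id_column is hypothetical, and whether jSonarR keeps or drops the identifier column afterwards is not stated here; this sketch drops it.

```r
# Local stand-in for a few rows returned by the pipeline; no Gateway needed.
ipo_raw <- data.frame(
  ipo.stock_symbol = c("NYSE:TXTR", "SODA", "NYSE:CRM"),
  X_id.permalink   = c("textura", "sodastream", "salesforce"),
  raised           = c(14250000, 9300000, 99916337),
  rounds           = c(2, 1, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper that mimics what idCol appears to do:
# use one column as the row identifier instead of ordinal row names.
use_id_column <- function(df, id_col) {
  out <- df[, setdiff(names(df), id_col), drop = FALSE]
  rownames(out) <- df[[id_col]]
  out
}

ipo_named <- use_id_column(ipo_raw, "ipo.stock_symbol")

# Without an identifier column the rows are named "1", "2", "3".
print(rownames(ipo_raw))

# With one, a row can be looked up directly by its identifier.
print(ipo_named["NYSE:CRM", "raised"])
```

Named rows are mostly a convenience for lookups like the last line; the ordinal form is often easier when the identifier column contains duplicates, because R requires row names to be unique.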

Bind variables can be used when making an API call. For example, if the query has a bind variable named $$number_of_employees you can bind a value when making the call using:

large <- jSonarR::sonarFind(con, 'find1', 'companies', bind=list(number_of_employees=6000), idCol='_id')
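
When the same saved query has to be run with several bind values, the bind lists can be built programmatically. The sketch below is only an illustration under assumptions: it reuses the 'find1' query and the con object from the text, the threshold values and the helper name run_find are made up, and the Gateway call is skipped unless a connection object named con already exists in the session.

```r
# Illustrative bind values; these numbers are assumptions, not from a real query.
thresholds <- c(1000, 6000, 20000)

# One named list per call, matching the bind variable $$number_of_employees.
bind_sets <- lapply(thresholds, function(n) list(number_of_employees = n))
names(bind_sets) <- paste0("min_", thresholds)

# Hypothetical wrapper around the call shown in the text.
run_find <- function(con, bind) {
  jSonarR::sonarFind(con, 'find1', 'companies', bind = bind, idCol = '_id')
}

# Only contact the Gateway when a connection object has been created.
if (exists("con", inherits = FALSE)) {
  results <- lapply(bind_sets, function(b) run_find(con, b))
}

print(names(bind_sets))
```

If the calls run, results is a named list holding one data frame per bind value.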

Finally, if you need explicit type conversions (e.g. creating dates or numerics), use the colClasses parameter. In the following example, the call makes sure that the values for X_avg_AirTime, X_avg_ArrDelay and X_avg_DepDelay are numerics and not factors:

nyc_by_day <- jSonarR::sonarAgg(con2, 'delays_by_day', 'NYCFlights',
        colClasses=c(X_avg_AirTime='numeric', X_avg_ArrDelay='numeric',X_avg_DepDelay='numeric'))
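
Why the conversion matters can be seen without a Gateway. This base R sketch uses made-up values and is not jSonarR's own code; it shows that calling as.numeric directly on a factor returns the internal level codes rather than the numbers, which is the kind of mistake that requesting 'numeric' up front helps avoid.

```r
# Made-up averages stored as a factor, as can happen when values arrive as text.
avg_air_time <- factor(c("152.3", "148.9", "160.1"))

# as.numeric() on a factor yields the level codes (here 2, 1, 3), not the values.
codes <- as.numeric(avg_air_time)

# Converting through character recovers the actual numbers.
values <- as.numeric(as.character(avg_air_time))

print(codes)
print(values)
print(mean(values))
```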

For more information on the jSonarR package and to download the PDF manual see http://cran.r-project.org/web/packages/jSonarR/index.html


Copyright © 2013-2016 jSonar, Inc
MongoDB is a registered trademark of MongoDB Inc. Excel is a trademark of Microsoft Inc. JSON Studio is a registered trademark of jSonar Inc. All trademarks and service marks are the property of their respective owners.