To begin, make sure civis
is installed and your API key is in your R environment. You can quickly test that civis
is working by invoking
If civis
is working, you’ll see a friendly message. Otherwise, you might see an error like this when civis
wasn’t installed properly:
Error in loadNamespace(name) : there is no package called 'civis'
or like this if you haven’t set your API key correctly:
Error in api_key() : The environmental variable CIVIS_API_KEY is not set. Add this to your .Renviron or call Sys.setenv(CIVIS_API_KEY = '<api_key>')
With civis
, moving data to and from the cloud takes only a few lines of code. Your data can be stored as rows in a table, CSVs on remote file system or even as serialized R objects like nested lists. For example,
library(civis)
# First we'll load a dataframe of the famous iris dataset
data(iris)
# We'll set a default database and define the table where want to
# store our data
options(civis.default_db = "my_database")
iris_tablename <- "my_schema.my_table"
# Next we'll push the data to the database table
write_civis(iris, iris_tablename)
# Great, now let's read it back
df <- read_civis(iris_tablename)
# Hmmm, I'm more partial to setosa myself. Let's write a custom sql query.
# We'll need to wrap our query string in `sql` to let read_civis know we
# are passing in a sql command rather than a tablename.
query_str <- paste("SELECT * FROM", iris_tablename, "WHERE Species = 'setosa'")
iris_setosa <- read_civis(sql(query_str))
# Now let's store this data along with a note as a serialized R object
# on a remote file system. We could store any object remotely this way.
data <- list(data = iris_setosa, special_note = "The best iris species")
file_id <- write_civis_file(data)
# Finally, let's read back our data from the remote file system.
data2 <- read_civis(file_id)
data2[["special_note"]]
## [1] "The best iris species"
civis
also includes functionality for working with CivisML, Civis’ machine learning ecosystem. With the combined power of CivisML and civis
, you can build models in the cloud where the models can use as much memory as they need and there’s no chance of your laptop crashing.
library(civis)
# It really is a great dataset
data(iris)
# Gradient boosting or random forest, who will win?
gb_model <- civis_ml_gradient_boosting_classifier(iris, "Species")
rf_model <- civis_ml_random_forest_classifier(iris, "Species")
macroavgs <- list(gb_model = gb_model$metrics$metrics$roc_auc_macroavg,
rf_model = rf_model$metrics$metrics$roc_auc_macroavg)
macroavgs
## $gb_model
## [1] 0.9945333
##
## $rf_model
## [1] 0.9954667
For a comprehensive list of functions in civis
, see Reference in the full documentation. The full documentation also includes a set of Articles
for detailed documentation on common workflows, including data manipulation and building models in parallel.