@ChengzhiZhao
Created August 4, 2020 17:39
DataFusion Select Statement 1.0.0
use arrow::util::pretty;
use std::time::Instant;

use datafusion::datasource::csv::CsvReadOptions;
use datafusion::error::Result;
use datafusion::execution::context::ExecutionContext;

/// This example demonstrates executing a simple query against an Arrow data source (CSV) and
/// fetching results
fn main() -> Result<()> {
    // start the timer; the measurement covers setup, planning, and execution
    let start = Instant::now();

    // create local execution context
    let mut ctx = ExecutionContext::new();

    // register csv file with the execution context;
    // propagate the error instead of silently dropping the returned Result
    ctx.register_csv(
        "ratings",
        "/datafusion_test/ratings.csv",
        CsvReadOptions::new(),
    )?;

    let sql = "SELECT userId, AVG(rating) FROM ratings GROUP BY userId";

    // create the query plan
    let plan = ctx.create_logical_plan(sql)?;
    let plan = ctx.optimize(&plan)?;
    // second argument is the batch size
    let plan = ctx.create_physical_plan(&plan, 1024 * 1024)?;

    // execute the query
    let results = ctx.collect(plan.as_ref())?;

    let duration = start.elapsed();
    println!("Time elapsed in SQL() is: {:?}", duration);

    // print the results
    pretty::print_batches(&results)?;

    Ok(())
}
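
The single `Instant` above lumps context creation, CSV registration, planning, and execution into one number. As a rough sketch of how the phases could be timed separately, a small wrapper around `std::time::Instant` might look like the following. The names `timed` and `sum_to` are hypothetical and not part of DataFusion; `sum_to` is only a stand-in workload so the snippet compiles without the CSV file.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of DataFusion): runs `f` once and returns
/// its result together with the wall-clock time the call took.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let value = f();
    (value, start.elapsed())
}

/// Stand-in workload so the sketch runs without DataFusion or a CSV file.
fn sum_to(n: u64) -> u64 {
    (1..=n).sum()
}
```

In the gist, each step could presumably be wrapped the same way, for example `let (plan, planning_time) = timed(|| ctx.create_logical_plan(sql));`, with the `?` applied to `plan` afterwards.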