@cnuernber
Created July 27, 2020 22:38
Pure parse time of a 94MB file (ignoring the time to mmap the file)

tech.libs.arrow.message>   (mapv parse-message messages)
[{:fields
  [{:name "symbol",
    :nullable? true,
    :field-type {:datatype :string, :encoding :utf-8},
    :metadata
    {":name" "\"symbol\"", ":size" "5600000", ":datatype" ":string", ":categorical?" "true"},
    :dictionary-encoding
    {:id -887523944, :ordered? false, :index-type {:datatype :int8}}}
   {:name "date",
    :nullable? false,
    :field-type {:datatype :epoch-milliseconds, :timezone "UTC"},
    :metadata
    {":name" "\"date\"", ":timezone" "\"UTC\"", ":source-datatype" ":packed-local-date", ":size" "5600000", ":datatype" ":epoch-milliseconds"}}
   {:name "price",
    :nullable? false,
    :field-type {:datatype :float64},
    :metadata {":name" "\"price\"", ":size" "5600000", ":datatype" ":float64"}}],
  :metadata {}}
 {:id -887523944,
  :isDelta false,
  :records
  {:nodes [{:n-elems 6, :n-null-entries 0}],
   :buffers
   [{:address 139981409698888, :n-elems 1, :datatype :int8}
    {:address 139981409698896, :n-elems 28, :datatype :int8}
    {:address 139981409698928, :n-elems 19, :datatype :int8}]}}
 {:nodes
  [{:n-elems 5600000, :n-null-entries 0}
   {:n-elems 5600000, :n-null-entries 0}
   {:n-elems 5600000, :n-null-entries 0}],
  :buffers
  [{:address 139981409699192, :n-elems 700000, :datatype :int8}
   {:address 139981410399192, :n-elems 5600000, :datatype :int8}
   {:address 139981415999192, :n-elems 700000, :datatype :int8}
   {:address 139981416699192, :n-elems 44800000, :datatype :int8}
   {:address 139981461499192, :n-elems 700000, :datatype :int8}
   {:address 139981462199192, :n-elems 44800000, :datatype :int8}]}]
tech.libs.arrow.message> (require '[criterium.core :as crit])
nil
tech.libs.arrow.message> (crit/quick-bench (mapv parse-message (message-seq fdata)))
Evaluation count : 34188 in 6 samples of 5698 calls.
             Execution time mean : 18.144137 µs
    Execution time std-deviation : 366.879307 ns
   Execution time lower quantile : 17.795106 µs ( 2.5%)
   Execution time upper quantile : 18.632347 µs (97.5%)
                   Overhead used : 2.640179 ns
nil
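
As a rough cross-check of the buffer listing above, the `:n-elems` values (reported as `:int8`, so they are byte counts) can be derived from the row count alone. The sketch below assumes the standard Arrow layout: one validity bit per row, plus fixed-width data of 1 byte per row for the int8 dictionary indices and 8 bytes per row for the epoch-milliseconds and float64 columns. The helper names (`validity-bytes`, `fixed-width-bytes`, `expected-lengths`) are hypothetical and are not part of `tech.libs.arrow.message`.

```clojure
;; Sketch only: expected (unpadded) Arrow buffer byte lengths for the
;; 5,600,000-row record batch shown in the transcript.

(defn validity-bytes
  "Bytes needed for a validity bitmap with one bit per row, unpadded."
  [n-rows]
  (quot (+ n-rows 7) 8))

(defn fixed-width-bytes
  "Bytes needed for n-rows fixed-width values of bytes-per-elem each."
  [n-rows bytes-per-elem]
  (* n-rows bytes-per-elem))

(def n-rows 5600000)

(def expected-lengths
  [(validity-bytes n-rows) (fixed-width-bytes n-rows 1)   ; symbol: int8 dictionary indices
   (validity-bytes n-rows) (fixed-width-bytes n-rows 8)   ; date: 64-bit epoch milliseconds
   (validity-bytes n-rows) (fixed-width-bytes n-rows 8)]) ; price: float64

expected-lengths
;; => [700000 5600000 700000 44800000 700000 44800000]

(reduce + expected-lengths)
;; => 97300000
```

These match the six buffers in the record batch, and the total is roughly 97.3 million bytes (about 93 MiB), in line with the file size quoted at the top. It also suggests why the benchmark lands in the microsecond range: the parsed messages appear to hold only addresses and lengths into the mapped file rather than copies of the column data.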