With the Omnivore app shutting down, lots of people will have export files that need processing before they can be imported into other read-it-later and bookmarking apps. I chose Raindrop.io, but there are other alternatives.
(Originally posted as omnivore-app/omnivore#4461.)
This is for anyone trying to turn their Omnivore export into something more suitable for import into Raindrop.io and similar apps. Omnivore suggests a jq command to convert your files, but it doesn't work very well.
Once you've installed jq, here's a command which creates a nice CSV out of your metadata_*.json files from your Omnivore export, including extracting your tags and cleaning up any non-printable characters in your title and description fields:
jq -r '
  # Header row
  (["url","title","note","tags","created"]),
  # One row per saved item; // "" keeps gsub from failing on a null title or description
  (.[] | [
    .url,
    ((.title // "") | gsub("\\n";" ") | gsub("\\r";" ") | gsub("\"";"''") | gsub("[^[:print:]]";" ") | gsub("\\s+";" ")),
    ((.description // "") | gsub("\\n";" ") | gsub("\\r";" ") | gsub("\"";"''") | gsub("[^[:print:]]";" ") | gsub("\\s+";" ")),
    ([.labels[]?]|join(",")),
    .savedAt
  ]) | @csv
' metadata_*.json > omnivore-export.csv
Once you've unzipped your Omnivore export .zip file, change into the directory where the metadata_*.json files are, and then you should be able to paste this command into your shell and run it (works for me in zsh and should work in bash as well).
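One caveat when several `metadata_*.json` files are present: jq runs the filter once per input file, so the command above prints the header row once per file. Here's a minimal sketch of a variant that prints it only once by combining jq's `-n` flag with `inputs`. The `clean` helper, the `// ""` null fallback, the temp directory, and the two sample files are all made up for this demo, and the cleanup chain is shortened, so treat it as a starting point rather than a drop-in replacement.

```shell
# Demo setup: two tiny, invented metadata files in a throwaway directory.
demo=$(mktemp -d)
cat > "$demo/metadata_0_to_1.json" <<'EOF'
[{"url":"https://example.com/a","title":"First\n  title","description":null,"labels":["jq","csv"],"savedAt":"2024-11-01T10:00:00.000Z"}]
EOF
cat > "$demo/metadata_1_to_2.json" <<'EOF'
[{"url":"https://example.com/b","title":"Second","description":"Some \"quoted\" text","labels":[],"savedAt":"2024-11-02T11:00:00.000Z"}]
EOF

# -n stops jq from reading input implicitly, so the header is emitted once;
# inputs then yields the array from each file in turn.
# clean is a hypothetical helper: null becomes "", double quotes become two
# single quotes, non-printables become spaces, whitespace runs are collapsed.
jq -rn '
  def clean: (. // "")
    | gsub("\"";"\u0027\u0027")
    | gsub("[^[:print:]]";" ")
    | gsub("\\s+";" ");
  ["url","title","note","tags","created"],
  (inputs[] | [
    .url,
    (.title | clean),
    (.description | clean),
    ([.labels[]?] | join(",")),
    .savedAt
  ]) | @csv
' "$demo"/metadata_*.json > "$demo/omnivore-export.csv"

cat "$demo/omnivore-export.csv"
```

On a real export, skip the sample-file setup and run the `jq -rn` command from the directory that holds your `metadata_*.json` files.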
For your edification, learning and enjoyment, here's a detailed breakdown of what this jq command does:
- First, it outputs a header row with column names: `["url","title","note","tags","created"]`
- Then, for each object in the JSON array (`.[]`), it creates an array with these transformations:
  - `.url` - Grabs the URL
  - For the `.title` field, it applies these cleanups in sequence:
    - `gsub("\\n";" ")` - Replaces newlines with spaces
    - `gsub("\\r";" ")` - Replaces carriage returns with spaces
    - `gsub("\"";"''")` - Meant to replace double quotes with two single quotes. Because the whole program sits inside shell single quotes, the shell consumes the `''` pair, so jq sees an empty replacement and the double quotes are simply removed (use `"\u0027\u0027"` as the replacement if you want literal single quotes)
    - `gsub("[^[:print:]]";" ")` - Replaces any non-printable characters with spaces
    - `gsub("\\s+";" ")` - Collapses each run of whitespace into a single space
  - Applies the same cleanups to the `.description` field
  - `[.labels[]?]|join(",")` - Takes the labels array and joins it into a single comma-separated string
  - `.savedAt` - Grabs the timestamp
- Finally, `| @csv` formats each array as a proper CSV row, automatically:
  - Wrapping string fields in double quotes and doubling any double quotes inside them
  - Adding commas between fields
  - Producing one row per line (the `-r` flag prints each row as raw text)
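To see a few of these steps in isolation, here's a small sketch that runs them on a made-up title. It compares the `''` replacement as typed inside a single-quoted program with a `"\u0027\u0027"` replacement, and shows what `@csv` does with embedded double quotes on its own. The sample string and variable names are invented for illustration.

```shell
# Invented sample: a title with double quotes, a newline, and a double space.
sample='{"title":"He said \"hi\"\nand  left"}'

# As typed in the main command: the shell swallows the '' pair, so jq
# receives gsub("\"";"") and the double quotes are deleted.
removed=$(printf '%s' "$sample" | jq -r '.title | gsub("\"";"''") | gsub("\\s+";" ")')

# With \u0027 escapes jq really gets two single quotes as the replacement.
replaced=$(printf '%s' "$sample" | jq -r '.title | gsub("\"";"\u0027\u0027") | gsub("\\s+";" ")')

# @csv alone: the field is wrapped in double quotes and inner quotes are doubled.
csv=$(printf '%s' "$sample" | jq -r '[.title | gsub("\\s+";" ")] | @csv')

printf '%s\n' "$removed" "$replaced" "$csv"
```

Since `@csv` already escapes double quotes, swapping them out beforehand is a matter of what your importer tolerates rather than something CSV itself requires.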