Last active
August 16, 2016 00:14
Convert Pocket export file for org-mode

# Gemfile
source "https://rubygems.org"

gem "nokogiri"
gem "open_uri_redirections"

GEM
  remote: https://rubygems.org/
  specs:
    mini_portile2 (2.0.0)
    nokogiri (1.6.7.2)
      mini_portile2 (~> 2.0.0.rc2)
    open_uri_redirections (0.2.1)

PLATFORMS
  ruby

DEPENDENCIES
  nokogiri
  open_uri_redirections

BUNDLED WITH
   1.11.2

# org-ril-import.rb
# Reads a Pocket HTML export on STDIN and prints an org-mode outline on STDOUT.
require 'open-uri'
require 'open_uri_redirections'
require 'nokogiri'

# Extract title, URL, timestamp and tags from one <a> element of the export.
def parse_entry(entry)
  title = entry.text
  link = entry['href']
  added = Time.at(entry['time_added'].to_i).strftime('%Y-%m-%d %a %H:%M')
  # A missing or empty tags attribute yields an empty list.
  tags = entry['tags'].to_s.split(',')
  [title, link, added, tags]
end

# Print one entry as a second-level org-mode heading.
def output_entry(entry)
  title, link, added, tags = parse_entry(entry)
  tags << '@READING'

  # An entry whose title equals its URL has no real title,
  # so try to load the title (and the final URL) from the page itself.
  if title == link
    STDERR.puts "Load title... #{link}"
    begin
      # URI.open (Ruby 2.5+) is needed on Ruby 3.0+, where Kernel#open
      # no longer opens URLs.
      html = URI.open(link, allow_redirections: :all)
      doc = Nokogiri::HTML(html)
      title = doc.xpath('/html/head/title').text.strip
      link = html.base_uri.to_s
    rescue SocketError
      # Host could not be reached; keep the URL as the title.
    end
  end

  puts <<ENTRY
** #{title} :#{tags.join(':')}:
#{link}
[#{added}]
ENTRY
end

# The export is expected to contain two <ul> lists: unread items, then read items.
html = STDIN.read
doc = Nokogiri::HTML.parse(html, nil, 'utf-8')
unread, read = doc.xpath('//ul')

puts '* Unread'
unread.xpath('li/a').each { |entry| output_entry(entry) }
puts '* Read'
read.xpath('li/a').each { |entry| output_entry(entry) }

Usage
% ruby org-ril-import.rb < ril-export.html
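
Each exported link ends up as a second-level org-mode heading with its tags, URL and an inactive timestamp. The following is a minimal sketch of just that formatting step, without Nokogiri or network access. The helper name `format_org_entry` and its keyword arguments are hypothetical and not part of the gist, and the timestamp is rendered in UTC here (the script itself uses local time) so the result does not depend on the machine's time zone.

```ruby
# Hypothetical helper (not part of the gist): formats one Pocket entry as an
# org-mode heading, mirroring the text that output_entry prints.
def format_org_entry(title:, link:, time_added:, tags: [])
  # Assumption: UTC instead of the script's local time, for reproducible output.
  added = Time.at(time_added.to_i).utc.strftime('%Y-%m-%d %a %H:%M')
  # Every entry gets the @READING tag appended, as in the script.
  all_tags = tags + ['@READING']
  "** #{title} :#{all_tags.join(':')}:\n#{link}\n[#{added}]\n"
end

# Made-up sample values shaped like the attributes of a Pocket export link.
puts format_org_entry(
  title: 'Example Domain',
  link: 'https://example.com/',
  time_added: '1471306440',
  tags: %w[ruby emacs]
)
```

The heading line carries the tags in org-mode's `:tag1:tag2:` form, so the imported entries can be filtered by tag from within Emacs.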