@thepian
Created January 29, 2012 16:08
Better unicode
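from __future__ import with_statement
import yaml  # PyYAML


def split_file(content, header):
    """Split raw bytes into (front matter dict, unicode body).

    Entries in `header` override keys parsed from the front matter.
    """
    if content[:3] == b"---":
        parts = content.split(b"---")
        # The front matter itself is always read as UTF-8.
        matter = yaml.safe_load(parts[1].decode("utf-8")) or {}
        matter.update(header)
        # An 'encoding' key in the front matter selects the codec for the body.
        encoding = matter.get("encoding", "utf-8")
        # Re-join so that any later "---" in the body is preserved.
        rest = b"---".join(parts[2:])
        return matter, rest.decode(encoding)
    # No front matter: return the header unchanged and decode the body as UTF-8.
    return header, content.decode("utf-8")


with open("test-unicode-matter.txt", "rb") as f:
    matter, content = split_file(f.read(), {})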
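
The splitting step can be tried without a file on disk. The following is only a minimal sketch, assuming Python 3 and the standard library; the helper name `split_raw` and the sample bytes are hypothetical and not part of the gist, and YAML parsing is left out so the example has no third-party dependency. It shows that splitting the raw bytes on `---` before decoding is safe for UTF-8 input, because the `-` byte never occurs inside a multi-byte UTF-8 sequence.

```python
# Hypothetical helper (not from the gist): split raw bytes on the "---"
# delimiters and decode both parts, leaving YAML parsing to the caller.
def split_raw(raw, encoding="utf-8"):
    if raw[:3] != b"---":
        # No front matter: everything is body text.
        return None, raw.decode(encoding)
    parts = raw.split(b"---")
    # The front matter is assumed to be UTF-8 in this sketch.
    front = parts[1].decode("utf-8")
    # Re-join the remainder so a later "---" in the body survives.
    body = b"---".join(parts[2:]).decode(encoding)
    return front, body


# Made-up sample with non-ASCII characters in both parts.
sample = "---\ntitle: Caf\u00e9\n---\nBody with \u00fc and a --- rule\n".encode("utf-8")
front, body = split_raw(sample)
```

Here `front` holds the undecoded-YAML text between the first two delimiters and `body` holds everything after it, including the `---` that appears inside the body.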