Created March 28, 2020 02:26
Python: extract Chinese characters and dates
#!/usr/bin/python3
import re

# Sample input: a Chinese description of the US stock market followed by dated headlines.
str1 = """美国股市(American stock market),简称美股,泛指美国纽约的证券交易所的股票市场,主要指标包括道琼斯工业平均指数、标准普尔500指数及纳斯达克100指数。狭义来说,美股主要是指道琼斯工业平均指数的表现,例如“美股升若干点”的意思等同于“道指升若干点”。
2020年3月16日:美股开盘熔断,已经是本月第三次,美联储降息已经无力救市了吗?可能会带来哪些连锁反应?
2020年3月24日:美股开盘即大涨,是回光返照还是药到病除?
2020年3月25日:如何看待美股的暴涨?
"""
print(str1)

# Keep only CJK ideographs (U+4E00 to U+9FA5), newlines, digits, and the listed punctuation.
res1 = ''.join(re.findall(r'[\u4e00-\u9fa5\n\d,。?:]', str1))


def format_date(matched):
    """Rewrite a matched 'YYYY年M月D日' date as 'YYYY-M-D'."""
    return matched.group('year') + '-' + matched.group('month') + '-' + matched.group('day')


# Replace every Chinese-style date with its hyphenated form.
res2 = re.sub(r'(?P<year>\d+)年(?P<month>\d+)月(?P<day>\d+)日', format_date, res1)
print("Result:\n", res2)
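
The two steps in the script (filter characters, then rewrite dates) could also be wrapped in reusable functions. The following is only a sketch under stated assumptions: the names `KEEP_PATTERN`, `DATE_PATTERN`, `keep_chinese`, and `to_iso_dates` are hypothetical, the punctuation marks are left out of the character class for brevity, and the zero-padding of month and day is an addition that the original script does not perform.

```python
import re

# Hypothetical names, not part of the original script.
# Keeps CJK ideographs (U+4E00 to U+9FA5), newlines, and digits only.
KEEP_PATTERN = re.compile(r'[\u4e00-\u9fa5\n\d]')
DATE_PATTERN = re.compile(r'(?P<year>\d{4})年(?P<month>\d{1,2})月(?P<day>\d{1,2})日')


def keep_chinese(text):
    """Drop every character that KEEP_PATTERN does not match."""
    return ''.join(KEEP_PATTERN.findall(text))


def to_iso_dates(text):
    """Rewrite 'YYYY年M月D日' as 'YYYY-MM-DD' (zero-padding is an assumption)."""
    def _format(m):
        return '{}-{:02d}-{:02d}'.format(
            m.group('year'), int(m.group('month')), int(m.group('day')))
    return DATE_PATTERN.sub(_format, text)


sample = "2020年3月16日 美股开盘熔断 (circuit breaker)"
cleaned = to_iso_dates(keep_chinese(sample))
print(cleaned)
```

Compiling the patterns once avoids re-parsing them on every call, and splitting the work into two functions makes it possible to apply the date rewrite without the character filter, or the other way around.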