Monday, February 25, 2013

My First Crawler

import urllib2
from pyquery import PyQuery as pq

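# Page that lists every jQuery selector with a short description.
url = "http://api.jquery.com/category/selectors/"
result = urllib2.urlopen(url).read().decode("utf8")
q = pq(result)

# Each selector title is a link marked rel="bookmark".
result_of_bookmarks = q('a[rel="bookmark"]')
bookmarks = result_of_bookmarks.map(lambda i, e: pq(e).text())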
# Uncomment to print each selector title:
# for item in bookmarks:
#     print item.encode('utf8')

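# Each description sits in a div whose class attribute is "entry-summary".
result_of_summaries = q('div[class="entry-summary"]')
summaries = result_of_summaries.map(lambda i, e: pq(e).text())
# Uncomment to print each summary:
# for item in summaries:
#     print item.encode('utf8')

# Map each selector title to its summary (pairs are matched by position).
Query = dict(zip(bookmarks, summaries))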
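
The same title-to-summary pairing can be tried offline. This is only a rough sketch with assumptions: it runs on Python 3, it swaps pyquery for the standard library's html.parser, and SAMPLE_HTML, EntryParser and extract_entries are made-up names with invented markup that imitates the two selectors used above. It is not what the live page returns.

```python
from html.parser import HTMLParser

# Hypothetical markup imitating the structure targeted above (not the real page).
SAMPLE_HTML = """
<article>
  <h1><a rel="bookmark" href="/all-selector/">All Selector</a></h1>
  <div class="entry-summary"><p>Selects all elements.</p></div>
</article>
<article>
  <h1><a rel="bookmark" href="/id-selector/">ID Selector</a></h1>
  <div class="entry-summary"><p>Selects a single element with the given id attribute.</p></div>
</article>
"""

# Tags without a closing tag; skipped so the depth counter stays balanced.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class EntryParser(HTMLParser):
    """Collect the text of a[rel="bookmark"] and div[class="entry-summary"]."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []
        self.summaries = []
        self._target = None  # list currently receiving text, or None
        self._depth = 0      # nesting depth inside the captured element
        self._chunks = []    # text pieces seen inside the captured element

    def _start(self, target):
        self._target, self._depth, self._chunks = target, 1, []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self._target is not None:
            self._depth += 1
        elif tag == "a" and attrs.get("rel") == "bookmark":
            self._start(self.bookmarks)
        elif tag == "div" and attrs.get("class") == "entry-summary":
            self._start(self.summaries)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._target is None:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace, then store the element's text.
            self._target.append(" ".join("".join(self._chunks).split()))
            self._target = None

    def handle_data(self, data):
        if self._target is not None:
            self._chunks.append(data)


def extract_entries(html):
    """Return a dict mapping each bookmark title to its summary, by position."""
    parser = EntryParser()
    parser.feed(html)
    parser.close()
    return dict(zip(parser.bookmarks, parser.summaries))


entries = extract_entries(SAMPLE_HTML)
for title, summary in entries.items():
    print(title, "->", summary)
```

Pairing by position only works while every title has exactly one summary. If the two lists differ in length, zip silently drops the extras.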

Sunday, February 24, 2013

Ubuntu Sublime Python Env


1) Change Environment Path
Find the file named "Python.sublime-build". By default it contains:
{
    "cmd": ["python", "-u", "$file"],
    "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
    "selector": "source.python"
}
Change "python" in "cmd" to the full path of your own Python interpreter, for example:
{
    "cmd": ["/home/hiddenghost/pyenv/bin/python", "-u", "$file"],
    "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
    "selector": "source.python"
}
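
To avoid typing the interpreter path by hand, a small script can print the build settings for whichever Python runs it. This is a sketch, assuming Python 3; make_build_config is a made-up helper name, and the traceback line used to check file_regex is an invented example.

```python
import json
import re
import sys


def make_build_config(python_path):
    """Return build settings that run the current file with python_path."""
    return {
        "cmd": [python_path, "-u", "$file"],
        # Same pattern as in the build file, written without JSON escaping.
        "file_regex": r'^[ ]*File "(...*?)", line ([0-9]*)',
        "selector": "source.python",
    }


# sys.executable is the interpreter running this script, so run the script
# with the interpreter you want the editor to use.
config = make_build_config(sys.executable)
print(json.dumps(config, indent=4))

# file_regex picks the file name and line number out of a traceback line.
sample_line = '  File "/tmp/example.py", line 3, in <module>'  # invented example
match = re.match(config["file_regex"], sample_line)
print(match.groups())
```

Copy the printed JSON into the build file. json.dumps takes care of escaping the quotes inside file_regex.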

2) Python Project Configuration Example
Save this in a file named after your project, "*.sublime-project":
{
    "folders":
    [
        {
            "path": "/home/hiddenghost/Documents/hiddenghost"
        }
    ],
    "build_systems":
    [
        {
            "name": "Run Tests",
            "env":
            {
                "PYTHONPATH": "/home/hiddenghost/pyenv/lib/python2.7/site-packages"
            },
            "working_dir": "/home/hiddenghost/Documents/hiddenghost",
            "cmd": ["/home/hiddenghost/pyenv/bin/python", "$file"],
            "selector": "source.python"
        }
    ]
}
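
To check which interpreter and PYTHONPATH a build really uses, run a small script through the build system. This is a sketch, assuming Python 3; split_pythonpath and report are made-up names. PYTHONPATH entries are meant to be directories (or zip files), so anything else is flagged.

```python
import os
import sys


def split_pythonpath(value):
    """Split a PYTHONPATH-style string into its non-empty entries."""
    return [entry for entry in value.split(os.pathsep) if entry]


def report(env=os.environ):
    """Describe the running interpreter and each PYTHONPATH entry."""
    lines = ["interpreter: " + sys.executable]
    for entry in split_pythonpath(env.get("PYTHONPATH", "")):
        status = "directory" if os.path.isdir(entry) else "not a directory"
        lines.append("PYTHONPATH entry: %s (%s)" % (entry, status))
    return lines


print("\n".join(report()))
```

If the interpreter line does not show the path given in "cmd", the editor is probably still using a different build system.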