python - How to count the occurrences of keywords in code but ignore the ones in comment / docstring?
I am quite new to Python. I want to find the occurrences of Python keywords ['def', 'in', 'if', ...] in the code below. However, keywords found inside string constants in the code need to be ignored. How can I count keyword occurrences without counting the ones in strings?
    def grade(result):
        '''
        if if (<--- example test if word "if" ignored in counts)
        :param result: none
        :return:none
        '''

        if result >= 80:
            grade = "hd"
        elif 70 <= result:
            grade = "di"
        elif 60 <= result:
            grade = "cr"
        elif 50 <= result:
            grade = "pa"
        else:  #else (ignore word)
            grade = "nn"
        return grade

    result = float(raw_input("enter final result: "))
    while result < 0 or result > 100:
        print "invalid result. result must be between 0 and 100."
        result = float(raw_input("re-enter final result: "))
    print "the corresponding grade is", grade(result)
Use the tokenize, keyword, and collections modules.
tokenize.generate_tokens(readline)
The generate_tokens() generator requires one argument, readline, which must be a callable object that provides the same interface as the readline() method of built-in file objects (see section File Objects). Each call to the function should return one line of input as a string. Alternately, readline may be a callable object that signals completion by raising StopIteration.
The generator produces 5-tuples with these members: the token type; the token string; a 2-tuple (srow, scol) of ints specifying the row and column where the token begins in the source; a 2-tuple (erow, ecol) of ints specifying the row and column where the token ends in the source; and the line on which the token was found. The line passed (the last tuple item) is the logical line; continuation lines are included.
New in version 2.2.
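To see why the tokenizer helps here, the following is a minimal sketch, assuming Python 3 (where generate_tokens() yields named tuples) and a made-up SAMPLE string read through io.StringIO instead of a file. The names SAMPLE and token_kinds are illustrative only. It shows that a comment and a string literal each arrive as a single token of their own type, so an "if" inside them never appears as a separate NAME token.

```python
import io
import tokenize

# Hypothetical sample source; any Python text would do.
SAMPLE = 'x = 1  # if in a comment\ns = "if in a string"\nif x:\n    pass\n'

def token_kinds(source):
    """Return (type name, token string) pairs for every token in source."""
    # generate_tokens() wants a readline callable that returns str lines.
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]

pairs = token_kinds(SAMPLE)
for kind, text in pairs:
    print(kind, repr(text))
```

Only the real "if" statement shows up as a NAME token; the other two occurrences stay buried inside one COMMENT token and one STRING token.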
    import tokenize

    with open('source.py') as f:
        print list(tokenize.generate_tokens(f.readline))
Partial output:
    [(1, 'def', (1, 0), (1, 3), 'def grade(result):\n'),
     (1, 'grade', (1, 4), (1, 9), 'def grade(result):\n'),
     (51, '(', (1, 9), (1, 10), 'def grade(result):\n'),
     (1, 'result', (1, 10), (1, 16), 'def grade(result):\n'),
     (51, ')', (1, 16), (1, 17), 'def grade(result):\n'),
     (51, ':', (1, 17), (1, 18), 'def grade(result):\n'),
     (4, '\n', (1, 18), (1, 19), 'def grade(result):\n'),
     (5, '    ', (2, 0), (2, 4), "    '''\n"),
     (3, '\'\'\'\n    if if (<--- example test if word "if" ignored in counts)\n    :param result: none\n    :return:none\n    \'\'\'', (2, 4), (6, 7), '    \'\'\'\n    if if (<--- example test if word "if" ignored in counts)\n    :param result: none\n    :return:none\n    \'\'\'\n'),
     (4, '\n', (6, 7), (6, 8), "    '''\n"),
     (54, '\n', (7, 0), (7, 1), '\n'),
     (1, 'if', (8, 4), (8, 6), '    if result >= 80:\n'),
You can retrieve the list of keywords from the keyword module:
    import keyword

    print keyword.kwlist
    print keyword.iskeyword('def')
An integrated solution using collections.Counter:
    import tokenize
    import keyword
    import collections

    with open('source.py') as f:
        # tokens is a lazy generator of token strings
        tokens = (token for _, token, _, _, _ in tokenize.generate_tokens(f.readline))
        c = collections.Counter(token for token in tokens if keyword.iskeyword(token))

    print c
    # Counter({'elif': 3, 'print': 2, 'return': 1, 'else': 1, 'while': 1, 'or': 1, 'def': 1, 'if': 1})
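For readers on Python 3, here is a rough equivalent as a self-contained sketch. It is not the code from the answer: the function name count_keywords and the SAMPLE string are made up for illustration, and the source is read from a string through io.StringIO so the example runs without a source.py file. It also adds an explicit check that the token type is NAME. The keyword.iskeyword() test alone is already enough, because a string token keeps its quotes and a comment token keeps its '#', but the type check makes the intent visible.

```python
import collections
import io
import keyword
import tokenize

def count_keywords(source):
    """Count Python keywords in source, skipping strings and comments."""
    readline = io.StringIO(source).readline
    return collections.Counter(
        tok.string
        for tok in tokenize.generate_tokens(readline)
        # Keywords arrive as NAME tokens; docstrings are STRING tokens
        # and comments are COMMENT tokens, so both are filtered out here.
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string)
    )

# Hypothetical sample input, loosely modelled on the question's code.
SAMPLE = '''def grade(result):
    """if if (these should not be counted)"""
    if result >= 80:
        return "hd"
    else:  # else (ignored)
        return "nn"
'''

counts = count_keywords(SAMPLE)
print(counts)
```

Note that in Python 3 print is a function, not a keyword, so it would no longer appear in the counts the way it does in the Python 2 output above.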