python - How to count the occurrences of keywords in code but ignore the ones in comment / docstring?


I am quite new to Python. I want to find the occurrences of Python keywords ['def', 'in', 'if', ...] in the code below. However, keywords found inside string constants in the code need to be ignored. How can I count the keyword occurrences without counting the ones in strings?

def grade(result):
    '''
    if if (<--- example test if word "if" ignored in counts)
    :param result: none
    :return:none
    '''

    if result >= 80:
        grade = "hd"
    elif 70 <= result:
        grade = "di"
    elif 60 <= result:
        grade = "cr"
    elif 50 <= result:
        grade = "pa"
    else:
    #else (ignore this word)
        grade = "nn"
    return grade

result = float(raw_input("enter final result: "))

while result < 0 or result > 100:
    print "invalid result. result must be between 0 and 100."
    result = float(raw_input("re-enter final result: "))

print "the corresponding grade is", grade(result)

Use the tokenize, keyword, and collections modules.

tokenize.generate_tokens(readline)

The generate_tokens() generator requires one argument, readline, which must be a callable object which provides the same interface as the readline() method of built-in file objects (see section File Objects). Each call to the function should return one line of input as a string. Alternately, readline may be a callable object that signals completion by raising StopIteration.

The generator produces 5-tuples with these members: the token type; the token string; a 2-tuple (srow, scol) of ints specifying the row and column where the token begins in the source; a 2-tuple (erow, ecol) of ints specifying the row and column where the token ends in the source; and the line on which the token was found. The line passed (the last tuple item) is the logical line; continuation lines are included.

New in version 2.2.
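To make the readline contract and the 5-tuple layout concrete, here is a minimal sketch that tokenizes an in-memory string instead of a file. It is written for Python 3, whereas the rest of this answer uses Python 2 syntax. The sample source and the variable names are made up for the illustration.

```python
import io
import tokenize

# Hypothetical one-line sample; any Python source text would do.
source = "x = 1  # if\n"

# StringIO.readline returns one line per call and '' once the text is
# exhausted, which matches the interface generate_tokens() expects.
readline = io.StringIO(source).readline

seen = []
for tok_type, tok_string, start, end, line in tokenize.generate_tokens(readline):
    # start and end are (row, column) pairs; line is the source line.
    seen.append((tok_string, start, end))

for entry in seen:
    print(entry)
```

The "if" in the sample only ever appears as part of the comment token "# if", never as a token of its own.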

import tokenize

with open('source.py') as f:
    print list(tokenize.generate_tokens(f.readline))

Partial output:

[(1, 'def', (1, 0), (1, 3), 'def grade(result):\n'),
 (1, 'grade', (1, 4), (1, 9), 'def grade(result):\n'),
 (51, '(', (1, 9), (1, 10), 'def grade(result):\n'),
 (1, 'result', (1, 10), (1, 16), 'def grade(result):\n'),
 (51, ')', (1, 16), (1, 17), 'def grade(result):\n'),
 (51, ':', (1, 17), (1, 18), 'def grade(result):\n'),
 (4, '\n', (1, 18), (1, 19), 'def grade(result):\n'),
 (5, '    ', (2, 0), (2, 4), "    '''\n"),
 (3,
  '\'\'\'\n    if if (<--- example test if word "if" ignored in counts)\n    :param result: none\n    :return:none\n    \'\'\'',
  (2, 4),
  (6, 7),
  '    \'\'\'\n    if if (<--- example test if word "if" ignored in counts)\n    :param result: none\n    :return:none\n    \'\'\'\n'),
 (4, '\n', (6, 7), (6, 8), "    '''\n"),
 (54, '\n', (7, 0), (7, 1), '\n'),
 (1, 'if', (8, 4), (8, 6), '    if result >= 80:\n'),
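The numeric token types in that output are hard to read, and their values can differ between Python versions. As a rough sketch (Python 3, with a made-up sample string), tokenize.tok_name can translate them into names. Grouping token strings by type name shows that a keyword in real code arrives as a NAME token, while the same word inside a string or a comment stays inside a single STRING or COMMENT token.

```python
import io
import tokenize

# Hypothetical sample: one real "if", one in a string, one in a comment.
sample = (
    "s = 'if'\n"
    "if s:  # if\n"
    "    pass\n"
)

by_type = {}
for tok in tokenize.generate_tokens(io.StringIO(sample).readline):
    # tok[0] is the numeric token type, tok[1] is the token string.
    name = tokenize.tok_name[tok[0]]
    by_type.setdefault(name, []).append(tok[1])

print(by_type["NAME"])     # identifiers and keywords
print(by_type["STRING"])   # string literals (docstrings land here too)
print(by_type["COMMENT"])  # comments
```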

You can retrieve the list of keywords from the keyword module:

import keyword

print keyword.kwlist
print keyword.iskeyword('def')

An integrated solution using collections.Counter:

import tokenize
import keyword
import collections

with open('source.py') as f:
    # tokens is a lazy generator of token strings
    tokens = (token for _, token, _, _, _ in tokenize.generate_tokens(f.readline))
    c = collections.Counter(token for token in tokens if keyword.iskeyword(token))

print c
# Counter({'elif': 3, 'print': 2, 'return': 1, 'else': 1, 'while': 1, 'or': 1, 'def': 1, 'if': 1})
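For readers on Python 3, the same idea might look like the sketch below. The helper name count_keywords and the sample source are hypothetical, not part of the original answer. The explicit check for NAME tokens is an extra safeguard added here. Note that print is no longer a keyword in Python 3, so it would not show up in the counts there.

```python
import collections
import io
import keyword
import tokenize


def count_keywords(source):
    """Count keyword tokens in source text, skipping strings and comments."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return collections.Counter(
        tok.string
        for tok in tokens
        # Keywords are reported as NAME tokens; strings and comments are not.
        if tok.type == tokenize.NAME and keyword.iskeyword(tok.string)
    )


# Hypothetical sample with keywords hidden in a docstring and a comment.
sample = '''def sign(n):
    """if if: these words sit inside a docstring"""
    if n < 0:
        return -1
    elif n == 0:
        return 0
    else:  # else (inside a comment)
        return 1
'''

counts = count_keywords(sample)
print(counts)
```

Passing the source as a string keeps the helper easy to try out; for a file, reading its contents with open(path).read() and handing them to the helper should work the same way.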
