Max Haas wrote:
> Problem: 738 files with Latin text. Every file represents a singular source.
> Find every match of a term. (I need e.g. structura, structuram, structurae
> etc.)

Sounds like we're helping with your homework. I'll resist the temptation to
post too much code for this... :-)

> The program:
>
> 1. Enter the question and x and y. The question will then be the compiled
> object p.
> 2. Read every file in (something like fp.readlines()).
> 3. Transform the file to a string (string.join(list_of_file)).
> 4. m = p.findall(string). If m is not None then:
> a. Give the file contents (lines 10-15)
> b. look for the occurrence of every word in m and note the position (with
> string.find)
> c. Give x words before the matched word, the matched word and then y words
> after
> ...
>
> The main problem for me is: do I understand correctly the function of
> p.findall(string) in combination with string.find?

Better would be a loop using 'p.search()', which would find one occurrence at
a time. The match object returned by this function has methods 'start()' and
'end()' which would let you locate the matched word in the string containing
the file contents.

-Steve

-- 
Steve Purcell, Pythangelist
Get testing at http://pyunit.sourceforge.net/
Any opinions expressed herein are my own and not necessarily those of Yahoo
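
For illustration only, here is a minimal sketch of the search loop described in the reply, written for current Python rather than the version in use when this was posted. The function name keyword_in_context, its parameters, and the choice of matching a stem followed by \w* are assumptions made for this example and do not come from the original message. It calls the compiled pattern's search() repeatedly, restarting each time from the end of the previous match, and uses the match object's start() and end() to cut out the words on either side.

```python
import re


def keyword_in_context(text, stem, x, y):
    """Return (words_before, matched_word, words_after) for each hit.

    Hypothetical helper: `stem` is the start of the term (e.g. "structura"),
    `x` and `y` are the number of context words before and after the match.
    """
    if not stem:
        raise ValueError("stem must not be empty")
    # \w* lets the stem match inflected forms (structuram, structurae, ...).
    pattern = re.compile(r"\b" + re.escape(stem) + r"\w*", re.IGNORECASE)
    results = []
    pos = 0
    while True:
        # search() finds one occurrence at a time, starting at `pos`.
        m = pattern.search(text, pos)
        if m is None:
            break
        # start() and end() give the position of the match in `text`.
        before = text[:m.start()].split()[-x:] if x > 0 else []
        after = text[m.end():].split()[:y]
        results.append((before, m.group(), after))
        pos = m.end()  # continue after this match
    return results


sample = "De structura musicae. Item structuram cantus et structurae modi."
for before, word, after in keyword_in_context(sample, "structura", 1, 2):
    print(" ".join(before), "[" + word + "]", " ".join(after))
```

The context words are split on whitespace, so punctuation stays attached to them. Running this over each of the files would only need an outer loop that reads every file into a string first.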